Mozilla says 271 vulnerabilities found by Mythos have "almost no false positives"

2026-05-07 · Source: AI - Ars Technica · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Software Development & Engineering · Depth: Advanced, short

Summary

Mozilla recently detailed how it used Anthropic Mythos, an AI model, to identify 271 security vulnerabilities in Firefox over two months, responding to earlier skepticism about AI-assisted vulnerability detection. The result, which prompted Mozilla's CTO to declare that "zero-days are numbered," was attributed to improved AI models and to a custom "agent harness" that Mozilla developed. The harness guides Mythos through specific tasks and gives it tools such as file access and test case evaluation, mirroring a human developer's workflow. A second LLM verifies the initial findings, which is how the system achieved "almost no false positives" and why the reports carry high confidence. Of the discovered bugs, 180 were classified as sec-high, 80 as sec-moderate, and 11 as sec-low, so roughly two thirds of the findings were rated high severity.

Key takeaway

For security engineering leaders evaluating AI for vulnerability detection, Mozilla's experience with Anthropic Mythos and a custom agent harness shows a viable path to cutting false positives while scaling discovery. Teams should investigate building similar harnesses that integrate AI models directly into existing development and testing toolchains, with two priorities: deterministic task verification and multi-stage AI validation, so that reported flaws can be trusted.

Key insights

An agent harness that connects an LLM to existing developer toolchains, paired with a second LLM that verifies each finding, brought false positives in vulnerability detection down to "almost none."

Principles

Deterministic task verification is crucial for AI agent success.
Dual-LLM verification enhances confidence in AI-generated reports.

Method

An agent harness wraps an LLM, providing instructions and tools (e.g., file I/O, test case evaluation) to guide it through vulnerability discovery, with a second LLM verifying the findings.
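The method above can be sketched in a few lines of Python. This is a minimal illustration, not Mozilla's harness: the `run_harness` function, the `CALL`/`REPORT` message format, and the `read_file` tool are all hypothetical, and each "model" is just a callable from prompt text to reply text standing in for a real LLM client.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical stand-ins: a model maps a prompt to a reply, a tool maps an argument to a result.
Model = Callable[[str], str]
Tool = Callable[[str], str]


@dataclass
class Finding:
    description: str
    verified: bool = False


def run_harness(finder: Model, verifier: Model, tools: Dict[str, Tool],
                task: str, max_steps: int = 5) -> Optional[Finding]:
    """Drive the finder model through a task, then ask a second model to verify its report."""
    transcript = [f"TASK: {task}", "TOOLS: " + ", ".join(sorted(tools))]
    for _ in range(max_steps):
        reply = finder("\n".join(transcript))
        transcript.append(reply)
        if reply.startswith("CALL "):
            # Assumed message format: "CALL <tool> <argument>"
            parts = reply.split(" ", 2)
            name = parts[1] if len(parts) > 1 else ""
            arg = parts[2] if len(parts) > 2 else ""
            tool = tools.get(name)
            result = tool(arg) if tool else f"unknown tool: {name}"
            transcript.append(f"RESULT: {result}")
        elif reply.startswith("REPORT "):
            finding = Finding(description=reply[len("REPORT "):])
            # Second-stage check: only a CONFIRM verdict marks the finding as verified.
            verdict = verifier(
                "Is this report a real vulnerability? Answer CONFIRM or REJECT.\n"
                + finding.description
            )
            finding.verified = verdict.strip().upper().startswith("CONFIRM")
            return finding
    return None  # The finder never produced a report within the step budget.


def scripted_finder() -> Model:
    """A canned finder used only to exercise the loop without a real LLM."""
    replies = iter([
        "CALL read_file parser.c",
        "REPORT possible out-of-bounds read in parse_header",
    ])
    return lambda prompt: next(replies)
```

Keeping the verifier as a separate callable means a report is only surfaced after an independent second pass, which is the dual-LLM idea described above; in a real system both callables would wrap actual model APIs.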

In practice

Integrate LLMs with existing testing pipelines.
Define clear success signals for automated verification.
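As one illustration of a clear success signal, the sketch below runs a candidate test case in a subprocess and counts it as reproduced only if the process exits abnormally and prints an expected crash marker. The `check_test_case` function, the `Verdict` type, and the marker string are assumptions made for this example; the source does not describe how Mozilla's harness evaluates test cases.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class Verdict:
    reproduced: bool
    reason: str


def check_test_case(command: List[str], crash_marker: str,
                    timeout_s: float = 10.0) -> Verdict:
    """Run a test case and decide deterministically whether it reproduced a crash."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # A hang is not treated as a confirmed crash in this sketch.
        return Verdict(False, "timeout")
    # Require both signals: a nonzero exit status and the marker on stderr.
    if proc.returncode != 0 and crash_marker in proc.stderr:
        return Verdict(True, f"exit {proc.returncode} with crash marker")
    return Verdict(False, f"exit {proc.returncode}, no crash marker")


# Hypothetical targets: small Python one-liners stand in for a real instrumented build.
CRASHING = [sys.executable, "-c",
            "import sys; sys.stderr.write('FAKE-SANITIZER: overflow'); sys.exit(1)"]
HEALTHY = [sys.executable, "-c", "print('fine')"]
```

Because the verdict depends only on the exit status and captured output, the same test case yields the same answer on every run, which gives an agent an unambiguous pass or fail to work toward.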

Topics

AI-assisted Vulnerability Detection
Anthropic Mythos
Agent Harness
Firefox Security Flaws
False Positive Reduction

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Security Engineer, Software Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AI - Ars Technica.