
How Mozilla Crushed False Positives While Finding 271 Firefox Bugs with Mythos AI

The "boy who cried wolf" problem has haunted vulnerability scanners for decades, until now. Here's the inside story of how Mozilla and Anthropic finally solved it.


When Mozilla's CTO stood up last month and declared that AI-assisted vulnerability detection meant "defenders finally have a chance to win, decisively," a lot of us rolled our eyes.

I mean, we've heard this story before, haven't we? Some shiny new AI tool promises to revolutionise security. It finds a few impressive-sounding bugs. The press runs with it. Then someone actually digs into the reports and finds that half the "vulnerabilities" were hallucinated, and the other half weren't exploitable outside a lab.

But this time, something is different. And the difference isn't just the AI model, it's how Mozilla wrapped it.

On Thursday, Mozilla's engineering team published a behind-the-scenes post detailing their two-month journey using Anthropic's Claude Mythos Preview to scan the Firefox codebase. The result? 271 real vulnerabilities fixed in Firefox 150, with, as distinguished engineer Brian Grinstead put it, "almost no false positives."

Let's unpack what actually happened, why it matters, and whether this is finally the real deal.


The Problem That Haunted AI Bug Hunters for Years

When "Unwanted Slop" Overwhelmed Security Teams

Mozilla's engineers have a wonderfully candid name for what previous AI bug-finding tools produced: unwanted slop.

Here is how it typically played out. Someone would feed a block of code into a large language model and say, "Find the vulnerabilities." The model would dutifully produce pages of plausible-sounding bug reports, often at a scale that seemed miraculous. Memory corruption here. Use-after-free there. Race conditions everywhere.

Then a human engineer would actually investigate.

And they would discover that a large percentage of the details had been hallucinated. The "exploit" was not reproducible. The vulnerable code path was unreachable. The "bug" relied on an API that did not even exist in that version of the software.

The result? Engineers spent more time debunking fake bugs than they did fixing real ones. It was, to put it mildly, a nightmare.

Traditional SAST Tools: 30-70% False Positives as the Norm

But let's not pretend this is an AI-only problem. Traditional Static Application Security Testing (SAST) tools, the kind enterprises have been buying for years, have their own dirty secret.

Industry research consistently shows that commercial SAST tools operate with false positive rates between 30% and 70%. A 2024 empirical study of C/C++ codebases reported that over 76% of alerts had nothing to do with real defects.

Think about that for a moment. For every ten alerts your SAST tool throws at you, somewhere between three and seven of them are complete fiction. And that is on a good day with a well-tuned deployment.
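To make the triage burden concrete, here is a quick back-of-envelope calculation at the 30-70% rates cited above. The alert volume and minutes-per-alert figures are illustrative assumptions, not data from any specific study.

```python
# Rough triage-cost model: how many hours go to debunking false positives?
# All inputs are illustrative assumptions for the 30-70% FP rates cited above.

def triage_cost(alerts: int, fp_rate: float, minutes_per_alert: int = 30) -> dict:
    """Estimate how much triage time is spent on false positives."""
    false_alerts = round(alerts * fp_rate)
    real_alerts = alerts - false_alerts
    wasted_hours = false_alerts * minutes_per_alert / 60
    return {"real": real_alerts, "false": false_alerts, "wasted_hours": wasted_hours}

# A hypothetical 1,000-alert SAST run at the low and high ends of the range:
low = triage_cost(1000, 0.30)   # 300 false alerts, 150 hours of wasted triage
high = triage_cost(1000, 0.70)  # 700 false alerts, 350 hours of wasted triage
```

Even at the optimistic end, a team spends weeks of engineer time investigating findings that were never real.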

The Triage Tax: Why False Positives Are Actually Dangerous

False positives are not merely annoying. They are actively dangerous, and here's why.

When security teams are flooded with bogus alerts, something insidious happens: alert fatigue. Engineers stop trusting the tool. They start treating every alert with suspicion. And eventually, inevitably, a real vulnerability slips through because someone assumed it was just more noise.

It's the cybersecurity equivalent of a smoke alarm that goes off every time you make toast. Eventually, you take the batteries out.

This is the problem Mozilla claims to have solved. So how did they do it?


Enter Mythos: 271 Bugs, Almost No False Positives

The Numbers That Made the Industry Stop and Stare

Let's get the headline figures on the table, because they genuinely warrant attention.

Over approximately two months, Mozilla ran Claude Mythos Preview against the Firefox source code. The model identified 271 security vulnerabilities that were subsequently fixed and shipped in Firefox 150, with additional fixes landing in versions 149.0.2 and 150.0.1.

The severity breakdown tells its own story:

  • 180 vulnerabilities were rated sec-high — Mozilla's highest internal designation, meaning they could be exploited through normal user behaviour such as browsing to a web page.
  • 80 vulnerabilities were sec-moderate.
  • 11 vulnerabilities were sec-low.

When was the last time a security tool handed you 180 high-severity findings and the team did not lose a month triaging false alarms? That is the part that has security professionals leaning forward in their chairs.

From 22 Bugs (Opus 4.6) to 271 (Mythos): What Changed?

Context makes the number even more striking. Just a few months earlier, Mozilla ran Anthropic's Claude Opus 4.6 against Firefox 148. The result? Twenty-two security-sensitive bugs.

That is a 12x increase in findings between two generations of the same company's AI.

But here is the crucial point that most coverage misses: the model improvement was only half the story. The other half, arguably the more important half, was Mozilla's custom agent harness.

Sandbox Escapes and 15-Year-Old Sleeping Bugs

Among the 271 findings were some genuinely remarkable catches.

Mythos uncovered sandbox vulnerabilities — the holy grail of browser security research. Mozilla's bug bounty programme pays up to $20,000 for a single sandbox escape finding, the highest reward available. Despite that incentive, Grinstead says Mythos found more sandbox issues than human researchers ever did.

The model also surfaced bugs that had been dormant in the code for over a decade, including a 15-year-old error in how Firefox parses a particular HTML element. These are the kinds of bugs that survive countless code reviews, automated scans, and fuzzing campaigns, and then an AI spots them in a single pass.


The Secret Sauce: Inside Mozilla's Agent Harness Architecture

What Exactly Is an Agent Harness? (Simple Metaphor)

Alright, here is where we go deeper than any other coverage of this story.

Think of an agent harness like a racing engineer sitting next to a Formula 1 driver.

The driver (the AI model) has immense raw talent. But without someone managing the pit stops, tyre strategy, fuel levels, and track conditions, without someone giving real-time feedback, that talent is wasted. The driver might set one blistering lap and then spin into the gravel on the next corner.

The harness is the racing engineer. It wraps around the AI model, feeds it instructions, gives it tools, checks its work, and keeps it on track. It does not make the model smarter, it makes the model useful.

Step-by-Step: How the Pipeline Works

Here is exactly how Mozilla's harness guided Mythos through the Firefox codebase, step by step:

Step 1: The Prompt. The harness points Mythos at a specific source file and says, essentially: "We know there is an issue in this file. Please go find it." It is not asking the model to scan the entire codebase, it is directing it to a suspicious area identified through other means, such as fuzzing.

Step 2: Tool Access. The harness gives Mythos access to the exact same tools that human Mozilla developers use, including the special sanitizer build of Firefox designed specifically for memory-safety testing. Mythos can read files, write test cases, and submit them to Mozilla's existing fuzzing infrastructure.

Step 3: The Deterministic Win Condition. This is the part that made my ears perk up. Grinstead explained: "In our case when we are looking for memory safety issues, we have our sanitizer build of Firefox, and if you make it crash, you win." That is a deterministic, unambiguous success signal. No subjectivity. No "this kinda looks like a bug." The harness either gets a crash, or it does not.

Step 4: The Second LLM as Judge and Jury. Even after the harness produces a finding, it does not go straight to a human. A second LLM grades the output from the first. If it gives a high score, meaning the report is coherent, reproducible, and consistent with known vulnerability patterns, only then does it reach a human engineer.

Step 5: Human-Readable Report with Reproducible Test Case. The final output includes the exact HTML or code that triggers the unsafe condition, meeting the same criteria Mozilla requires for all security bugs. Developers get a "crank they can pull", a reliable test that says "yes, the bug exists" and "yes, you fixed it."
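The five steps above can be sketched as a single orchestration loop. This is a minimal illustration based only on Mozilla's public description; every function name, path, and threshold here is hypothetical, since the real harness, model API, and fuzzing integration are not public.

```python
# Hypothetical sketch of a Mythos-style harness loop (Steps 1-5 above).
# ask_model, judge_model, and the sanitizer binary path are all assumptions.
import subprocess
from typing import Callable, Optional

def run_harness(
    source_file: str,
    ask_model: Callable[[str], str],           # Steps 1-2: model writes a reproducer
    judge_model: Callable[[str], float],       # Step 4: second LLM scores the report
    sanitizer_binary: str = "./firefox-asan",  # hypothetical sanitizer-build path
    score_threshold: float = 0.8,
) -> Optional[dict]:
    """Return a doubly-verified finding, or None if either gate fails."""
    # Step 1: point the model at one suspicious file, not the whole codebase.
    prompt = f"We know there is an issue in {source_file}. Find it and write a reproducer."
    test_case = ask_model(prompt)

    # Step 3: deterministic win condition -- the sanitizer build crashes or it does not.
    result = subprocess.run(
        [sanitizer_binary, "--headless", test_case],
        capture_output=True, timeout=300,
    )
    if result.returncode == 0:
        return None  # no crash, no finding

    # Step 4: second LLM grades coherence and reproducibility before a human sees it.
    report = f"Crash in {source_file}; reproducer: {test_case}"
    if judge_model(report) < score_threshold:
        return None

    # Step 5: human-readable report with the reproducible test case attached.
    return {"file": source_file, "reproducer": test_case, "report": report}
```

The important design property is that a finding can only reach the return statement after passing a deterministic crash check and an independent grading pass.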

Why Deterministic Verification Matters More Than Model Smarts

Here is a thought that might sound counter-intuitive: the model's intelligence was not the breakthrough. The verifiability of its outputs was.

For years, AI security tools have been stuck in a horrible loop: the model generates a report, a human investigates, and the human finds that the report was mostly wrong. The human then does the actual security research the old-fashioned way. The AI did not save time, it added time.

Mozilla's harness breaks this loop because the verification step is automatic and deterministic. The sanitizer either crashes or it does not. There is no middle ground. By the time a finding reaches a human engineer, it has already passed two layers of machine verification: the crash test and the second LLM review.

"These things are actually just suddenly very good," Grinstead told TechCrunch. "We see that on our own internal scanning, we see that on external bug reports, and we see that in all sorts of signals across the industry."

The Harness Is Custom-Built: Can Others Replicate It?

The honest answer: it depends on your resources.

Mozilla acknowledged that building a useful harness requires significant customisation to project-specific semantics, tooling, and processes. You cannot just download a generic harness off the shelf and point it at your codebase. You need to define your deterministic success signals, integrate your existing testing infrastructure, and tune the pipeline to your specific tech stack.

That said, the principle is replicable: agentic systems that can evaluate their own outputs and filter out poor results represent a turning point for the industry.

Mozilla's engineers were refreshingly transparent about this. "We are trying to get a message out about this technique in general," Grinstead said, "and not any specific model provider, company, or anything like that."


The Comparison Table Nobody Else Gave You

The numbers in this story are scattered across several announcements, so here they are side by side:

  • Traditional SAST tools: false positive rates of 30-70% of alerts, with every alert triaged by hand.
  • Claude Opus 4.6, run against Firefox 148: 22 security-sensitive bugs.
  • Claude Mythos inside Mozilla's custom harness, Firefox 150: 271 fixed vulnerabilities (180 sec-high, 80 sec-moderate, 11 sec-low), with almost no false positives, verified by a deterministic crash signal and a second-LLM review before any human triage.

Addressing the Elephant in the Room: The Skepticism

The CVE Controversy: Why Only 3 of 271 Got a CVE Number

Sharp-eyed readers will have noticed something: Mozilla's official security advisory for Firefox 150 (MFSA 2026-30) lists 41 CVE entries, and only three of them individually credit Anthropic's team using Claude.

So where are the other 268?

Here is the explanation, and it is not as scandalous as some critics have suggested. Mozilla does not obtain CVE listings for internally discovered security bugs. This is standard practice for many large software organisations. Internally found flaws are bundled into a single patch, and the associated Bugzilla reports are typically hidden for several months after release to protect users who are slow to update.

Now that the patches have shipped, Mozilla has begun un-hiding Bugzilla reports, 12 so far, with more to come. At least one independent researcher who reviewed the initial batch described them as "pretty impressive."

"Any Elite Human Researcher Could Have Found These": So What?

Mozilla themselves acknowledged, commendably, I think, that none of the 271 bugs were beyond the capability of an elite human security researcher. Firefox CTO Bobby Holley wrote: "Encouragingly, we also have not seen any bugs that could not have been found by an elite human researcher."

Critics have seized on this as evidence that Mythos is overhyped. But that criticism misses the point entirely.

The breakthrough is not that Mythos can find bugs no human could ever find. The breakthrough is that it can find them at a fraction of the cost and time. Holley noted that using Mythos, "in many cases, it eliminated the need to concentrate months of expensive human effort just to find a single defect."

It is the difference between saying "a human could run 100 metres" and "a human could run 100 metres in under ten seconds." Yes, both are technically human achievements, but the second one is Usain Bolt. Mythos is not doing the impossible. It is doing the previously impractical at industrial scale.

Cherry-Picking Concerns and Mozilla's Response

There is one more criticism worth addressing: the concern that Mozilla is cherry-picking its best results while burying less impressive findings.

Mozilla's response is pragmatic. Grinstead said the team felt it was important to show their work precisely because the industry has been burned by "slop commits" over the past year. "There is no sort of marketing angle here," he said. "Our team has completely bought in on this approach."

The gradual un-hiding of Bugzilla reports will give the security community raw data to evaluate. That transparency, incomplete as it currently is, is more than we typically get from AI vendors making bold claims about their security capabilities.


What This Means for Security Teams (and Open-Source Maintainers)

Defenders Finally Have an Asymmetric Advantage

For most of cybersecurity history, attackers have had the advantage. They only need to find one vulnerability. Defenders need to find and fix all of them.

Tools like Mythos flip that asymmetry. When vulnerability discovery becomes cheap and systematic, defenders can find their own bugs before attackers do, and the economics swing in favour of the good guys.

Anthropic CEO Dario Amodei captured this optimism: "If we handle this right, we could be in a better position than we started, because we fixed all these bugs. There are only so many bugs to find."

Grinstead, to his credit, was more measured: "It is useful for both attackers and defenders, but having the tool available shifts the advantage a little bit to defence. Realistically, nobody knows the answer to this yet." I appreciate that honesty.

The Open-Source Vulnerability Debt Crisis

There is a darker side to this story that deserves attention. Mythos is not publicly available, it is restricted to a small group of organisations through Anthropic's Project Glasswing.

This creates a troubling dynamic: large corporations can use AI to find and fix vulnerabilities in their software, while open-source maintainers, the people whose code runs most of the internet, are left without access to the same tools.

Mozilla CTO Raffi Krikorian put it bluntly in the New York Times: "The programmer who has dedicated 20 years of their life to maintaining open-source code, code that runs inside products used by billions of people, does not yet have access to Mythos. But they should."

Practical Takeaways: What You Can Steal From Mozilla's Playbook

Even if you do not have access to Mythos, there are principles here you can apply today:

  1. Invest in deterministic testing infrastructure. The harness worked because Mozilla had a sanitizer build that produced unambiguous crash signals. Your test suite should do the same: if a test passes, you are safe; if it fails, you are not. No grey area.

  2. Build verification pipelines, not just scanning tools. A finding is useless until it is verified. Mozilla's two-layer verification (crash test + second LLM) is the real innovation. Think about how you can add automated verification to whatever scanning tool you use.

  3. Do not wait for the perfect model. The harness architecture means you can get value from today's models, then swap in better ones as they arrive. The harness is the moat; the model is just an engine you can upgrade.

  4. Transparency builds trust. Mozilla's decision to open Bugzilla reports, however partial, is a model for how to communicate AI-assisted findings without triggering the hype backlash.
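Takeaway #2 can be applied today without any special model access: wrap whatever scanner you already run so that only machine-verified findings reach humans. This is a generic sketch; the `Finding` shape and the shell-based verifier are hypothetical stand-ins for your own tools, not anything Mozilla ships.

```python
# Generic verification gate: only findings with a passing automated check
# are surfaced to engineers. Finding and crash_verifier are illustrative.
import subprocess
from dataclasses import dataclass
from typing import Callable, Iterable, List

@dataclass
class Finding:
    location: str
    reproducer: str  # a concrete, runnable test case, not just a description

def gate(findings: Iterable[Finding],
         verify: Callable[[Finding], bool]) -> List[Finding]:
    """Keep only the findings whose reproducer passes the automated check."""
    return [f for f in findings if verify(f)]

def crash_verifier(finding: Finding) -> bool:
    """A deterministic signal: run the reproducer, treat nonzero exit as a crash."""
    result = subprocess.run(["sh", "-c", finding.reproducer], capture_output=True)
    return result.returncode != 0
```

The point of the design is that `gate` is agnostic to where findings come from: a SAST tool, a fuzzer, or an LLM can all feed it, and only verified results cost human attention.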


A Turning Point, Not a Magic Wand

Here is what I keep coming back to.

In April 2025, Firefox shipped 31 bug fixes. In April 2026, after integrating the Mythos pipeline, it shipped 423.

That is not a marginal improvement. That is a step change.

Is Mythos perfect? No. Mozilla says the AI still cannot write deployable patches: every fix was written and reviewed by a human engineer. And while the false-positive problem appears to have been dramatically reduced, Grinstead's careful phrasing, "almost no false positives", leaves room for the reality that nothing is perfect.

But perfection has never been the standard. The standard is: does this make defenders meaningfully more effective than they were yesterday?

On that question, the evidence from Mozilla's experiment is hard to dismiss. The false-positive problem that has crippled vulnerability scanning for decades has not been fully solved, but it has been pushed to the point where scanning at scale is finally practical.

And that, genuinely, changes the game.

What Do You Think?

Are you experimenting with AI-assisted vulnerability detection in your own codebase? Have you run into the same "slop" problem Mozilla described, and found ways around it? I would love to hear what is working (and what is not) in the real world.

Drop a comment below or reach out on LinkedIn, the conversation about AI security is moving fast, and the best insights are coming from engineers in the trenches, not press releases.
