Hardening Firefox with Anthropic's Red Team
anthropic.com | 520 points by todsacerdoti 16 hours ago
The bugs are the ones that say "using Claude from Anthropic" here: https://www.mozilla.org/en-US/security/advisories/mfsa2026-1...
https://blog.mozilla.org/en/firefox/hardening-firefox-anthro...
https://www.wsj.com/tech/ai/send-us-more-anthropics-claude-s...
tabbott - 8 hours ago
I recommend that anyone responsible for maintaining the security of an open-source software project ask Claude Code to do a security audit of it. I imagine that might not work that well for Firefox without a lot of care, because it's a huge project. But for most other projects, it probably only costs $3 worth of tokens. So you should assume the bad guys have already done it to your project looking for things they can exploit, and it no longer feels responsible not to have done such an audit yourself.

Something that I found useful when doing such audits for Zulip's key codebases is to ask the model to carefully self-review each finding; that removed the majority of the false positives. Most of the rest we addressed by adding comments that would help developers (or a model) casually reading the code understand what the intended security model is for that code path... And indeed most of those did not show up on a second audit done afterwards.

SV_BubbleTime - 3 hours ago
This is exactly how I would not recommend AI be used. "Do a thing that would take me a week" cannot actually be done in seconds. It will provide results that resemble reality superficially. If you were to pass some module in and ask for finite checks on that, maybe. Despite the claims of agents… treat it more like an intern and you won't be disappointed. Would you ask an intern to "do a security audit" of an entire massive program?

creatonez - 2 hours ago
IMO the key behavior is that LLMs are really good at fuzz testing, because they are probabilistic monkeys on typewriters that are much more code-aware than a conventional fuzz tester. They cannot produce a comprehensive security audit or fix security issues in a reliable way without human oversight, but they sure can come up with dumb inputs that break the code. The results of such AI fuzz testing should be treated as just a science experiment and not a replacement for the entire job of a security researcher.

Like conventional fuzz testing, you get the best results if you have a harness to guide it towards interesting behaviors, a good scientific filtering process to confirm something is really going wrong, a way to reduce it to a minimal test case suitable for inclusion in a test suite, and plenty of human followup to narrow in on what's going on and figure out what correctness even means in the particular domain the software is made for.

padolsey - 2 hours ago
My approach is that "you may as well" hammer Claude and get it to brute-force-investigate your codebase; worst case, you learn nothing and get a bunch of false-positive nonsense. Best case, you get new visibility into issues. Of _course_ you should be doing your own in-depth audits, but the plain fact is that people do not have time, or do not care sufficiently. But you can set up a battery of agents to do this work for you. So... why not?

eli - an hour ago
It depends whether anyone was ever actually going to spend that week doing it the "hard" way. Having Claude do it in a few minutes beats doing nothing. Put another way: I absolutely would have an intern work on a security audit. I would not have an intern replace a professional audit, though. It's otherwise a pretty low-stakes use. I'd expect false positives to be pretty obvious to someone maintaining the code.

SV_BubbleTime - an hour ago
My point is that it's one thing to say "I want my intern to start doing a security audit." It's another thing to say "hey intern, security audit this entire code base." LLMs thrive on context. You need the right context at the right time; it doesn't matter how good your model is if you don't have that.

Analemma_ - 8 hours ago
I'm curious: has someone done a lengthy write-up of best practices to get good results out of AI security audits? It seems like it can go very well (as it did here) or be totally useless (all the AI slop submitted to HackerOne), and I assume the difference comes down to the quality of your context engineering and testing harnesses.

This post did a little bit of that, but I wish it had gone into more detail.

j-conn - 4 hours ago
OpenAI just released "codex security", worth trying (along with other suggestions) if your org has access: https://openai.com/index/codex-security-now-in-research-prev...

simonw - 7 hours ago
The HackerOne slop is because there's a financial incentive (bug bounties) involved, which means people who don't know what they are doing blindly submit anything that an LLM spots for them. If you're running the security audit yourself you should be in a better position to understand and then confirm the issues that the coding agents highlight. Don't treat something as a security issue until you can confirm that it is indeed a vulnerability. Coding agents can help you put that together but shouldn't be treated as infallible oracles.

hansvm - 3 hours ago
That sounds like the same problem (a deluge of slop) with a different interface (eating straight from the trough rather than waiting for someone to put a bow on it and stamp their name to it)?

simonw - 3 hours ago
I've found it's pretty good. It's really not that much of a burden to dig through 10 reports and find the 2 that are legitimate. It's different from HackerOne because those reports tend to come in with all sorts of flowery language added (or prompt-added) by people who don't know what they are doing. If you're running the prompts yourself against your own coding agents you gain much more control over the process. You can knock each report down to just a couple of sentences, which is much faster to review. You also probably have a much better idea of where the unsafe boundaries in your application are. Letting the models know this information up front has given me a dozen or so legitimate vulnerabilities in the application I work on. And the signal-to-noise ratio is generally pretty good. Certainly orders of magnitude better than the terrible dependabot alerts I have to dismiss every day.

Mapsmithy - 2 hours ago
The question still is: will enough useful stuff be included to make it worth digging through the slop?
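The "self-review each finding" technique described upthread reduces to a simple two-pass loop: collect findings in a machine-readable form, then ask the model to adversarially re-examine each one and keep only what it confirms. A minimal sketch in Python; `run_agent`, the prompts, and the JSON schema are all hypothetical stand-ins for whatever coding-agent CLI or API you actually use:

```python
import json

def run_agent(prompt: str) -> str:
    """Hypothetical stand-in for a coding-agent call (e.g. a CLI
    invoked via subprocess). Replace with your real integration."""
    raise NotImplementedError

def audit(codebase_hint: str, agent=run_agent) -> list[dict]:
    # Pass 1: broad audit, asking for structured findings.
    raw = agent(
        f"Security-audit the {codebase_hint} module. "
        "Return a JSON list of findings: "
        '[{"file": ..., "line": ..., "claim": ...}]'
    )
    findings = json.loads(raw)

    # Pass 2: adversarial self-review of each finding, one at a time.
    # In practice this is where most false positives get dropped.
    confirmed = []
    for finding in findings:
        verdict = agent(
            "Carefully re-review this finding against the actual code. "
            "Answer CONFIRMED or FALSE POSITIVE, then explain:\n"
            + json.dumps(finding)
        )
        if verdict.strip().upper().startswith("CONFIRMED"):
            confirmed.append(finding)
    return confirmed
```

Only the pass-2 survivors go to a human; the retracted ones are still candidates for the "add a comment documenting the intended security model" treatment so they don't resurface on the next run.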
johannes1234321 - 6 hours ago
And how to tune the prompt to get better results.

simonw - 6 hours ago
Best way to figure that out is to try it and see what happens.

Groxx - 5 hours ago
[claimed common problem exists, try X to find it] -> [Q about how to best do that] -> "the best way to do it is to do it yourself"

Surely people have found patterns that work reasonably well, and it's not "everyone is completely on their own"? I get that the scene is changing fast, but that's ridiculous.

simonw - 4 hours ago
There's so much superstition and outdated information out there that "try it yourself" really is good advice. You can do that in conjunction with trying things other people report, but you'll learn more quickly from your own experiments. It's not like prompting a coding agent is expensive or time consuming, for the most part.

nl - 5 hours ago
/security-review really is pretty good.

bluGill - 6 hours ago
But your codebase is unique. Slop in one codebase is very dangerous in another.

unethical_ban - 4 hours ago
That depends on how the tool is used. People who ask for a security vulnerability get slop. People who ask for deeper analysis often get something useful - but it isn't always a vulnerability. I assume it's just like asking for help refactoring, just targeting specific kinds of errors. I ran a small Python script that I made some years ago through an LLM recently and it pointed out several areas where the code would likely throw an error if certain inputs were received. Not security, but flaws nonetheless.

ronsor - 5 hours ago
You're either digging through slop or digging through your whole codebase anyway.

lmeyerov - 7 hours ago
We split our work:

* Specification extraction. We have security.md and policy.md, often per module. Threat model, mechanisms, etc. This is collaborative and gets checked in for ourselves and the AI. Policy is often tricky & malleable product/business/UX decision stuff, while security is technical layers more independent of that or the broader threat model.

* Bug mining. It is driven by the above. It is iterative: we keep running it to surface findings, adversarially analyze them, and prioritize them. We keep repeating until diminishing returns wrt priority levels. Likely leads to policy & security spec refinements.

We use this pattern not just for security, but for general bugs and other iterative quality & performance improvement flows - it's just a simple skill file with tweaks like parallel subagents to make it fast and reliable. This lets the AI drive itself more easily and in ways you explicitly care about vs noise.

mmsc - 15 hours ago
It's cool that Mozilla updated https://www.mozilla.org/en-US/security/advisories/mfsa2026-1... because we were all wondering who had found 22 vulnerabilities in a single release (their findings were originally not attributed to anybody.)

himata4113 - 5 hours ago
Use After Free, Use After Free, Use After Free, Use After Free, Use After Free, Use After Free, Use After Free. I would be more satisfied if they gave a proper explanation of what these could have led to rather than "well, maybe 0.001% chance to exploit this". They did vaguely go over how "two" exploits managed to drop a file, but how impactful is that? Dropping a file in abcd with custom contents in some folder relative to the user profile is not that impactful other than corrupting data, poisoning cache, or injecting some JavaScript. Now, reading session data from other sites: that I would find interesting.

mccr8 - an hour ago
You should generally assume that in a web browser, any memory corruption bug can, when combined with enough other bugs and a lot of clever engineering, be turned into arbitrary code execution on your computer.

himata4113 - an hour ago
The most important bit being the difficulty: AI finding 21 easily exploitable bugs is a lot more interesting than 21 that need all the planets to align to work.

hedora - 4 hours ago
If you can poison cache, you can probably use that as a stepping stone to read session data from other sites.

gzoo - 3 hours ago
This resonates.
I just open-sourced a project, and someone on Reddit ran a full security audit using Claude. It found 15 issues across the codebase, including FTS injection, LIKE wildcard injection, missing API auth, and privacy enforcement gaps I'd missed entirely.

What surprised me was how methodical it was. Not just "this looks unsafe": it categorized by severity, cited exact file paths and line numbers, and identified gaps between what the docs promised and what the code actually implemented. The "spec vs reality" analysis was the most useful part.

Makes me think the biggest impact of LLM security auditing isn't finding novel zero-days; it's the mundane stuff that humans skip because it's tedious. Checking every error handler for information leakage, verifying that every documented security feature is actually implemented, scanning for injection points across hundreds of routes. That's exactly the kind of work that benefits from tireless pattern matching.

fcpk - 16 hours ago
The fact that there is no mention of what the bugs were is a little odd. It'd really be nice to see whether these are "weird, never-happening edge cases" or actual issues. LLMs have uncanny abilities to identify failure patterns they have seen before, but those are not necessarily meaningful.

iosifache - 16 hours ago
You can find them linked [1] in the OG article from Anthropic [2].

[1] https://www.mozilla.org/en-US/security/advisories/mfsa2026-1...

larodi - 15 hours ago
The fact that some of the Claude-discovered bugs were quite severe is also a little more than something to brush off as "yeah, LLM, whatever". The list reads as quite meaningful to me, but I'm not a security expert anyway.

jandem - 16 hours ago
Here's a write-up for one of the bugs they found: https://red.anthropic.com/2026/exploit/

deafpolygon - 16 hours ago
I'm guessing it might be some of these: https://www.mozilla.org/en-US/security/advisories/mfsa2026-1...

muizelaar - 16 hours ago
Yeah, the ones reported by Evyatar Ben Asher et al.

robin_reala - 15 hours ago
I correctly misread that as "et AI".

deafpolygon - 15 hours ago
"Et AI, Brutus!"

tclancy - 14 hours ago
Yon Claude has a lean and hungry look.

deafpolygon - 14 hours ago
An LLM by any other name would hallucinate the same.

tclancy - 12 hours ago
Anyone still reading down here will appreciate this: https://bsky.app/profile/simeonthefool.bsky.social/post/3kbk...
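For anyone unfamiliar with the LIKE wildcard injection mentioned upthread: user input dropped into a LIKE pattern lets `%` and `_` act as wildcards, so a "search" can match every row or probe data one character at a time. A minimal sketch of the fix in Python with sqlite3 (the table and search function are illustrative, not from any project in this thread):

```python
import sqlite3

def escape_like(term: str, esc: str = "\\") -> str:
    """Escape LIKE metacharacters so user input matches literally."""
    return (term.replace(esc, esc + esc)
                .replace("%", esc + "%")
                .replace("_", esc + "_"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.executemany("INSERT INTO notes VALUES (?)",
                 [("100% done",), ("unrelated",)])

def search(term: str) -> list[str]:
    # Parameterized query prevents classic SQL injection; the ESCAPE
    # clause stops the user's % and _ from acting as wildcards.
    pattern = f"%{escape_like(term)}%"
    rows = conn.execute(
        "SELECT body FROM notes WHERE body LIKE ? ESCAPE '\\'",
        (pattern,))
    return [r[0] for r in rows]

# A bare "%" now matches only literal percent signs, not every row.
print(search("%"))   # ['100% done']
```

Without `escape_like`, `search("%")` returns the whole table, which is exactly the class of bug an audit pass over "hundreds of routes" tends to surface.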