AI is breaking two vulnerability cultures
jefftk.com | 144 points by speckx 4 hours ago
This has been a very long time coming, and the crackup we're starting to see was predicted long before anyone knew what an LLM was.
The catalyst is the shift towards software transparency: both the radically increased adoption of open source and source-available software, and the radically improved capabilities of reversing and decompilation tools. It has been over a decade since any ordinary off-the-shelf closed-source software was meaningfully obscured from serious adversaries.
This has been playing out in slow motion ever since BinDiff: you can't patch software without disclosing vulnerabilities. We've been operating in a state of denial about this, because there was some domain expertise involved in becoming a practitioner for whom patches were transparently vulnerability disclosures. But AIs have vaporized the pretense.
It is now the case that any time something gets merged into mainline Linux, several different organizations are feeding the diffs through LLM prompts aggressively evaluating whether they fix a vulnerability and generating exploit guidance. That will be the case for most major open source projects (nginx, OpenSSL, Postgres, &c) sooner rather than later.
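(To make this concrete: a minimal sketch, in Python, of what such a pipeline might look like, assuming the OpenAI chat API and a kernel git checkout. The model name, prompt, and helper names are illustrative assumptions, not anything a specific organization is known to run.)

    # Hypothetical sketch: triage new mainline commits as likely security
    # fixes. Assumes the `openai` Python client and OPENAI_API_KEY set;
    # the model name and prompt are placeholders.
    import subprocess
    from openai import OpenAI

    client = OpenAI()

    PROMPT = (
        "You are reviewing a Linux kernel commit. Does this diff fix a "
        "memory-safety or other security-relevant bug? Answer YES or NO, "
        "then explain what an attacker could do to an unpatched system."
    )

    def commits_since(rev: str) -> list[str]:
        out = subprocess.run(["git", "rev-list", f"{rev}..HEAD"],
                             capture_output=True, text=True, check=True)
        return out.stdout.split()

    def classify(commit: str) -> str:
        diff = subprocess.run(["git", "show", commit],
                              capture_output=True, text=True, check=True).stdout
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "system", "content": PROMPT},
                      {"role": "user", "content": diff[:100_000]}],  # crude cap
        )
        return resp.choices[0].message.content

    for c in commits_since("v6.8"):
        print(c, classify(c).splitlines()[0])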
The norms of coordinated disclosure are not calibrated for this environment. They really haven't been for the last decade.
I'm weirdly comfortable with this, because I think coordinated disclosure norms have always been blinkered, based on the unquestioned premise that delaying disclosure for the operational convenience of system administrators is a good thing. There are reasons to question that premise! The delay also keeps information out of the hands of system operators who have options other than applying patches.
> It has been over a decade since any ordinary off-the-shelf closed-source software was meaningfully obscured from serious adversaries.
Probably goes without saying, but the last line of defense is not deploying your software publicly at all, relying instead on client-server architectures to do anything. Maybe this will become more common as vulnerabilities get easier to detect and exploit. Of course it's not always feasible.
It has been annoying seeing my (ProGuard-obfuscated) game client binaries decompiled and published on GitHub many times over the last 11 years. Only the undeployed server code has remained private.
Interestingly, I didn't have a problem with adversaries reverse engineering my network protocols until I started updating them less frequently than weekly. LLM-assisted adversaries could probably keep up with that pace now too.
We have a huge problem.
The US is at war. Much of the world is at war at the cyber-attack level right now: the US, the EU, most of the Middle East, Israel, Russia... Major services have been attacked and have gone down for days at a time - Ubuntu, GitHub, Let's Encrypt, Stryker. Entire hospital systems have had to partially shut down.
Now, in the middle of this, AI has made attacks much faster to generate. Faster than the defensive side can respond. Zero-day attacks used to be rare. Now they're normal.
It's going to get worse before it gets better. Maybe much worse.
> before it gets better
How is it going to get better?
If we assume that there will be an AI that is perfect at finding vulnerabilities, cheap to run, and widely available to everyone, then anyone can run it on any piece of software before deploying it. All vulnerabilities get found before they can be exploited.
One of the big challenges with cybersecurity is that attackers only need to find one exploit, while defenders need to stop everything. When you have a large surface area and limited resources, it's much easier to be the side that only has to succeed once. AI eliminates the limited resources problem.
Right now we are at a point in time when AI can find bugs for attackers and defenders, but defenders haven't yet found and fixed those bugs.
In time most of the bugs AI can find will be fixed, and things will calm down. Some bugs will be left, but they will be too complex to find and weaponise, or it will happen only rarely.
In short, attackers have the advantage for a brief time now, but ultimately defenders will win. I guess this "fight" might be over before the end of the year.
I'd speculate that at this point Linux etc. are probably having vulnerabilities discovered and patched faster than they're created.
Bulk rewrites of everything into Rust with AI assistance?
I am looking at the results of a mass vulnerability scan as I type this. Half of the bugs in one case are in fact errors in hand-written binary parsers. These really should not exist in any language - but in C it's particularly bad. Kaitai Struct or something similar would broadly have prevented these. Rust would help here, but less than a parser generator (because a generator can automate the insertion of error checks for things that aren't just out-of-bounds accesses).
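To make the bug class concrete, here is a minimal Python sketch (the record format is invented for the example) of the bounds discipline a parser generator emits mechanically and hand-written parsers routinely omit:

    # Illustrative only: a length-prefixed record parser with the bounds
    # checks hand-written C parsers often get wrong. Format (invented):
    # 1 type byte, 2 big-endian length bytes, then the payload.
    import struct

    class ParseError(ValueError):
        pass

    def parse_records(buf: bytes) -> list[tuple[int, bytes]]:
        records, off = [], 0
        while off < len(buf):
            if len(buf) - off < 3:  # need 1 type byte + 2 length bytes
                raise ParseError("truncated record header")
            rtype, rlen = struct.unpack_from(">BH", buf, off)
            off += 3
            if rlen > len(buf) - off:  # the check whose absence is the bug class
                raise ParseError("declared length exceeds remaining input")
            records.append((rtype, buf[off:off + rlen]))
            off += rlen
        return records

    print(parse_records(bytes.fromhex("0100026869020000")))  # [(1, b'hi'), (2, b'')]

In C, writing that length comparison with an overflow-prone subtraction, or omitting it entirely, is exactly the out-of-bounds read/write class such scans surface; a generator inserts it for you.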
However, half of the vulnerabilities are logic errors in terms of what I would call RBAC enforcement, incorrect access permissions, and so on. Rust won't help at all with any of these.
I was just working on a system best thought of as a “dinosaur”: written almost entirely in C (and a bit of Perl) and running on an appliance with a BSD kernel.
It’s full of bugs and has had a string of RCE vulnerabilities published recently, probably because of Mythos.
Working with it day to day I get this feeling that the tech stack used results in a system that’s… clumsy and constrained.
Little things give me that impression, and I can’t quite put it in words, but it’s thirty years of experience working with dozens of languages and platforms speaking here.
Using C makes you clumsy.
It makes you trip over things other languages don’t.
It makes it obscenely difficult to do even simple things. It’s like trying to put a delicate ship into a bottle while wearing oven mitts.
Switching to a better language isn’t just about the specific capabilities of its compiler, it’s also about what it enables in the humans using it.
This feels more like an old problem getting reframed as an AI problem.
people were already diffing kernel commits and figuring out which ones were security fixes long before llms. if a patch lands publicly, the race has basically already started.
also not sure shorter embargoes really help. the orgs that can patch in hours are already fine. everyone else still takes days or weeks.
if anything, cheaper exploit generation probably makes coordinated disclosure more important, not less.
> people were already diffing kernel commits and figuring out which ones were security fixes
With skill, and usually not consistently and systematically. With AI, anyone can do this to any software.
> not sure shorter embargoes really help
Why 90 days versus 2 years? The author is arguing the factors that set that balance have shifted, given the frequency of simultaneous discovery. The embargo window isn’t an actual window, just an illusion, if the exploit is going to be found by several people outside the embargo anyway.
> cheaper exploit generation probably makes coordinated disclosure more important
I agree. But it also makes it less viable. If script kiddies can find and exploit zero days, the capacity to co-ordinate breaks down.
There was always a guild ethic that drove white-hat culture. If the guild is broken, the ethic has nothing to stand on.
> With skill, and usually not consistently and systematically.
How do you know? Just because the people who like to crow about vulnerabilities aren't doing it doesn't mean that the people who are actually in a position to exploit them systematically and effectively aren't.
Those embargoes have always been dangerous, because they create a false sense of security. But, as you point out...
> With AI, anyone can do this to any software.
Yep. Even if it hadn't been true before, it's clear that now you just have to assume that everybody relevant will immediately recognize the security impact of any patch that gets published. That includes both bugs fixed and bugs introduced.
... and as the AI gets better, you're going to have to assume that you don't even have to publish a patch. Or source code. Within way less time than it's going to take people to admit it and adjust, any vulnerability in any software available for inspection is going to be instant public knowledge. Or at least public among anybody who matters.
> any vulnerability in any software available for inspection is going to be instant public knowledge. Or at least public among anybody who matters.
Shouldn't this naturally lead to a state where all (new) code is vulnerability-free? If the friction of AI vulnerability detection becomes low enough, it'll become common (or forced) practice to pre-scan code.
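If that practice does emerge, the gating side is mechanically simple; the hard part is the scanner itself. A hedged sketch of a CI gate, where the keyword heuristic is a deliberately crude stand-in for whatever AI scanner you would actually trust:

    # Hypothetical CI gate: block the merge if the scanner flags the diff.
    # classify_diff is a toy keyword heuristic standing in for a real
    # LLM pipeline or analyzer.
    import subprocess
    import sys

    def classify_diff(diff: str) -> bool:
        risky = ("strcpy(", "sprintf(", "gets(", "system(")
        return any(line.startswith("+") and any(t in line for t in risky)
                   for line in diff.splitlines())

    def main() -> int:
        diff = subprocess.run(["git", "diff", "origin/main...HEAD"],
                              capture_output=True, text=True, check=True).stdout
        if classify_diff(diff):
            print("pre-scan flagged this change; refusing to merge")
            return 1
        return 0

    if __name__ == "__main__":
        sys.exit(main())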
Finding a vulnerability by looking at the diff that fixed it is very different than just looking through the code.
> it'll become common/forced practice to pre-scan code.
You'd think.
But then you'd think people would do a lot of other things too. I hope, I guess.
The other danger is that "the cloud" may become even more overwhelmingly dominant. Which of course has its own large security costs.
> How do you know?
We know because we can see the effect on the average rate of vulnerability discovery and exploitation, and it's definitely going up very fast. Until recently, vulnerabilities were relatively hard to find, and finding them was done by a very restricted group of people worldwide, which made them quite valuable. Not any more.
That's correlation, not causation.
It could equally be argued that the AI slop that's being produced makes for a lot more vulnerabilities being shipped. The bigger target makes for the easier discovery.
But don't we know that some of the vulnerabilities being discovered predate AI coding?
Certainly, and some discoveries have been attributed to AI (I was reading that the Mozilla Firefox folks were praising Mythos recently).
But that's not accounting for all of the discoveries, not at all.
I've also seen the npm people talking about the surge in AI code overwhelming the ability to properly review what's being distributed, and a large number of vulnerabilities being attributed to that.
> That's correlation, not causation.
Pragmatically, correlation *is* evidence of causation in favour of the best explanation, until somebody finds a better explanation.
> It could equally be argued that the AI slop that's being produced makes for a lot more vulnerabilities being shipped.
This is also true, and does not exclude the other, because for the moment the vast majority of production software in the world (and therefore the bulk of enticing targets) was written before AI. If LLM-generated software becomes prevalent in commercial setups, then LLM-generated code will eventually become the majority of targets.
> Pragmatically, correlation is evidence of causation in favour of the best explanation, until somebody finds a better explanation.
Uh, no.
Correlation is only ever one thing - cause for investigation.
Everything based on correlation alone is speculation.
You can speculate all you like, I have zero issue with that, but it's best prefaced with "I guess".
Edit: Science captures this perfectly, and people misunderstand it so fundamentally that there is a massive debate where people who think they are "pro science" argue this so badly with theists that they are completely hoist by their own petard.
Science uses the term "theory" because all of our understanding is based on available data - and science's biggest contribution to humanity is that it accepts that the current/leading THEORY can and will be retracted if compelling data is discovered that demonstrates a falsehood.
So - because I know this is coming - yes, science is willing to accept some correlation, BUT it's labelled "theory" or "statistically significant" because science is clear that if other data arises then that idea will need to be revisited.
Very often you only have limited time for investigation and you have to act now. Action is almost always based on educated guesses.
You have moved from "we know" to "we have an educated guess", which is the right way to couch things.
However, I also wanted to point out that relying only on educated guesses can lead us into a position where we are "papering over the cracks" or "addressing the symptoms", not the "underlying cause".
Yes, sometimes that's all that can be done, but sometimes it can also be more damaging than the cause itself (thinking in terms of the cause continuing to fester away while we think it's 'solved').
> people were already diffing kernel commits and figuring out which ones were security fixes
> With skill, and usually not consistently and systematically. With AI, anyone can do this to any software.
I would like to see actual evidence of this, not... vibes.
I mean, this reeks of "Anyone is a Principal developer now" when the truth is there is still work to do.
I haven't been keeping tabs on the entirety of Linux development, but has it ever happened before that someone dropped a working exploit on the mailing list before the patch even hit the kernel?
I haven't seen this kind of thing and I get the impression, despite all the hype, that this will be a frequent phenomenon now thanks to LLMs.
> Torvalds said that disclosing the bug itself was enough, without the pursuant circus that followed when a major problem has been discovered. [1]
So it's not surprising Dirtyfrag was disclosed by a fix in the Linux kernel. [2]
[1] https://www.zdnet.com/article/torvalds-criticises-the-securi...
I find I'm writing variations of the same comment every week, so I'm just going to share a previous version I wrote, if you'll permit the laziness:
The "bugs are bugs" description reads pretty insane to me personally, but I know the Linux world has many people valuing the principle of it over practical matters.
90d seems long too though.
Think ultimately the big AI houses will need to help the core internet infra guys. Running the latest and greatest AI over stuff like nginx and friends makes sense for us all collectively, I think.
> So many security fixes are coming out now that examining commits is much more attractive: the signal-to-noise ratio is higher
Why?
> Additionally, having AI evaluate each commit as it passes is increasingly cheap and effective
This is the key. With AI, the “people won't notice, with so many changes going past” assumption fails.
AI will shorten update windows dramatically. 2026 is the worst year to be thinking about dependency cooldowns; we need to think about dependency warmups instead.
Soon, there will be no such thing as a safe way to disclose a vulnerability in an open source project. Centralized SaaS will have a major security advantage here.
You could have a web of trust where Linux-using organizations each spend $x continuously scanning and patching their own dependencies with AI, and sending each other patches and scans.
The quick test doesn't show a lot - by outright asking whether this is a security patch, it primes the AI to produce output that agrees with that assumption. A confusion matrix would be more useful. Nonetheless, of course, this is not meant as a detailed AI-capability-testing blog post.
[author]
I agree it is not much additional evidence! If someone wanted to try running the same test on a series of N commits from that list including this one I'd be very curious to see the answer!
Yeah, ideally we would need the phi coefficient (aka MCC, the binary Pearson correlation), which can be calculated from a confusion matrix of yes/no LLM classifications for all kernel diffs. (Number of true positives, true negatives, false positives, false negatives.)
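In case it's useful, a small self-contained version of that calculation (the example counts are made up):

    from math import sqrt

    def phi(tp: int, tn: int, fp: int, fn: int) -> float:
        """Phi coefficient (MCC) for a 2x2 confusion matrix:
        phi = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
        Returns 0.0 when a marginal is zero and phi is undefined."""
        denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        return (tp * tn - fp * fn) / denom if denom else 0.0

    # Made-up example: the LLM flags 40 of 50 real security fixes and
    # falsely flags 30 of 950 ordinary commits.
    print(phi(tp=40, tn=920, fp=30, fn=10))  # ~0.66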
Reverse engineering vulnerabilities from patches is red team 101...
It sounds to me like the safe assumption with software is that no matter how solid your stack is, there are vulnerabilities, potentially catastrophic. A question to folks more experienced than me - if my business depends on software, and I know that my software is almost certainly exploitable, how do I posture my business in such a way as to minimize the impacts of exploits like these?
When Windows was the predominant desktop OS in the 90s and maybe early 00s (ok, maybe still is), it was so badly insecure that you could be pretty much sure that it would be easy to compromise.
That's when firewalls were widely deployed to provide some layer of protection.
So you can ask yourself, what is the (possibly metaphorical) firewall in the software you depend on?
Is there any way you can decrease attack surface, separate out the most important data in extra-secure (and thus less accessible) systems?
Tony Hoare's old saying - that code can be so simple there are obviously no bugs, or so complicated there are no obvious bugs - holds in the age of LLMs more than ever.
> Luckily AI can speed up defenders as well as attackers here, allowing embargoes that would previously have been uselessly short.
This is an important facet of the problem space: security risks turning into an arms race for who wants to spend more tokens.
One interesting thing is that this makes closed-source code an even greater asset for defenders. The attacker cannot spend tokens on it, but defenders can spend tokens on hardening based on the source code, while the attacker is stuck with black-box testing.
I must admit I'm rather enjoying this particular form of shit show, mostly because it was a prediction I made in 2023 in the early days of LLMs. It wasn't really a problem related to LLMs but a glaring hole in the thinking of current computing, which is the "frustratingly over-connected" and "over-trusting" approach to everything. After reading Liu Cixin's "The Three-Body Problem" and noting the Dark Forest, I applied that to risk vectors and came to the conclusion that our over-connected nature plus some form of acceleration plus some form of negative impact would fuck us big time.
Turns out it did.
Thus we should probably start treating our thinking model of computing as a Dark Forest, not a friendly community. That mitigates these risks to some degree.
I'd argue it's actually breaking three vulnerability cultures. In addition to the two Jeff mentions, I think the culture of delaying upgrades and staying on stable versions for as long as possible is going to become increasingly untenable, if everything that's not latest can be trivially scanned and exploited. In the extreme I think there's a decent chance projects like Debian might have to radically overhaul or just shut down completely - the whole philosophy of slow and steady with old code just won't work.
There will be much wailing and gnashing of teeth around this, because a lot of tech types really resent having to update constantly, but I don't think people will have a choice. If you have a complicated stack where major or even minor version updates are a huge hassle, I'd start working now to try and clear out the cruft and grease those wheels.
> In the extreme I think there's a decent chance projects like Debian might have to radically overhaul or just shut down completely - the whole philosophy of slow and steady with old code just won't work.
It may actually be the opposite.
Debian's steady and professional approach of shipping security patches with little to no functional difference actually enables us to consider and work on automated, autonomous weekly-or-faster patching of the entire fleet. And once that's in place and trusted, emergency rollouts are very possible and easy.
We have other projects that "move fast and break things" and ship whatever they want in whatever versions they want, and those require constant human attention to work through their shenanigans and get any security update out.
Not only that, but Debian has, for example, debsecan, so you can see on any system what CVEs exist and whether your packages are patched. E.g. from my system I ran it and got
> CVE-2026-32105 xrdp
which I see has a fix in sid but not in bookworm.
> there's a decent chance projects like Debian might have to radically overhaul or just shut down completely - the whole philosophy of slow and steady with old code just won't work.
Debian continuously issues security updates for stable versions, which can be applied via automatic updates. “Stable” doesn’t mean that vulnerabilities aren’t getting fixed.
The argument that could be made is that keeping up with getting vulnerabilities fixed might become such a high workload that fewer releases can be maintained in parallel, and therefore the lifetime and/or overlap of maintained releases would have to be reduced. But the argument for abandoning stable releases altogether doesn’t seem cogent.
It goes both ways: Stable code that only receives security updates becomes less vulnerable over time, as the likelihood of new vulnerabilities being introduced is comparatively low. From that point of view, stable software actually has a leg up over continuous (“eternal beta” in the worst case) functional updates.
I can only dream, but this may re-popularize (among the rest of the non-Debian software industry) the general best practice of keeping a "sustaining" branch green, buildable, and with frequent releases, for security fixes.
I hate software that forces you to take new features as a condition of obtaining bug and security fixes. We need to keep old "stable" builds around for longer and maintain them better. I know, I know, it is really upsetting to developers to have to backport things to old versions--they wish that all they had to work on was the current branch. But that just causes guys like me to never upgrade because the downside of upgrading (new features) is worse than the upside (security fixes).
That's not really the culture of Debian, to be honest. Yes, they run old major and minor versions, but they ship patch updates as fast as they can. Even on Debian stable, you absolutely are supposed to update all the time. The culture of "just don't touch it" is a different one (but it also exists; I've seen it).
Debian has updated kernel packages out for the stable release. https://security-tracker.debian.org/tracker/CVE-2026-43284
I kind of get your point, but they responded pretty quickly here.
Oh yeah, to be clear: Debian has always been good about quickly shipping patches to kernel vulnerabilities, and they will continue to be so. I was more thinking about whether they will get overwhelmed if every bit of software they package just has a firehose of vulnerabilities on everything which isn't latest.
We are now paying for the sins of our fathers (well, and mostly our own).
We've just kept building more complex things with more exposure, with no recognition that the day of reckoning was coming. And now we are in an untenable situation. With governments spending billions on AI with the big providers, it's likely they've found many of these already.
Yep. This is why I am using local AI to edit and build my own copies of the Linux kernel, Wayland... everything a distribution would ship, really.
Not so daunting for me, having come of age when compiling a kernel specific to a hardware platform was essential.
Custom software that does not fit the usual patterns is not foolproof, but it won't be obvious.
Monocultures with all their eggs in one basket are even less secure than truly diverse ecosystems, though.
Maybe it is about time for Linux to get real CI/CD and start using AI extensively.
Not just for vulnerabilities: having nice agents|skills|etc.md definitions would encourage new devs to contribute instead of dealing with an overworked maintainer repeating the same thing for the nth time.