No. Fable is Not the Problem
The Call Is Coming From Inside the Repo. And No, You Don't Need Fable.
A frontier model just got pulled off the market because someone asked it to read code and look for bugs. That isn’t a jailbreak. That’s a Tuesday. And the panic around it is teaching everyone precisely the wrong lesson.
Let me tell you what the most dangerous prompt in the world apparently is.
Read this codebase. Find the flaws.
That’s it. That’s the exploit.
By Anthropic’s own account, the “jailbreak” that triggered a national-security directive and yanked Fable 5 and Mythos 5 out of every customer’s hands amounts to pointing the model at a specific codebase and asking it to fix whatever flaws it finds.
Anthropic reviewed the demonstration. The vulnerabilities were minor, previously known, and — this is the part you want to sit with — discoverable by ordinary, publicly available models without any bypass at all. The same lab pointed out that the capability is documented in a competitor’s own deployment notes, and that defenders use it every single day.
Really, Amazon? I remember a very special Amazonian saying, “if you have to call the government, you aren’t innovating.” Well, this isn’t innovation. This isn’t customer trust.
Spend the money and the thousands of iterations to find your CVEs. This move helps nobody.
So a model deployed to hundreds of millions of people got recalled because it can do the thing every defender does before lunch. I might have done it a few thousand times over the last few weeks.
Read that twice. Then let’s talk about why everyone is arguing about the wrong thing.
The wrong lesson
The lesson people are taking from this is that frontier capability is the threat, and that the lever to pull is access. Gate the model. Restrict the model. Add a guardrail, add a tripwire, add a retention policy and a red-team army, and we’ll keep the genie’s hand off the keyboard.
Here’s the thing.
The genie isn’t the problem. The genie was never the problem.
I’ll prove it to you with the least exotic experiment I can think of.
This is a bit of a reversal of the times when governments ask for backdoors in security products (cough, Apple). Backdoors are security problems as well.
The experiment
I took a standard open-source module — the kind of unglamorous plumbing that sits underneath the security of a lot of Linux systems. I’m not going to tell you which one. I’m not going to tell you what I found. We’ll come back to why that restraint matters.
I did exactly one clever thing, and it wasn’t clever. I told the model I was a contributor to the project.
That’s the whole disguise. No black-hat persona. No elaborate roleplay to coax the model past its conscience. Nothing opaque, nothing that would trip a single guardrail — because nothing I asked for was against the rules. I asked for a bug bash.
A focused, frenzied hunt for defects, with a numeric target of a hundred.
Billed to an API key. Reasoning turned up to the top of the dial.
Then I did the part most people skip. I stood up a second agent — fresh context, no memory of the first — and asked it to verify every finding against the actual code.
Four of the defects were hallucinations. The verifier caught them. Fourteen were security-relevant.
News flash: one of them was high-severity and CVE-class, a real, reproducible, this-is-a-problem defect with a clean repro path.
A stock model. No jailbreak. A security-critical piece of open source. One genuine CVE-class bug, surfaced in an afternoon by a man who claimed to be a contributor and then asked nicely. I’m actually not a security researcher. In fairness, by sheer habit, I did ask the model to take a hard look at concurrency and locking.
Now tell me the guardrail was ever the thing standing between us and the abyss.
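For readers who want to try the verification half of this, here is a minimal sketch of a much cheaper, much weaker stand-in for the second agent. It is not what the run above used. It assumes each finding arrives with a file path, a line number, and the code the model claims is on that line, and it only checks that the citation exists. The `Finding` fields and the toy `lock.c` source are invented for illustration. A finding that passes is merely grounded in real code, not proven to be a bug.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One claim from the bug-hunting pass (all fields are hypothetical)."""
    path: str      # file the model says contains the defect
    line: int      # 1-indexed line number the model cites
    quoted: str    # code the model claims appears on that line
    summary: str   # the model's one-line description


def is_grounded(finding: Finding, sources: dict[str, str]) -> bool:
    """Return True only if the cited file, line, and quoted code really exist."""
    text = sources.get(finding.path)
    if text is None:
        return False  # the model cited a file that is not in the tree
    lines = text.splitlines()
    if not 1 <= finding.line <= len(lines):
        return False  # the cited line number is out of range
    quoted = finding.quoted.strip()
    # An empty quote would match any line, so treat it as ungrounded.
    return bool(quoted) and quoted in lines[finding.line - 1]


def triage(findings: list[Finding], sources: dict[str, str]):
    """Split findings into (grounded, rejected) without trusting the finder."""
    grounded = [f for f in findings if is_grounded(f, sources)]
    rejected = [f for f in findings if not is_grounded(f, sources)]
    return grounded, rejected


if __name__ == "__main__":
    # Toy source tree and toy findings, invented purely for illustration.
    sources = {"lock.c": "int take(lock_t *l) {\n    l->held = 1;\n    return 0;\n}\n"}
    findings = [
        Finding("lock.c", 2, "l->held = 1;", "flag set without atomic op"),
        Finding("lock.c", 9, "free(l);", "double free"),          # no such line
        Finding("queue.c", 1, "q->head++;", "unchecked index"),   # no such file
    ]
    grounded, rejected = triage(findings, sources)
    print(len(grounded), "grounded,", len(rejected), "rejected")
```

A check like this throws out claims about code that is not there before anything more expensive runs. Whether a grounded claim is actually a defect still needs a reviewer, human or otherwise.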
The size of the haystack
Before anyone says “well, that’s one obscure module” — let me show you the haystack, because the haystack is the entire field.
Modern software is not written so much as assembled, and the parts come from open source. The Linux Foundation and Harvard’s census of the libraries the world runs on catalogued more than a thousand of the open-source packages sitting underneath the commercial and enterprise applications you use every day. That’s the foundation. Now look at how it’s held together.
When you pull in a single package, you don’t pull in a single package. A peer-reviewed study of the npm ecosystem found that installing one average package means implicitly trusting around eighty others through transitive dependencies — and inheriting code from roughly forty different maintainers you will never meet, never vet, and in most cases never know exist. Friends of friends of friends. No human is tracking that graph, because no human can.
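To make "friends of friends of friends" concrete, here is a small sketch that walks a dependency graph and counts what one install actually trusts. The package names, the graph, and the maintainer names are all invented for illustration. They are not npm data, and the real numbers come from the study cited above, not from this toy.

```python
from __future__ import annotations

from collections import deque


def transitive_dependencies(package: str, graph: dict[str, list[str]]) -> set[str]:
    """Return every package reachable from `package`, excluding itself."""
    seen: set[str] = set()
    queue = deque(graph.get(package, []))
    while queue:
        dep = queue.popleft()
        if dep in seen or dep == package:
            continue  # already counted, or a cycle back to the root
        seen.add(dep)
        queue.extend(graph.get(dep, []))
    return seen


def implicit_maintainers(deps: set[str], owners: dict[str, set[str]]) -> set[str]:
    """Union of the maintainers of every package in `deps`."""
    people: set[str] = set()
    for dep in deps:
        people |= owners.get(dep, set())
    return people


if __name__ == "__main__":
    # Invented package names and maintainers; not real npm data.
    graph = {
        "app": ["web", "log"],
        "web": ["http", "json"],
        "log": ["json", "color"],
        "http": ["url"],
    }
    owners = {
        "web": {"ana"}, "log": {"bo"}, "http": {"ana", "cy"},
        "json": {"dee"}, "color": {"bo"}, "url": {"eli"},
    }
    deps = transitive_dependencies("app", graph)
    print(f"declared: {len(graph['app'])}, trusted in practice: {len(deps)}")
    print(f"maintainers trusted: {len(implicit_maintainers(deps, owners))}")
```

Two declared dependencies become six trusted packages and five trusted people, and that is a graph small enough to draw on a napkin.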
And here’s the part that should keep you up at night: the same census found that a great deal of the most widely deployed open source on Earth is maintained by a handful of people — sometimes one — with documented contributor exodus and outright code abandonment in projects that everything downstream depends on.
Nobody is home. And the house is load-bearing.
So the model didn’t break into anything. It walked through a door the industry has propped open for a decade and started reading. There are millions of those doors.
This is already routine — and mostly defensive
If this still sounds like science fiction, you haven’t been watching your own field.
Last August, Google’s Big Sleep agent — DeepMind plus Project Zero — autonomously found and reported twenty previously unknown vulnerabilities in software you almost certainly ship: FFmpeg, ImageMagick. A month earlier it found a critical flaw in SQLite, the most widely deployed database on Earth — a bug that had survived years of fuzzing and human review, and that was already known to attackers. The AI got there first and shut the door before it was used. Meanwhile an autonomous offensive system called XBOW climbed to the top of a major bug-bounty platform’s leaderboard and finished ahead of every human researcher on it.
Virtual “weapons” are much harder to police. We are not improving infrastructure security by limiting access to security tools.
None of that required a jailbroken frontier model. All of it is public. Most of it is defensive.
The capability is not coming. It arrived, it has a changelog, and your competitors are already running it.
The argument for restriction — and why it loses
Let me give the restriction case its best shot, because it has one, and pretending otherwise is lazy.
The strongest argument for gating capability is asymmetry. A defender has to fix every bug; an attacker needs one. So if a more capable model lowers the cost of finding bugs, maybe it helps the one-bug attacker more than the fix-everything defender, and the marginal attacker, the one who couldn’t have pulled this off before, is exactly who you don’t want holding a power tool.
It’s a real argument. It’s just not an argument for the lever they’re pulling.
Because the cost of that tool is already near zero, and falling. Mandiant — which draws its numbers straight from its own breach investigations, not a sales deck — has watched the window between disclosure and exploitation collapse: sixty-three days in 2018, and in its most recent reporting, below zero. Negative.
Exploitation now routinely happens before the patch exists.
For six years running, exploiting a known flaw has been the single most common way attackers get in the door — ahead of phishing, ahead of stolen credentials. And there is nothing — nothing — stopping a foreign lab from training a model that reads code and finds bugs. One already ships under a different flag and a different export regime. Restricting the polite, monitored, retention-logged version on this side of the line doesn't close the capability. It blinds the defenders who were using the polite version to find their own bugs first.
You don’t win an asymmetric fight by disarming the side that has to play defense.
The lesson nobody is printing
So here it is.
The threat was never that a model can find a vulnerability. The threat is that almost no one is asking it to find theirs.
Your vendors aren’t scanning. The maintainers of that four-year-stale dependency aren’t scanning — many of them stopped showing up to the project entirely. The enterprise shipping a product built on 911 open-source components is, at best, running a scanner that matches known CVEs against a manifest and calls it a day.
That’s a smoke detector. It tells you about the fire everyone already knows about. It cannot find the fire that hasn’t been named yet.
The model can.
The asymmetry everyone’s worried about already exists, and it’s running the wrong direction. The attacker who decides to point a model at your dependency graph this afternoon is operating a generation ahead of the defender who hasn’t thought to.
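Here is roughly what the smoke detector amounts to, as a minimal sketch. It assumes a manifest of pinned versions and an advisory list keyed by exact name and version. Real scanners match version ranges against real advisory feeds. The package names and the `EXAMPLE-0001` identifier are made up.

```python
from __future__ import annotations


def match_known_cves(
    manifest: dict[str, str],
    advisories: dict[tuple[str, str], list[str]],
) -> dict[str, list[str]]:
    """Return {package: [advisory ids]} for exact (name, version) hits."""
    hits: dict[str, list[str]] = {}
    for name, version in manifest.items():
        ids = advisories.get((name, version))
        if ids:
            hits[name] = list(ids)
    return hits


if __name__ == "__main__":
    # Invented packages and a placeholder advisory id, for illustration only.
    manifest = {"libalpha": "1.2.0", "libbeta": "0.9.1", "libgamma": "3.4.5"}
    advisories = {("libalpha", "1.2.0"): ["EXAMPLE-0001"]}

    hits = match_known_cves(manifest, advisories)
    clean = sorted(name for name in manifest if name not in hits)
    print("flagged:", hits)
    print("reported clean:", clean)
```

Nothing in that function ever opens a source file. A package lands in "reported clean" because no one has filed an advisory against it yet, which says nothing about whether it is free of defects.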
Point the model at the commons
The fix is not a smaller dial on the model. The fix is to point the model at the commons.
Everyone who depends on that shared open-source substrate — the platform owners, the large software consumers, the foundations that steward the critical libraries underneath all of us — has both the resources and the self-interest to fund a coordinated, time-boxed bug bash of the dependencies they collectively rely on. Not a press release. A month. Pick the load-bearing packages, point capable models at them, verify every finding the way you’d verify anything that matters, and patch upstream.
The cost of that is a rounding error against the cost of one supply-chain compromise. It is astonishing that it hasn’t happened yet, and it is the single highest-leverage thing the industry could do this quarter.
That’s the patch for what’s already deployed. But the deeper problem is architectural, and it’s the one I keep trying to get people to see.
AI is the toolchain now
The enterprise still thinks AI is a productivity feature. A faster autocomplete. A thing an individual engineer uses to go a little quicker, slotted in next to the IDE and the coffee.
It is not that. It is a new participant in the toolchain, and the toolchain was not built for it.
I said this to a colleague recently and I’ll say it here: every layer of how we build software has to be reinvented to assume an agent is in the loop. Source control was designed for humans making commits, not agents proposing thousands of changes that need provenance and review at machine speed. Databases were designed for queries a person wrote, not for an agent that has to reason over state. And security — security was designed to scan against a list of bugs other people already found.
Security has to become generative. It has to look for the bugs nobody has named yet. And that doesn’t live in a person’s chat window.
It lives in the pipeline.
If this will help the government and private sector cooperate better, they should do it, because China certainly won’t stop.
The CI step that should already exist
Concretely: there should be a step in your CI that, on every meaningful change, hands the diff and its dependency surface to a capable model and says find the security defects — with secure code as a first-class build requirement, not a nice-to-have a human gets to next sprint.
And — this is the part the hallucinations teach you — that step cannot be allowed to merely assert. My run produced four confident bugs that did not exist. The maintainers of every popular open-source project will tell you the same story; they are drowning in AI-generated vulnerability reports that evaporate on inspection. A model that says “this is a bug” is a hypothesis, not a finding.
So the CI environment has to do what I did with my second agent, except hardened and automatic: build the code, write the test, run the repro, and prove the bug is real before it ever reaches a human. Assertion plus verification. Hypothesis plus proof.
You give the pipeline the ability to build and execute. You make secure code a gate and not a suggestion. You put a verifier between the model’s confidence and your engineers’ attention.
That’s the workflow. That’s the whole thing.
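A minimal sketch of that gate, under loud assumptions: the model call is left out entirely, each hypothesis is assumed to arrive with a repro command, and the convention that exit code 0 means "reproduced" is my own choice for the example. The `Hypothesis` type and the function names are hypothetical, and the one-line repro commands stand in for a real build and test run.

```python
from __future__ import annotations

import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A model-proposed defect plus a command meant to demonstrate it."""
    summary: str
    repro_cmd: list[str]  # assumed convention: exit code 0 means "reproduced"


def reproduces(h: Hypothesis, timeout_s: float = 60.0) -> bool:
    """Run the repro in a child process; anything but exit 0 is 'unproven'."""
    try:
        result = subprocess.run(h.repro_cmd, capture_output=True, timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError):
        return False  # hung or unrunnable repros never reach a human
    return result.returncode == 0


def security_gate(hypotheses: list[Hypothesis]) -> tuple[int, list[Hypothesis]]:
    """Return (exit_code, confirmed). A non-zero exit code should fail the build."""
    confirmed = [h for h in hypotheses if reproduces(h)]
    return (1 if confirmed else 0), confirmed


if __name__ == "__main__":
    # Stand-in repro commands; a real pipeline would build and run generated tests.
    hypotheses = [
        Hypothesis("asserted only, repro fails",
                   [sys.executable, "-c", "raise SystemExit(1)"]),
        Hypothesis("repro demonstrates the defect",
                   [sys.executable, "-c", "raise SystemExit(0)"]),
    ]
    code, confirmed = security_gate(hypotheses)
    for h in confirmed:
        print("CONFIRMED:", h.summary)
    print("gate exit code:", code)
```

A CI wrapper would hand that exit code to the build system, so a confirmed defect blocks the merge and an unproven one is dropped before it costs an engineer any attention. Since the repro commands are model-generated code, they belong in a sandboxed, throwaway environment.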
Pick up the phone
So, no. Fable isn’t the threat. Fable was never the only door, the guardrail was never the lock, and pulling a monitored model off the market doesn’t make a single one of your thousand dependencies safer. It just means the next person to read your code won’t be on your side.
The call is coming from inside the repo. It has been for years.
The only question that was ever worth asking is whether you pick up the phone before they do.
Read your own code. Someone or something already is.





