The Average Path, Now With Better Grammar
I’ve been using Opus 4.7 intensively for a day. Let me tell you what I think.
I was sitting there, listening to Cola Falls by The Mary Onettes. It’s a great song; I highly recommend it.
As I was verbally abusing Opus for the 13th time in the last hour, I decided, counter to the fanboys who are announcing that Opus 4.7 will conquer the world, to publish a few thoughts…
It’s better. It really is.
Individual tasks come back tighter, cleaner, with fewer of those little tells that make you squint and mutter “what are you doing, friend?” at your screen at 11pm on a Tuesday. Correctness, on the task level, has moved. I’m not going to pretend it hasn’t.
Now. Here’s the thing, and it’s not a small thing.
The strategic mindset — the part that looks at a problem and says no, the whole framing is wrong, what we should actually be building is something else — that part isn’t there.
Neither is multi-repo coordination and support. And multi-agent coordination is still not friction-free. Really?
SendAgentMessage isn’t available mid-stream? Can’t sneak a message into that agent’s context window? Something seems broken in the agent coordination and orchestration mechanism.
I still have to bring the inspiration.
I still have to bring the architecture.
I still have to be the one who says “stop, back up, you’re solving the wrong problem.”
Because left to its own devices, the model will confidently, articulately, lovingly produce the average answer to the question I literally asked — not the question I should have been asking.
I wrote about this a while back in The Average Path. The thesis holds. Opus 4.7 is a more accurate average path. It is not a different path. It is not your path.
It is the middle of the road, paved with better asphalt.
And listen — the middle of the road is fine. For a lot of work, the middle of the road is exactly what you want.
But if you came here because someone on a podcast told you the software industry is doomed, I have news for you, and the news is good: it is not doomed. It is, in fact, doing just fine, thank you, because turning vision into systems — real systems, systems that survive contact with customers — is not something that falls out of a language model when you ask it nicely.
You still have to know what you want. You still have to know why. And if you hand a model your half-formed intuition and say “proxy my vision,” you will pay for that. Not metaphorically. Actually. In tokens, in rewrites, in three-week detours that end with you reverting to the branch you had before you started.
The natural language compiler only compiles what you articulate. Garbage vision in, eloquent garbage out.
About Mythos
This is the part where Anthropic waves Mythos at me. And I want to be fair about this, because I’m about to be unfair.
Mythos found a zero-day in FreeBSD.
That’s real. That’s a genuine achievement, and I’m not going to stand here and pretend otherwise — this is a meaningful moment for the industry. Full stop.
But.
(You knew there was a but.)
They ran the thing thousands of times. The cost was roughly ten thousand dollars. And I will take the bet, right now, today, in writing, that if you ran Opus 4.7 a couple thousand more times at maybe twenty-five thousand dollars in compute, you’d find the same zero-day. Or one just like it. Because what we are describing, when we describe this, is not a new kind of cognition.
It is a generator and a roulette wheel and enough spins to land on red.
That’s not a criticism of the achievement — it’s a description of the achievement.
The right read here is token optimization. The right read is affordability. The right read is: security researchers have been producing zero-days for roughly ten thousand dollars of labor for years, and the resulting vulnerabilities have generated billions in value. So before we gasp at the economics, let’s actually look at the economics.
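If you want to look at the roulette-wheel economics with a pencil, here is a minimal Python sketch. It treats every run as an independent spin with the same small chance of landing on a real vulnerability, which is a simplification. The per-run hit rate and per-run cost below are hypothetical placeholders I made up for illustration, not figures from Anthropic or anyone else.

```python
import math


def at_least_one_hit(p_per_run: float, runs: int) -> float:
    """Chance that at least one of `runs` independent attempts succeeds."""
    return 1.0 - (1.0 - p_per_run) ** runs


def runs_needed(p_per_run: float, target: float) -> int:
    """Smallest run count whose chance of at least one hit reaches `target`.

    Assumes 0 < p_per_run < 1 and 0 < target < 1.
    """
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_per_run))


def expected_cost(p_per_run: float, cost_per_run: float) -> float:
    """Expected spend until the first hit (mean of a geometric distribution)."""
    return cost_per_run / p_per_run


if __name__ == "__main__":
    p = 0.001    # hypothetical: one run in a thousand finds something
    cost = 5.0   # hypothetical: dollars of compute per run
    n = runs_needed(p, 0.95)
    print(f"runs for a 95% chance of a hit: {n}")
    print(f"chance of a hit in {n} runs: {at_least_one_hit(p, n):.3f}")
    print(f"expected spend to first hit: ${expected_cost(p, cost):,.0f}")
```

Under this model, a weaker generator with a lower hit rate still gets there. It just needs more spins, and the argument becomes one about the price per spin.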
Also, between us: was the name an accident? Mythos? Was that on purpose? Nobody in that room said “hey, maybe we shouldn’t name the model after a word that means things people made up”? Nobody? All right. Moving on.
Productivity. Not replacement.
This is a productivity play. Not a replacement play. A productivity play. And I say that as someone who uses these models heavily, every day, in anger, in production, for real work.
The tools are getting better. My leverage is going up. My need to hire is going — actually, my need to hire is going up too, because now I can start more things, and starting more things means more people, not fewer.
That’s the secret of productivity. It’s always been the secret of productivity.
But “productivity tool” does not move a valuation the way “replacement for all knowledge work” moves a valuation. So the CEOs keep saying AGI. They keep fearmongering about total replacement.
They keep describing a future in which I, personally, am obsolete by the end of the fiscal year. And I look at my terminal, where I have just spent forty-five minutes patiently explaining to the most advanced model on the market that no, we do not want to refactor the whole module, we want to fix the one line, and I find myself unmoved.
I find myself, in fact, a little tired.
The actual problem
Put a pin in the valuations for a minute, because there’s something else. Something I think is more interesting, and nobody’s talking about it.
From one of my sources at Anthropic, I know that ten percent of the users are generating ninety percent of the tokens.
Ten percent.
That is not a technology problem. That is not a model problem. That is not something that gets fixed by Opus 4.8 or Opus 5 or whatever we’re calling the next one.

That is an adoption problem. That is a long-tail problem. That is a “how do we get the other ninety percent of the people past the blinking cursor” problem.
And it is the actual problem. And it is not being solved by benchmarks. It is, in fact, being actively not solved by benchmarks, because the people writing benchmarks are not the people staring at the blinking cursor.
Benchmark-obsessed versus customer-obsessed. I keep saying it, it keeps being true, nobody keeps doing anything about it.
The battle is quietly shifting away from raw capability and toward token optimization and breaking down the adoption barriers, starting with a better customer experience and consumption model.
I wrote a piece recently called Allbirds is Pivoting to AI Infrastructure. The title is a joke. The underlying point is not. When the narrative gets this overheated, everyone pivots to infrastructure, nobody pivots to the customer, and the companies that eventually win are the ones who remember which side of the equation actually signs the checks.
Where I land
The capability plateau is real. Not because the models aren’t improving — they are, measurably, and I’m grateful for it.
The plateau is real because “better average” is bumping up against the ceiling of what “average” can do for you when what you need is judgment. And judgment doesn’t live in the weights.
It lives in the person at the keyboard who knows what they’re trying to build and why.
Which, apparently, is still my job.






