The Last Ten Percent

Earlier this year I built an internal service at work — a tool to assist our dev teams. Nothing glamorous. The kind of project that, three years ago, would have been a quarter's roadmap: a kickoff, a design review, a slow accretion of pull requests, something demo-able by month three if nothing slipped.

It stood up in a fortnight.

I don't mean a prototype held together with hope. I mean working code. Real tests — the kind that catch things, not the kind that exist to make a coverage number. A CI/CD pipeline wired end to end, so that a merged change was a deployed change, no ceremony in between. I remember watching the pipeline go green on the first full run and just sitting there for a second. I have been doing this long enough to know what two weeks used to buy you. Two weeks used to buy you a document and an argument about the document. This bought a running service.

The demo went the way you'd hope. Nobody asked whether it would work — they could see it working. They asked when they could have it. Someone asked what else we could build this way, and it was a real question, not a polite one. I walked out of that meeting believing, sincerely, that the shape of my job had changed. I still believe that.

But that was months ago, and I am still working on it. Not because it broke — it never really broke. Because of everything that comes after "it works": the refinements, the bug fixes, the interface that was exactly right until someone actually used it.

The first 90% took 10% of the time. The last 10% took the rest.

The Joke Became a Description

There's an old line about this. Tom Cargill, at Bell Labs, put it this way in 1985: "The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time."

The ninety-ninety rule was a joke. It was built as one. The math doesn't work — that's the punchline — and it was a way engineers ribbed each other about optimism: you think you're almost done, you are not almost done, you have never once been almost done. Both nineties couldn't literally be true, and everyone laughing knew it. The rule wasn't a description of software so much as a description of us, of the specific, incurable hopefulness you need in order to start a project at all.

I've repeated it for most of my career the way you repeat a proverb: not because it's precise, but because it's aimed correctly.

Then the joke stopped exaggerating.

AI didn't break the ninety-ninety rule. What happened is stranger: the rule's meaning inverted. The first 90 percent now genuinely is fast — that isn't optimism talking anymore, it's just what the work does now. Which means the "other 90 percent" is no longer a punchline about your bad estimates. It's the plan of record. It's where the schedule actually lives. My internal service is the rule played completely straight: the absurd math turned out to be an accurate ledger of where the months went.

The rule used to warn you about your estimates. Now it tells you where your job went.

But that only sharpens the question the joke used to wave past. If the first 90 percent got fast, why didn't the last 10 percent get fast with it? Same model, same tools, same me. Something in the long tail refuses to compress, and it's worth being precise about what.

Knowing What Right Looks Like

Start with what the long tail was actually made of. On my project it was mostly interface work — reworking screens after real people had touched them. And here is the thing about that work in the age of the model: I was never once waiting on a candidate. I could ask for another version of a screen and have it almost instantly. Another after that. Ten, if I wanted, each one plausible, each one defensible, each one the kind of thing you could put in front of a room without embarrassment.

One of them I nearly shipped. A dashboard-style landing view, and it demoed beautifully — status up top, recent activity summarized, everything a screenshot could want. It survived right up until I watched someone actually use it. The one action people opened the tool to do, the thing they reached for a dozen times a day, sat two clicks down, underneath everything that merely looked important. The layout was organized around what the tool knew instead of what the person wanted. Nothing about it was broken. It was just wrong.

Finding that wrongness was the slow part. Not generating the fix — the fix took minutes once I could name the problem. The slow part was the naming. Knowing which of the plausible candidates is correct, and being able to articulate why the current one isn't, is a different kind of work from producing candidates, and it runs at a different speed. It runs at the speed of noticing: of sitting with a vague dissatisfaction until it resolves into a sentence you can act on.

The model never made that faster. It couldn't. It could hand me options at any rate I cared to ask for them, but choosing among options was still mine, and it happened at exactly the speed it has always happened — my speed. Judgment turned out to be the bottleneck the whole time; the flood of candidates just made it visible.

That's the first thing in the long tail that refuses to compress. Generation scaled; discernment didn't.

The Speed of People

The second thing is simpler, and it was never mine to speed up: refinement needed users. Not my best guess at what a teammate would want — the teammate. Real people opening the tool on a real Tuesday, trying to do their actual jobs with it, and reacting. Every improvement worth making in the long tail started that way: someone used the thing, and something happened that I hadn't predicted.

Loops like that run on the calendar. Shipping was never the slow part — I could turn a request into a deployed change in an hour, sometimes less. Then the waiting started. The team had work of their own; nobody was refreshing my tool to evaluate the latest tweak. Days would pass. Then a comment in a standup, half a sentence — "oh, it still does that thing with the search." Or a chat thread would drift my way, three messages deep in someone else's conversation, describing a friction I would never have found alone. I'd fix it that afternoon. Then the loop reset, and the waiting began again.

That was the actual tempo of the long tail. I could generate at any rate I liked, but whether the last change had actually landed, whether it helped or just moved the annoyance somewhere new, only surfaced when a person had time to bump into the software and mention it. The machine's cycle time is seconds; the loop's cycle time is people.

I never found a way to compress this. You cannot prompt your way past a week of waiting to watch someone use the thing. The model made every lap around the loop cheap — but a loop's speed isn't set by its fastest leg. It's set by its slowest. And the slowest leg was always someone else's Tuesday.

The Moving Target

There's a third thing, and it took me the longest to see because it hides inside the second. The slowness of the loop is one problem: each lap takes a week. This is a different one: each lap moves the finish line. "Done" wasn't a fixed point I was approaching. It moved every time someone used the tool.

The clearest case was a feature the team asked for by name. They described it precisely, I built exactly what they described, and the model made the building almost embarrassingly quick. Within a week of it existing, it was obvious to everyone — including the people who had asked for it — that it wasn't what they needed. What they needed was adjacent, visible only from where the new feature let them stand. The request hadn't been wrong. It had been the best guess available before there was anything real to react to. Using the tool changed what the tool needed to be.

The reflex here is to call this scope creep and reach for discipline: tighter requirements, better specs, firmer nos. I don't think that's what happened. Nobody was careless. This is just how anyone finds out what they actually want — not by imagining it in a document, but by reacting to something real. You can't front-load that discovery, because the thing being discovered doesn't exist yet. Each version of the tool taught us what the next version was for.

Part of the last 10 percent, then, isn't refinement at all. It's finding out what the product is. The model collapsed the distance to something real — a fortnight instead of a quarter. What it couldn't collapse is what happens on the far side of real: people living with the thing, reacting, revising what they meant. AI compresses the time to something real. It doesn't compress the reacting.

The Job Now

Line the three up and they rhyme. Judgment moves at the pace of a person sitting with what feels wrong until they can say why. Feedback moves at the pace of other people's calendars. And done moves, full stop — it relocates every time somebody real touches the thing. Three different walls, and not one of them is a generation problem. You can't get past any of them by producing more, or faster, or better on the first pass. They were never about producing at all.

Which is why I've stopped saying my job shrank. Wrong verb. What happened is closer to distillation. The parts the model absorbed — the boilerplate, the plumbing, the wiring of pipelines, the first draft of everything — turn out to have been, in retrospect, scaffolding. Necessary. Time-consuming. Never the point. Boil all of that off and what remains is concentrated: the part of engineering that was always hardest, no longer diluted by the parts that were merely long.

What remains is the extra mile: taste, judgment, sitting with users long enough to hear the friction nobody would bother filing a ticket about. Knowing — not hoping, knowing — when a thing is actually done. None of that is new. Those were always the skills that separated shipping from demoing; anyone who has ever inherited a "basically finished" project knows the width of the gap between the two. What's new is the proportion. These skills used to be the last chapter of the job. Now they're most of the book.

The ninety-ninety rule used to be a complaint about our estimates; now it reads like a job description.

The service is still open in a tab on my machine. I shipped a change this morning, and I'm waiting to hear whether it landed, and when I hear, there will be something waiting behind it. Months in, this has stopped feeling like a project that won't end. It feels like the work, seen clearly for the first time. The last 10% was always the job. We were just too busy with the other 90% to notice.