The End of Estimates

When implementation gets cheap, sizing means risk, not effort

Jun 27, 2026

An agent tells you there are two ways to build the feature.

Option A is cleaner, but it will take two weeks. Option B is rougher, but it will take one week.

Then the same agent implements both in ten minutes.

That should break something in our heads.

For decades, software organizations arranged themselves around one question: how long will this take? We built tickets, backlogs, story points, sprint planning, velocity charts, roadmap forecasts, and whole layers of management around the attempt to answer it.

The question made sense. Code took time. A week of engineering work was a real cost. A feature had to earn that week. A refactor had to justify itself against other work. When a product manager chose between two options, they needed to know which one consumed more of the scarce thing.

The scarce thing was implementation.

It is not anymore.

The old bundle

An estimate was never only about hours. It bundled effort, ambiguity, risk, coordination, and verification.

In practice, all of those were expressed as time. Complexity meant the work might take longer. Ambiguity meant we might discover more work. Risk meant we should pad the estimate. Verification meant QA, review, staging, and the slow path to confidence.

That bundle worked when implementation dominated the cost.

Now one piece of the bundle has collapsed. The first implementation is cheap. The agent can produce a plausible answer quickly. It can produce two plausible answers. It can build the clean version and the rough version before the meeting where the team would once have argued whether the ticket was a three or a five.

The rest of the bundle did not collapse.

Ambiguity still costs. Blast radius still costs. Verification still costs. Irreversible mistakes still cost.

The absurd estimate

The absurdity shows up when an agent estimates its own work.

Ask for options and it often answers like a senior engineer from 2019. Option A takes two weeks. Option B takes one week. Option C should wait until next quarter.

The answer sounds reasonable because it comes from the old world. In that world, implementation time anchored the estimate. Bigger meant longer. Cleaner meant slower. Riskier meant more padding.

Then you ask the agent to try option A. It does. You ask it to try option B. It does that too. The implementation cost that anchored the estimate disappears.

The agent was not completely wrong. Option A may still be larger. It may touch more systems, need more product judgment, expose more edge cases, or require a careful rollout.

But "two weeks" hid the real reason. It bundled risk into time because time used to be the common currency.

Now the bundle misleads.

A change can be large even if the agent writes it in ten minutes. A change can be small even if it edits fifty files. Size no longer lives in the typing. Size lives in the uncertainty around the result.

Size the risk

We should stop sizing work by implementation effort and start sizing it by risk.

Four questions matter more than the old number:

How clear is the requirement?

How wide is the blast radius?

How hard is verification?

How easy is rollback?

A task is small when the requirement is clear, the blast radius is local, verification is cheap, and rollback is trivial.

Change the copy on a settings page. The agent can implement it quickly. A human can verify it visually. If it is wrong, the rollback is easy. That is small.

Change a vesting calculation for a rare equity edge case. The agent may still implement it quickly. The diff may be tiny. The risk is not. The requirement depends on domain meaning. The blast radius includes money, contracts, reports, and trust. Verification needs fixtures, historical cases, and someone who understands the rule. Rollback may not repair bad downstream decisions if customers already saw the wrong result.

That is large.

The old estimate might have called both tasks small if the code looked simple. The new estimate cannot.
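
As a rough sketch, the four questions can be written down as a tiny sizing function. Everything here is an illustrative assumption rather than a standard: the names `Level`, `RiskProfile`, and `size` are hypothetical, the three-point scale is arbitrary, and the choice to let the worst single answer set the size is just one plausible reading of "small only when all four are low."

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical three-point scale; higher always means more risk.
    LOW = 0
    MEDIUM = 1
    HIGH = 2


@dataclass(frozen=True)
class RiskProfile:
    # One field per question from the essay.
    requirement_ambiguity: Level  # how unclear is the requirement?
    blast_radius: Level           # how far can a mistake spread?
    verification_cost: Level      # how hard is it to confirm the result?
    rollback_difficulty: Level    # how hard is it to undo?

    def size(self) -> str:
        # Assumption: the worst answer dominates. A task is "small"
        # only when every dimension is LOW.
        worst = max(
            self.requirement_ambiguity,
            self.blast_radius,
            self.verification_cost,
            self.rollback_difficulty,
        )
        return {Level.LOW: "small", Level.MEDIUM: "medium", Level.HIGH: "large"}[worst]


# The two examples above, scored by hand (the scores are judgment calls).
copy_change = RiskProfile(Level.LOW, Level.LOW, Level.LOW, Level.LOW)
vesting_fix = RiskProfile(Level.HIGH, Level.HIGH, Level.HIGH, Level.HIGH)

print(copy_change.size())
print(vesting_fix.size())
```

Note what the function never asks: how many files change, or how long the agent takes to write the diff.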

Estimate trust

The end of estimates is not the end of planning.

Planning still matters. Sequencing still matters. Coordination still matters. Some work still takes days or weeks because organizations move slowly, customers need time, compliance gates exist, data must migrate, and humans must decide what they want.

But "how long will this take?" is losing its power. The agent can produce the code before the estimate finishes circulating.

The ticket should ask different questions now. What is the acceptance contract? What paths can this break? What evidence must the agent produce? Who needs to judge the result? Can we roll it back? Does the harness already exist?
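
One way to make those questions concrete is to put them on the ticket itself. The sketch below is a hypothetical template, not an existing tracker's schema: `AgentTicket`, its field names, and `open_questions` are all invented for illustration, and it assumes an empty field means the question has not been answered yet.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentTicket:
    # Hypothetical ticket shape; each field maps to one question above.
    title: str
    acceptance_contract: str = ""
    paths_at_risk: list[str] = field(default_factory=list)
    required_evidence: list[str] = field(default_factory=list)
    judge: str = ""
    rollback_plan: str = ""
    harness_exists: Optional[bool] = None  # None means nobody has checked

    def open_questions(self) -> list[str]:
        # Pair each question with whether the ticket already answers it.
        checks = [
            (bool(self.acceptance_contract), "What is the acceptance contract?"),
            (bool(self.paths_at_risk), "What paths can this break?"),
            (bool(self.required_evidence), "What evidence must the agent produce?"),
            (bool(self.judge), "Who needs to judge the result?"),
            (bool(self.rollback_plan), "Can we roll it back?"),
            (self.harness_exists is not None, "Does the harness already exist?"),
        ]
        return [question for answered, question in checks if not answered]


# A half-filled ticket: the remaining questions are the agenda for the meeting.
ticket = AgentTicket(
    title="Fix vesting calculation for rare equity edge case",
    acceptance_contract="Matches the rule on all historical fixtures",
    judge="Someone who understands the vesting rule",
)
for question in ticket.open_questions():
    print(question)
```

There is no field for hours or story points. The unanswered questions are the estimate.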

Those questions preserve the useful part of estimation meetings. The value was rarely the number. The value was the argument before the number: one engineer noticing the migration, another remembering the weird customer, QA asking how anyone would test it, product clarifying what "done" means.

Keep that conversation. Change its target.

The scarce resource is trust.

Can we trust the requirement? Can we trust the generated change? Can we trust the tests? Can we trust the rollout? Can we trust ourselves to understand the thing we are about to ship?

That is what sizing should measure now.

Risk, evidence, and reversibility.

Estimate those.

Working Frontier
