There is a temptation to see human review as a temporary constraint — something we will automate away once the models get good enough. This is wrong, and it is dangerous.
Human review is not the bottleneck to be eliminated. It is the quality gate that prevents AI-generated slop from compounding into technical debt that takes years to unwind. The organizations currently shipping AI-built products without engineering review are building on sand. They do not know it yet because the failures have not cascaded.
The pattern that works is simple: context in, background execution, reviewable output, human approval. The agent does the work. The human exercises judgment. Code does not merge without a human saying yes.
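That pattern can be sketched as a single gate function. This is a minimal illustration, not any particular product's API: `agent`, `approve`, and `merge` are hypothetical callables standing in for the background agent, the human reviewer, and the deploy step.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ReviewableOutput:
    """Artifact a background agent hands back for human review."""
    diff: str
    summary: str

def gate(context: str,
         agent: Callable[[str], ReviewableOutput],
         approve: Callable[[ReviewableOutput], bool],
         merge: Callable[[str], None]) -> bool:
    """Context in, background execution, reviewable output, human approval."""
    output = agent(context)    # background execution: the agent does the work
    if not approve(output):    # human approval: the human exercises judgment
        return False           # no human yes, no merge
    merge(output.diff)         # the only path into production runs through approval
    return True
```

The point of the sketch is structural: `merge` is unreachable except through `approve`. There is no flag, mode, or confidence threshold that lets machine output skip the human.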
This is not a conservative position. It is the only position that scales. The alternative — letting AI-generated code flow into production without review — is how you end up with 18,000 water bottles ordered at a Taco Bell. The error compounds silently because nobody is watching the seam where machine output meets the real world.
I've argued that the bottleneck has moved from generation to review. That shift is not a problem to solve. It is the design constraint that makes the rest of the system work. Invest in review tooling, in review skills, in review at scale. The teams that treat human judgment as the most valuable input — not the most expensive overhead — are the ones whose AI deployments will still be working in three years.
Related Essays
Review Is Not a Screen. It Is a Primitive
Build review as a UI screen and you have a feature. Build it as a primitive that takes an artifact type and returns a verification surface and you have leverage.
The Three Phases of AI-Assisted Engineering
Autocomplete, interactive agents, background agents. The bottleneck has moved from generating code to reviewing it, and almost no one is building for the new constraint.
Taste Does Not Scale With Token Throughput
Code production is no longer the constraint. Deploy pipelines, feature flags, and code review are. The new bottleneck is taste, and taste does not scale.
Key takeaways
- Treating human review as a temporary constraint to be automated away is wrong and dangerous.
- The pattern that scales is context in, background execution, reviewable output, human approval — code does not merge without a human saying yes.
- Organizations shipping AI-built products without engineering review are building on sand and do not know it yet.
FAQ
Should we try to automate human review out of the loop?
No. Review is the quality gate that prevents AI-generated slop from compounding into technical debt that takes years to unwind. Removing it is how you end up with 18,000 water bottles ordered at a Taco Bell — a system that looked fine until it cascaded.
What is the right operating pattern for background agents?
Context in, background execution, reviewable output, human approval. The agent does the work. The human exercises judgment. Code does not merge without a human saying yes. This is not a conservative position — it is the only position that scales without compounding errors.