The most consistent observation from engineers at large companies is that code production is no longer the constraint. AI has made writing code dramatically faster. Everything surrounding the code has not.
At Shopify, deploying the main monolith can take hours. A PR merged in the morning might not reach production until the next day's deploy. Feature flags enforce pacing — five hours minimum to get a flag from 0% to 100%. A trivial change can take days. A meaningful change with a couple of fix loops can take a week.
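The pacing mechanics can be sketched as a staged rollout schedule. The stage percentages and dwell times below are hypothetical, chosen only so the minimum wall-clock time sums to five hours; real platforms tune both per flag:

```python
from dataclasses import dataclass

@dataclass
class Stage:
    percent: int       # share of traffic the flag serves
    min_minutes: int   # minimum dwell time before the next bump

# Hypothetical schedule — actual stage sizes and dwell times vary by platform.
SCHEDULE = [
    Stage(1, 30),
    Stage(5, 60),
    Stage(25, 60),
    Stage(50, 90),
    Stage(100, 60),
]

def minimum_rollout_minutes(schedule: list[Stage]) -> int:
    """Lower bound on wall-clock time to take a flag from 0% to 100%."""
    return sum(stage.min_minutes for stage in schedule)

print(minimum_rollout_minutes(SCHEDULE) / 60)  # 5.0 hours
```

The point of the dwell times is observation: each stage has to bake long enough for metrics and error rates to surface before the next bump, which is why the floor is measured in hours regardless of how fast the code was written.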
At Yum, the bottleneck has shifted further upstream. Their multi-agent systems — coordinator agents, investigator agents, action agents — handle thousands of incidents per hour. The engineering challenge is not generating code or even orchestrating agents. It is managing the explosion of output. When AI can produce 45 PRs in a single repo in a day, the constraint becomes code review and human verification.
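One way to picture the coordinator/investigator/action split is a minimal dispatch pipeline. Every name and interface here is hypothetical — a sketch of the shape, not Yum's actual system — but it makes the asymmetry visible: the fan-out is automated end to end, while every artifact it produces still lands in a human review queue:

```python
# Hypothetical sketch of a coordinator -> investigator -> action pipeline.

def investigate(incident: str) -> dict:
    """Investigator agent: gather context and propose a root cause."""
    return {"incident": incident, "root_cause": "suspected regression"}

def act(finding: dict) -> dict:
    """Action agent: turn a finding into a reviewable artifact (e.g. a PR)."""
    return {"pr": f"fix for {finding['incident']}", "needs_human_review": True}

def coordinate(incidents: list[str]) -> list[dict]:
    """Coordinator agent: fan incidents out, collect reviewable output."""
    return [act(investigate(i)) for i in incidents]

prs = coordinate([f"incident-{n}" for n in range(45)])
# 45 PRs generated in one pass; none of them reviewed themselves.
assert all(pr["needs_human_review"] for pr in prs)
```

Generation scales linearly with compute; the review queue at the end does not.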
The new bottleneck is taste. Taste does not scale with token throughput.
It is not enough to generate code faster. The platform has to understand the deployment pipeline, the flag rollout, the verification step, the review velocity. Internal teams have a massive advantage here because they understand their own deployment reality intimately. That advantage is also their weakness — they are building for one deployment reality, on top of an already overburdened engineering organization. I've argued elsewhere that the iteration cycle belongs to users, not the developers who built the agent. Taste at scale needs to live closer to the work, not further from it.
Related Essays
Code Review Becomes the Bottleneck
When an agent ships a working PR every six minutes, you accumulate reviewable code faster than humans can process it. The next wall is review, not generation.
Review Is Not a Screen. It Is a Primitive
Build review as a UI screen and you have a feature. Build it as a primitive that takes an artifact type and returns a verification surface and you have leverage.
Human Review Is Not a Limitation
Human review is not the bottleneck to be eliminated. It is the quality gate that keeps AI-generated slop from compounding into technical debt that takes years to unwind.
Key takeaways
- AI made writing code dramatically faster. Everything surrounding the code — deploy pipelines, feature flags, review — did not move at the same pace.
- At Shopify, monolith deploys can take hours and a flag rollout has a five-hour minimum. A trivial change can take days. A meaningful change with a fix loop can take a week.
- When AI ships 45 PRs in a single repo in a day, the constraint becomes review and human verification. The new bottleneck is taste, and taste does not scale with token throughput.
FAQ
Where has the engineering bottleneck moved?
Away from code generation and toward everything that surrounds the code — deployment pipelines, feature flag pacing, review velocity, and human verification of agent output. The new constraint is the human judgment layer, not the keyboard.
What does taste mean in this context?
Taste is the human judgment about whether a change is right: does it fit the architecture, match the product intent, handle the edge cases? Token throughput is cheap. Taste is bottlenecked by the number of experienced humans who can apply it.