A developer files a ticket. The agent picks it up. The task is trivial: swap one field for another in an existing type. The agent produces a PR, and the PR is wrong in a way no human on the team would ever be.
It created a brand new type, bolted the field onto it, and completely ignored that the correct type already lived in a submodule two directories over. It did not check. It did not know to check. It had no idea the submodule was there.
This is not a model intelligence problem. This is a context problem. It is the single biggest reason enterprise agent deployments stall after the first week of excitement.
When your developers work locally, they have the full monorepo cloned, submodules included. They have a language server catching type errors as they type. They have years of institutional knowledge about where things live. When an agent picks up a ticket, it gets a repo clone in a container. Maybe the submodules came along. Maybe they did not. It has no LSP. It has no watcher. It has no idea your types live in a shared submodule six other projects depend on. So it does what any reasonable actor does with incomplete information: it guesses. Confidently, because that is what language models do.
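Whether the submodules actually came along is cheap to verify before the agent touches a ticket. Here is a minimal sketch in Python, assuming the agent harness can shell out to a standard git CLI; `git submodule status` prefixes any declared-but-uninitialized submodule with a leading `-`:

```python
import subprocess

def missing_submodules(repo_path: str) -> list[str]:
    """List submodules declared in .gitmodules but never cloned."""
    out = subprocess.run(
        ["git", "submodule", "status"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    ).stdout
    # Uninitialized submodules show up as "-<sha> <path>"; the leading
    # "-" is git's marker that the working tree for that path is empty.
    # (Paths containing spaces would need real parsing, skipped here.)
    return [
        line.split()[1]
        for line in out.splitlines()
        if line.startswith("-")
    ]
```

If this returns anything, the agent is about to reason about a codebase with holes in it.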
The result is a PR that looks plausible to someone who has never seen the codebase, and obviously wrong to anyone who has. I've argued elsewhere that context splits into three distinct layers, and the structural layer — knowing where things live — is the one that breaks first.
If your agents are producing this kind of PR, do not buy a better model. Fix the environment. Clone the submodules. Wire up the language server. Then ask the agent again.
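Concretely, that fix is two steps, sketched below under the same assumptions as above: pull in every declared submodule, then start whatever language server matches your stack. The server command is stack-specific and the LSP initialize handshake is elided; this is a sketch of the shape of the fix, not a drop-in script.

```python
import subprocess

def prepare_agent_workspace(repo_path: str) -> None:
    """Make the container checkout look like a developer's local clone."""
    # Fetch and check out every declared submodule, nested ones included.
    subprocess.run(
        ["git", "submodule", "update", "--init", "--recursive"],
        cwd=repo_path, check=True,
    )

def start_language_server(repo_path: str, cmd: list[str]) -> subprocess.Popen:
    """Run a language server over stdio so agent tooling can resolve types.

    `cmd` depends on the stack, e.g. ["typescript-language-server", "--stdio"]
    or ["gopls", "serve"]. The LSP initialize handshake still has to happen
    over these pipes before queries work; that part is omitted here.
    """
    return subprocess.Popen(
        cmd, cwd=repo_path,
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    )
```

The point is not this exact script. The point is that both steps happen before the agent sees the ticket, not after it has already guessed.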
Related Essays
Three Layers of Agent Context, and Most Agents Have Zero
The coding-agent debate is dominated by model capability. Wrong bottleneck. Context splits into structural, navigational, and operational layers — and most agents are missing all three.
Controllability Is Not Optional. Enterprise Teams Do Not Want Magic
Enterprise teams do not want magic agents. They want control over which submodules load, which tools run, and what the agent remembers — because they have been burned by black boxes before.
Non-Determinism Demands Human Correction Loops
Agents are non-deterministic by nature. The way they get smart is through human correction at scale, and the atomic mesh is what makes that tractable.
Key takeaways
- The most common coding-agent failure is not bad code. It is plausible code that ignores the existing structure of the codebase.
- When the agent does not see your submodules, it confidently invents what it cannot find.
- The gap between demo and deployment is made entirely of context, not intelligence.
FAQ
Why did the agent create a new type instead of using the existing one?
Because it never saw the existing one. The agent ran in a container that did not clone the relevant submodule, had no language server to resolve the type, and no way to know the canonical definition existed two directories over.
Is this a model intelligence problem?
No. A junior engineer with access to the full repo and a working LSP would find the existing type in seconds. The agent was missing the environment, not the IQ.