Monorepos with submodules are a specific pain point, but they illustrate a universal truth. Enterprise codebases are not simple. They have shared dependencies, cross-repo type systems, internal packages, custom build tooling, and a hundred undocumented conventions living in the team's collective memory. An agent that cannot handle submodules cannot handle enterprise software.
The fix is not to simplify your codebase for the agent. It is to build agent infrastructure that respects the actual complexity of how your team works.
Clone submodules by default. Provide LSP and language-aware tooling inside the agent's environment. Run the application in the agent's workspace so the agent can see the errors it causes. Give teams explicit, declarative control over what context the agent loads, per repo and per task type.
Not glamorous. Not a new model architecture or a clever prompting technique. The work that separates agents that ship code from agents that ship demos.
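To make "declarative control, per repo, per task type" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `ContextPolicy`, `POLICIES`, `resolve_policy`, and `clone_command` are hypothetical names, the repo names and commands are made up, and no real agent platform's schema is implied. The only real-world detail is `git clone --recurse-submodules`, which is a standard git flag.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ContextPolicy:
    """Hypothetical context settings for one (repo, task type) pair."""
    clone_submodules: bool = True            # submodules are on unless a team opts out
    language_servers: Tuple[str, ...] = ()   # LSP servers to start in the workspace
    run_command: Optional[str] = None        # how to boot the app so the agent sees errors
    include_paths: Tuple[str, ...] = ()      # extra paths to index as context


# Illustrative entries only: repo names, task types, and commands are made up.
POLICIES: Dict[Tuple[str, str], ContextPolicy] = {
    ("web-app", "default"): ContextPolicy(
        language_servers=("typescript-language-server",),
        run_command="npm run dev",
        include_paths=("libs/shared-types",),
    ),
    ("web-app", "docs"): ContextPolicy(clone_submodules=False),
}


def resolve_policy(repo: str, task_type: str) -> ContextPolicy:
    """Most specific match wins: (repo, task), then (repo, default), then global defaults."""
    for key in ((repo, task_type), (repo, "default")):
        if key in POLICIES:
            return POLICIES[key]
    return ContextPolicy()


def clone_command(url: str, policy: ContextPolicy) -> List[str]:
    """Build the git clone argv; --recurse-submodules is a standard git flag."""
    cmd = ["git", "clone"]
    if policy.clone_submodules:
        cmd.append("--recurse-submodules")
    cmd.append(url)
    return cmd
```

The design choice worth noticing is the default: a team has to opt out of submodules for a task type, not opt in. A task type with no entry of its own, such as a hypothetical `"bugfix"` on `web-app`, falls back to the repo default and still clones recursively.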
I've argued elsewhere that context splits into three layers: structural, navigational, and operational. The submodule case touches all three at once. The agent does not know the submodule exists (structural). It cannot resolve the canonical type (navigational). It never runs the dependent app to see its own breakage (operational). Fix submodules properly and you are addressing all three layers at the cheapest possible scale.
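The structural layer is the easiest of the three to check mechanically. The sketch below, in Python, reads `.gitmodules` and flags submodule paths that are declared but absent or empty in the workspace, which is what a non-recursive clone leaves behind. The function names are hypothetical, the parser only looks at `path = ...` lines and is not a full git-config parser, and the navigational and operational layers are passed in as plain booleans because detecting them depends on the platform.

```python
import re
from pathlib import Path
from typing import Dict, List

# Matches lines such as "\tpath = libs/shared-types" in a .gitmodules file.
_PATH_LINE = re.compile(r"^\s*path\s*=\s*(.+?)\s*$")


def declared_submodules(gitmodules_text: str) -> List[str]:
    """Return the submodule paths declared in .gitmodules content."""
    paths = []
    for line in gitmodules_text.splitlines():
        match = _PATH_LINE.match(line)
        if match:
            paths.append(match.group(1))
    return paths


def missing_submodules(repo_root: Path) -> List[str]:
    """List declared submodules whose directory is absent or empty."""
    gitmodules = repo_root / ".gitmodules"
    if not gitmodules.is_file():
        return []  # the repo declares no submodules
    missing = []
    for rel_path in declared_submodules(gitmodules.read_text()):
        target = repo_root / rel_path
        if not target.is_dir() or not any(target.iterdir()):
            missing.append(rel_path)
    return missing


def layer_report(repo_root: Path, has_language_server: bool, can_run_app: bool) -> Dict[str, bool]:
    """Hypothetical readiness summary for the three context layers."""
    return {
        "structural": not missing_submodules(repo_root),
        "navigational": has_language_server,  # assumed to be supplied by the platform
        "operational": can_run_app,           # assumed to be supplied by the platform
    }
```

A check like this could run before the agent starts work, so a missing submodule surfaces as a setup failure instead of as a plausible-looking PR that redefines a type that already exists.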
If you are building an agent platform and submodules feel like an edge case, you have not been inside an enterprise codebase recently. They are not the edge case. They are the median.
Related Essays
Context Engineering Is the Hard Problem
Models keep getting better, but agents without deep codebase and organizational context are just expensive autocomplete. Context engineering is the bottleneck nobody has productized.
Three Layers of Agent Context, and Most Agents Have Zero
The coding-agent debate is dominated by model capability. Wrong bottleneck. Context splits into structural, navigational, and operational layers — and most agents are missing all three.
Tech Context Is Tractable. Org Context Is Not
The hardest unsolved problem in agent infrastructure is not compute or sandboxing. It is context — and most of that context lives in people, not repos.
Key takeaways
- The fix is not to simplify the codebase for the agent. It is to build agent infrastructure that respects how teams actually work.
- Clone submodules by default. Provide LSP inside the agent's environment. Run the app. Give teams declarative control per repo and per task.
- Unsexy work. The work that separates agents that ship code from agents that ship demos.
FAQ
Why are submodules such a hard case for agents?
Because the canonical version of types and shared logic lives outside the immediate repo. An agent that fails to clone or index submodules ends up reinventing definitions that already exist, producing PRs that look plausible but fight the architecture.
Should we restructure our codebase to be agent-friendly?
No. Restructuring the entire codebase to accommodate a tool you adopted six months ago is the wrong direction. Build agent infrastructure that respects the actual complexity of how your team already works.