Once you get past the codebase environment problem, a second problem shows up. The maintenance burden of running agents is not one thing. It splits into two categories with opposite cost curves.
Infrastructure maintenance decreases over time. Where the agent runs, how it connects to your existing surfaces, how it gets triggered — one-and-done. You wire up the execution environment, connect it to your issue tracker, and it holds.
Context maintenance never stops. Every team I have spoken with describes the same loop: review agent output, update instruction files, tune the deterministic checks, adjust what the agent fetches and when. One engineer described his entire workflow as pure context engineering: creating docs, building verification steps, and ensuring the agent pulled the right check at the right time. Another described it as a continuous feedback loop of reviewing PRs and updating configuration files. Forever.
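To make "deterministic checks" and "the right check at the right time" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Check` type, the two example rules, the glob patterns, and the `review` function are hypothetical and not taken from any team's actual setup or any agent platform's API. The idea is only that each check is a plain function, and a path pattern decides when it applies to a file the agent changed.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Callable


@dataclass(frozen=True)
class Check:
    """A hypothetical deterministic check applied to agent-changed files."""
    name: str
    pattern: str                     # glob matched against the changed path
    run: Callable[[str, str], bool]  # (path, new_content) -> True if it passes


# Illustrative rules only; a real team would encode its own conventions.
def no_print_statements(path: str, content: str) -> bool:
    return "print(" not in content


def migration_has_downgrade(path: str, content: str) -> bool:
    return "def downgrade" in content


# Note: fnmatch's "*" also matches "/", so "src/*.py" covers nested files too.
CHECKS = [
    Check("no-print", "src/*.py", no_print_statements),
    Check("migration-downgrade", "migrations/*.py", migration_has_downgrade),
]


def select_checks(path: str, checks: list[Check] = CHECKS) -> list[Check]:
    """Pick only the checks whose pattern matches this path."""
    return [c for c in checks if fnmatch(path, c.pattern)]


def review(changed: dict[str, str], checks: list[Check] = CHECKS) -> list[str]:
    """Run matching checks on each changed file; return 'check: path' failures."""
    failures = []
    for path, content in sorted(changed.items()):
        for check in select_checks(path, checks):
            if not check.run(path, content):
                failures.append(f"{check.name}: {path}")
    return failures


if __name__ == "__main__":
    # A made-up agent change set: path -> new file content.
    changed = {
        "src/app.py": "print('debug')\n",
        "migrations/001.py": "def upgrade():\n    pass\n",
        "README.md": "Usage notes\n",
    }
    for failure in review(changed):
        print(failure)
```

In a setup like this, the maintenance work is the `CHECKS` list itself. Each failure that recurs in review is a candidate for a new rule or an instruction-file update, which is why the loop never closes.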
This is why context engineering is hard to productize. Infrastructure is generic; you ship it once. Context is specific to your codebase, your team, your domain, your customers — and it changes every week. Every organization's context is unique, which means every customer needs a different version of the product. I've argued elsewhere that this is exactly why the forward-deployed model is the only one that actually delivers value for enterprise agent rollouts.
If you are evaluating agent platforms, the question is not whether the infrastructure works. It does. The question is who owns the context and how it gets better over time. Pick the platform whose answer to that question is honest.
Key takeaways
- Agent infrastructure stabilizes. You wire it up once, and it holds.
- Context maintenance is forever — instruction files, deterministic checks, retrieval tuning, a continuous review loop.
- Infrastructure is generic and ships once. Context is specific to your team and changes every week, which is why it does not productize cleanly.
FAQ
Why does context maintenance never end?
Because your codebase, team, and customers change every week. Each change rewrites what the agent should know to act correctly. Unlike infrastructure, which is a one-time integration, context is a living surface that decays the moment you stop tending it.
How should I evaluate agent platforms with this lens?
Stop asking whether the infrastructure works. It does. Ask who owns the context, how it improves over time, and what tooling exists for humans to inspect and edit it. That is where the long-term value lives.