Why does RAG fall short for code generation?

Vector retrieval gives you a similarity approximation of context. For code, that approximation drops critical structural relationships — imports, type signatures, call graphs, file layout. A sandbox with the real repo and standard tools beats it on any non-trivial task.

Why Sandboxes Beat Vector RAG for Code Generation

One technical point matters for anyone building or buying agent infrastructure. Most enterprise AI architectures default to RAG: vectorize your codebase, retrieve relevant chunks, feed them to the model. That works for question answering. It is a poor fit for code generation, where the model needs the exact imports, type signatures, and call sites rather than passages that merely look similar.

Background coding agents work differently. They clone the actual repository into an ephemeral sandbox. The agent uses standard tools — grep, find, the file system itself — to locate relevant code. It might take sixteen attempts to find the right file. That is fine. Compute is cheap. What matters is that when it starts writing code, it has real context, not a vector similarity approximation of context.
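The search loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular agent is implemented: `grep` and `locate` are hypothetical helper names, the list of candidate patterns stands in for the guesses a model would make, and a real agent would more likely shell out to the actual `grep` or `find` binaries.

```python
import re
from pathlib import Path


def grep(root, pattern, glob="*.py"):
    """Return (relative_path, line_number, line) for every line matching pattern."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                rel = path.relative_to(root).as_posix()
                hits.append((rel, lineno, line.strip()))
    return hits


def locate(root, candidate_patterns):
    """Try each pattern in order (hypothetical agent guesses).

    Returns (attempts_used, hits) for the first pattern that matches,
    or (number_of_patterns_tried, []) if none do.
    """
    for attempt, pattern in enumerate(candidate_patterns, start=1):
        hits = grep(root, pattern)
        if hits:
            return attempt, hits
    return len(candidate_patterns), []
```

A failed guess costs one more pass over the files, nothing else. Whatever `locate` returns is an exact path and line in the real repository, not a chunk that scored well on similarity.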

The sandbox gets destroyed when the task is done. There is no persistent state to manage, no vector index to keep in sync with your codebase. Stateless, disposable, accurate. This is the architecture that actually works for enterprise code generation at scale.
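The lifecycle is simple enough to sketch. The example below is an assumption-laden illustration: `ephemeral_sandbox` is a hypothetical name, copying a local directory stands in for a real `git clone`, and a temporary directory provides no isolation, so a production setup would presumably use a container or VM. Only the create, work, destroy shape is the point.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_sandbox(repo_dir):
    """Copy repo_dir into a fresh temp directory and delete it on exit.

    Stand-in for cloning into a disposable container (assumption).
    """
    # TemporaryDirectory removes everything under tmp when the block exits,
    # including when the task inside the sandbox raises.
    with tempfile.TemporaryDirectory(prefix="agent-sandbox-") as tmp:
        workdir = Path(tmp) / "repo"
        shutil.copytree(repo_dir, workdir)
        yield workdir
```

Usage is a single `with ephemeral_sandbox(path) as workdir:` block. Edits made inside it never touch the source checkout, and nothing survives the block, so there is no index or cache to reconcile afterwards.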

Vector RAG made sense in 2023, when context windows were small and retrieval was the only way to fit a large codebase into a model's working memory. That constraint has loosened. Long contexts are cheap, repository-aware tools are good, and the operational overhead of keeping a vector index in sync with a moving codebase outweighs the latency that retrieval saves.

I've argued that agents are software, not prompts. The architecture follows from that. If you would not run your CI off a vectorized snapshot of your codebase, do not run your code-generation agents off one either. Give them the real repo, the real tools, and a clean room to work in.
