Enterprise AI is not converging on a single omniscient agent. It is converging on a mesh of small, atomic, declaratively-defined units, coordinated by humans who supervise hundreds at a time. The harness layer is settled. The LLM gateway is settled. The unsolved problems (context, orchestration, and review) are where every serious company is building custom solutions, and where the next layer of enterprise infrastructure will be claimed. I broke this argument into thirteen atomic posts. Read them in any order.
- The Mega-Agent Fantasy Is Already Falling Apart — The seductive pitch and the production nightmare it produces.
- Agents Are Software, Not Prompts — Same engineering principles. There is no exemption.
- The Declarative Atomic Agent — A spec, not a sprawl. Why declarative is the load-bearing property.
- The Agent Stack Build-vs-Buy Map — Seven layers. The harness is commoditized. Context is blue everywhere.
- The Agent Infra Maturity Gradient — Mature reuse, less mature buy, scrappy hand-roll. The opportunity is in the middle.
- Most APIs Are Not Ready for Agents — The component problem. Wrap each vendor in an atomic agent and stay swappable.
- Review Is Not a Screen. It Is a Primitive. — Build for artifact types or rebuild for every use case.
- Non-Determinism Demands Human Correction Loops — How agents actually get smart, and why the mesh makes correction tractable.
- Organizational Context Is the Hardest Problem — Not a search problem. A knowledge management problem.
- The Grayscale Between Engineering and Everywhere Else — One platform, many product surfaces.
- Agents Are About to Break Out of Engineering — Airflow pattern, applied to GTM, ops, finance, CS.
- One Human Will Supervise Hundreds of Agents — Why the cap moves from seven to hundreds, and what supervision becomes.
- The Gap Is Infrastructure, Not Intelligence — Where the real engineering happens.
The intelligence layer will keep getting better whether you contribute or not. The infrastructure will not. Build the mesh one atomic agent at a time, treat review as a primitive, and solve context for organizations. The companies that get those three things right will own the next layer of enterprise AI infrastructure.
— Ry
Related Essays
The Agent Infra Maturity Gradient
Mature engineering orgs reuse existing dev infra. Less mature orgs buy off the shelf. Scrappy teams hand-roll everything. The opportunity sits in the gap between them.
Most APIs Are Not Ready for Agents
Commercial software was built for humans clicking through UIs, not for agents making programmatic decisions at speed. The component problem is real and underrated.
The Convergent Agent Stack
Fifty companies building internal agent platforms have independently arrived at the same architecture. That convergence is the productization tell.
Key takeaways
- Mega-agents fail in production. The right unit is an atomic, declaratively-defined agent that can be tested, audited, and swapped without touching the rest of the mesh.
- The agent harness is fully commoditized: Claude Code and OpenCode won. There is nothing left to build at that layer.
- Context, memory, orchestration, and session state are where every serious company is rolling its own. That is where the real infrastructure gap sits.
- Review is not a UI screen — it is a primitive. The contract between an agent that produces work and a human who verifies it is the underbuilt piece of the stack.
- 2026 is the year agents break out of engineering. The platforms built for code agents will not survive contact with GTM, ops, and customer success.
- One human will supervise hundreds of agents, not seven direct reports — but only if the observability and review layers are good enough.
FAQ
What is an atomic agent mesh?
An architecture where enterprise AI is composed of many small, declaratively-defined agents instead of a single mega-agent. Each atomic unit has a defined input, output, and purpose, so it can be tested, audited, and swapped without touching the rest of the system.
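One way to picture "declaratively defined, with a defined input, output, and purpose" is a spec object that lives apart from the implementation. The sketch below is illustrative only: `AgentSpec`, `Mesh`, and every field name are hypothetical, not taken from any particular framework, and the handlers are plain functions standing in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Tuple


@dataclass(frozen=True)
class AgentSpec:
    """Declarative description of one atomic agent (hypothetical fields)."""
    name: str
    purpose: str
    input_fields: Tuple[str, ...]
    output_fields: Tuple[str, ...]


# A handler is whatever does the work; here it is just a function.
Handler = Callable[[Mapping[str, object]], Dict[str, object]]


class Mesh:
    """Registry that pairs each spec with a swappable implementation."""

    def __init__(self) -> None:
        self._agents: Dict[str, Tuple[AgentSpec, Handler]] = {}

    def register(self, spec: AgentSpec, handler: Handler) -> None:
        # Registering under an existing name swaps that one agent only.
        self._agents[spec.name] = (spec, handler)

    def run(self, name: str, payload: Mapping[str, object]) -> Dict[str, object]:
        spec, handler = self._agents[name]
        # Enforce the declared input contract before calling the agent.
        missing = [f for f in spec.input_fields if f not in payload]
        if missing:
            raise ValueError(f"{name}: missing input fields {missing}")
        result = handler(payload)
        # Enforce the declared output contract after the agent returns.
        absent = [f for f in spec.output_fields if f not in result]
        if absent:
            raise ValueError(f"{name}: missing output fields {absent}")
        return result


spec = AgentSpec(
    name="summarize_ticket",
    purpose="Produce a one-line summary of a support ticket",
    input_fields=("ticket_text",),
    output_fields=("summary",),
)
mesh = Mesh()
mesh.register(spec, lambda p: {"summary": str(p["ticket_text"])[:40]})
print(mesh.run("summarize_ticket", {"ticket_text": "Login fails after reset"}))
```

Because callers depend only on the spec, a second `register` call with the same spec and a different handler replaces that one agent without touching anything else in the mesh, and the contract checks give tests and audits something concrete to assert against.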
Which layers of the agent stack should you build vs. buy?
The harness is commoditized: almost every serious team is on Claude Code or OpenCode, and there is nothing to build there. LLM gateways and external integrations are mostly bought. Context, memory, orchestration, session state, and review are where every company is rolling its own, and that is where the real infrastructure gap sits.
Why is review treated as a primitive rather than a UI?
Review is the contract between an agent that produces work and a human who verifies it, and that contract differs by artifact type — code diffs, documents, spreadsheets, visual changes. Building review as a screen gives you a feature; building it as a primitive that accepts artifact types and returns appropriate verification surfaces gives you something composable across use cases.
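A minimal sketch of that distinction, under stated assumptions: the artifact types come from the list above, but `ReviewPrimitive`, `ReviewRequest`, and the surface identifiers are hypothetical names chosen for illustration. The point is only that the primitive dispatches on artifact type rather than hard-coding one screen.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class ArtifactType(Enum):
    CODE_DIFF = "code_diff"
    DOCUMENT = "document"
    SPREADSHEET = "spreadsheet"
    VISUAL_CHANGE = "visual_change"


@dataclass(frozen=True)
class Artifact:
    type: ArtifactType
    content: str
    produced_by: str  # name of the agent that produced the work


@dataclass(frozen=True)
class ReviewRequest:
    artifact: Artifact
    surface: str  # identifier of the verification surface to show a human


class ReviewPrimitive:
    """Maps artifact types to verification surfaces (names are hypothetical)."""

    def __init__(self) -> None:
        self._surfaces: Dict[ArtifactType, str] = {}

    def register_surface(self, artifact_type: ArtifactType, surface: str) -> None:
        # New use cases add a surface here instead of a new review screen.
        self._surfaces[artifact_type] = surface

    def request_review(self, artifact: Artifact) -> ReviewRequest:
        surface = self._surfaces.get(artifact.type)
        if surface is None:
            raise LookupError(f"no review surface for {artifact.type.value}")
        return ReviewRequest(artifact=artifact, surface=surface)


review = ReviewPrimitive()
review.register_surface(ArtifactType.CODE_DIFF, "side_by_side_diff")
review.register_surface(ArtifactType.SPREADSHEET, "cell_change_grid")

request = review.request_review(
    Artifact(ArtifactType.CODE_DIFF, "-old line\n+new line", "refactor_agent")
)
print(request.surface)
```

Any agent that emits a supported artifact type gets a review path without extra work, and an unsupported type fails loudly instead of silently skipping human verification.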