The market's attention has been captured by coding agents — tools that sit in your editor or terminal and help you write code faster. That is a real and valuable category, and it is also the most crowded one. Every lab is shipping coding tools. Every startup with a seed round is building an editor plugin.
Background agents are different. They run autonomously, in the cloud, on a schedule or in response to events. They produce reviewable output. A human approves before anything executes. This is the pattern enterprises actually need — not a copilot sitting next to a developer, but an autonomous worker handling knowledge work across the organization.
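To make the pattern concrete, here is a minimal sketch of the approval gate: a triggered run produces a proposal, and the side effect stays deferred until a human signs off. Every name in it (`Proposal`, `ReviewQueue`, the trigger strings) is hypothetical and chosen for illustration; it does not describe any particular product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXECUTED = "executed"


@dataclass
class Proposal:
    """Reviewable output of one agent run. Creating it has no side effects."""
    agent_id: str
    trigger: str                 # hypothetical labels, e.g. "schedule:daily"
    summary: str                 # what the human reviewer reads
    action: Callable[[], str]    # the deferred side effect
    status: Status = Status.PENDING
    reviewer: Optional[str] = None
    result: Optional[str] = None


class ReviewQueue:
    """Holds proposals until a named human approves or rejects them."""

    def __init__(self) -> None:
        self.proposals = []

    def submit(self, proposal: Proposal) -> Proposal:
        self.proposals.append(proposal)
        return proposal

    def review(self, proposal: Proposal, reviewer: str, approve: bool) -> None:
        if proposal.status is not Status.PENDING:
            raise ValueError("proposal already reviewed")
        proposal.reviewer = reviewer
        proposal.status = Status.APPROVED if approve else Status.REJECTED

    def execute(self, proposal: Proposal) -> str:
        # The gate: nothing runs unless a human approved it first.
        if proposal.status is not Status.APPROVED:
            raise PermissionError("human approval required before execution")
        proposal.result = proposal.action()
        proposal.status = Status.EXECUTED
        return proposal.result
```

The design choice worth noting is that the agent never calls the side effect itself. It only hands back a callable, so the approval step cannot be skipped by accident.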
Almost every agent product on the market today is a toy by enterprise standards. There is no hierarchy for organizing agents at scale. No versioning. No audit trail showing who changed an agent, when, and why. No observability for tracking reliability over time. When you have a hundred employees, each with three agents — some public, some private, some critical to operations — the current generation of tools falls apart.
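As a rough sketch of what the missing pieces could look like, the record below gives each agent a place in a hierarchy, a visibility flag, and an append-only changelog that captures who changed it, when, and why. The field names and the `AgentRecord` class are assumptions made for this example, not an existing schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Change:
    """One immutable changelog entry: who changed the agent, when, and why."""
    version: int
    author: str
    reason: str
    prompt: str
    changed_at: datetime


@dataclass
class AgentRecord:
    agent_id: str
    owner: str
    team: str          # hypothetical hierarchy: organization -> team -> agent
    visibility: str    # "public" or "private"
    critical: bool
    changelog: list = field(default_factory=list)

    def update(self, author: str, reason: str, prompt: str) -> Change:
        if not reason.strip():
            raise ValueError("a change needs a stated reason")
        change = Change(
            version=len(self.changelog) + 1,
            author=author,
            reason=reason,
            prompt=prompt,
            changed_at=datetime.now(timezone.utc),
        )
        self.changelog.append(change)  # append-only: history is never rewritten
        return change

    @property
    def current(self) -> Change:
        return self.changelog[-1]

    def rollback(self, version: int, author: str, reason: str) -> Change:
        if not 1 <= version <= len(self.changelog):
            raise ValueError(f"unknown version {version}")
        old = self.changelog[version - 1]
        # A rollback is itself a new, attributed version.
        return self.update(author, f"rollback to v{version}: {reason}", old.prompt)
```

Recording a rollback as a new version, rather than deleting entries, keeps the chain of accountability intact even when a change is undone.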
I heard a story recently about a company that spent three days trying to figure out which of its agents was telling prospects that the company was hosting its next event in London. It was not. There was no way to trace the claim, no way to query across agent activity, and no way to find the source of the hallucination. This is insane, and it is the norm.
Every enterprise deploying agents needs a company-wide, live log of activity. Token counts belong in it, but they are not enough: you need real observability into what decisions agents are making and why. Every agent needs a changelog and a chain of accountability. This is software engineering, not prompt engineering. I've argued elsewhere that the harness layer has no moat. The next layer up, the one that makes background agents actually safe to run in production, is where the durable products live.
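A company-wide log only helps if you can query it across agents. The sketch below uses Python's built-in `sqlite3` module with a hypothetical `activity` table; the column names, `record`, and `find_outputs` are illustrative assumptions. With something like this in place, the London hunt becomes one query instead of three days.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a shared activity log. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS activity (
               id INTEGER PRIMARY KEY,
               at TEXT NOT NULL,
               agent_id TEXT NOT NULL,
               agent_version INTEGER NOT NULL,
               recipient TEXT NOT NULL,
               output TEXT NOT NULL,
               rationale TEXT NOT NULL,
               tokens INTEGER NOT NULL
           )"""
    )
    return conn


def record(conn, agent_id, agent_version, recipient, output, rationale, tokens):
    """Append one agent action: what was said, to whom, and why."""
    conn.execute(
        "INSERT INTO activity "
        "(at, agent_id, agent_version, recipient, output, rationale, tokens) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), agent_id, agent_version,
         recipient, output, rationale, tokens),
    )
    conn.commit()


def find_outputs(conn, phrase):
    """Search every agent's output for a phrase (ASCII case-insensitive)."""
    rows = conn.execute(
        "SELECT agent_id, agent_version, recipient, output FROM activity "
        "WHERE instr(lower(output), lower(?)) > 0 ORDER BY id",
        (phrase,),
    )
    return rows.fetchall()
```

Storing the agent version next to each action is what ties the activity log back to the changelog: once you know which agent said it, you can see which change introduced the behavior and who made it.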
Key takeaways
- Background agents — autonomous, scheduled, reviewable — are the pattern enterprises need, not editor copilots.
- Almost every agent product on the market today is a toy by enterprise standards. No hierarchy. No versioning. No audit trail. No observability.
- One company spent three days hunting which of their agents was lying to prospects. There was no way to trace it. This is the norm.
- Every enterprise deploying agents needs a company-wide live log of activity, a changelog per agent, and a chain of accountability. This is software engineering, not prompt engineering.
FAQ
How are background agents different from coding copilots?
A copilot sits next to a developer and helps them type faster. A background agent runs autonomously, on a schedule or in response to events, produces reviewable output, and waits for a human to approve before anything executes. Enterprises need the second pattern across the whole organization, not the first one in the IDE.
What does enterprise-grade observability look like for agents?
It is more than token counts. You need a company-wide, live log of agent activity, the ability to query across agent decisions, a changelog per agent, and a chain of accountability for who changed what and when. Without that, debugging a misbehaving agent is impossible at scale.