The biggest lie in agent infrastructure right now is that you can describe a skill in a markdown file and call it done. A skill that says "when the user asks about pipeline metrics, query the analytics database" is not a skill. It is a wish. And wishes do not survive contact with production data.
A real skill has executable tools wired to specific endpoints, automated tests on those tools, validation that the data shape matches what the agent expects, memory of what happened the last hundred times the agent ran this workflow, and integration tailored to the actual data model — not a generic description of the data model. When an agent needs analytics data, a markdown description of the platform is not enough. The skill needs to specify exactly which tables, which event names, which API endpoints. Without that, the agent burns enormous effort on discovery, trying to figure out where data lives instead of answering the question. You watch this happen in real time and realize the model is fine. The harness is starving.
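To make "wired to specific endpoints" concrete, here is a minimal sketch in Python. The endpoint URL, table name, and expected field set are hypothetical placeholders for whatever your analytics platform actually exposes, not a real API:

```python
# Minimal sketch of a skill as software rather than prose. The endpoint,
# table name, and expected fields below are hypothetical placeholders.
import requests

ANALYTICS_URL = "https://analytics.internal/api/v2/query"  # hypothetical endpoint
PIPELINE_TABLE = "events.pipeline_runs"                    # hypothetical table
EXPECTED_FIELDS = {"run_id", "stage", "duration_ms", "status"}

def fetch_pipeline_metrics(since: str) -> list[dict]:
    """Query one known table at one known endpoint -- no discovery step."""
    resp = requests.post(
        ANALYTICS_URL,
        json={"table": PIPELINE_TABLE, "since": since},
        timeout=30,
    )
    resp.raise_for_status()
    rows = resp.json()["rows"]
    # Validate the data shape before the agent ever sees it. If the schema
    # drifts, fail loudly here instead of letting the agent improvise.
    for row in rows:
        missing = EXPECTED_FIELDS - row.keys()
        if missing:
            raise ValueError(f"schema drift: {PIPELINE_TABLE} missing {missing}")
    return rows
```

Every constant in that sketch is a decision the agent no longer has to make at runtime. That is the point.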
Agent platforms shipping "skill marketplaces" full of markdown configurations are selling something closer to a wiki than to software. Skills need to be treated like any other software artifact — versioned, tested, reviewable, with deterministic checks that run before the agent uses them and observability into what happened when it did. A CTO agent told to never create PRs and only ship to main is not a markdown note. That is a governance rule enforced in the deployment pipeline.
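What "versioned, tested, reviewable" looks like in practice is nothing exotic: ordinary tests that run in CI and gate the skill before any agent touches it. A sketch building on the hypothetical skill above; the module path, manifest file, and tool name are all assumed for illustration:

```python
# Deterministic checks as ordinary pytest tests, run in CI before any agent
# uses the skill. Module path, manifest file, and tool name are assumptions.
import json
from pathlib import Path

import pytest

import skills.pipeline_metrics as skill  # hypothetical module layout

class FakeResponse:
    def raise_for_status(self):
        pass

    def json(self):
        return {"rows": [{"run_id": "r1"}]}  # missing most expected fields

def test_rejects_schema_drift(monkeypatch):
    """The skill must fail loudly when the upstream data shape changes."""
    monkeypatch.setattr(skill.requests, "post", lambda *a, **kw: FakeResponse())
    with pytest.raises(ValueError, match="schema drift"):
        skill.fetch_pipeline_metrics(since="2024-01-01")

def test_governance_ships_to_main_only():
    """Governance as an executable gate, not a markdown note: this agent's
    manifest must never grant a PR-creation tool (hypothetical manifest)."""
    manifest = json.loads(Path("agents/cto/manifest.json").read_text())
    assert "create_pull_request" not in manifest["allowed_tools"]
```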
The corollary I keep coming back to: context engineering is the real work, not the infrastructure plumbing. Standing up an EC2 instance with bash scripts and a YAML config is something a competent engineer can do in a week. Getting the context right — the docs, the deterministic checks, the retrieval logic that puts the right information in front of the model at the right time — is the work that never ends. The infrastructure is one-and-done. The context is continuous iteration, and longer context documents degrade performance, so the loop never closes.
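A sketch of what that retrieval logic can look like: score candidate documents against the task and pack only the best under a hard token budget, precisely because longer context degrades performance. The scoring and token counting here are deliberately crude placeholders; this function is the one you never stop iterating on:

```python
# Sketch of context assembly under a hard token budget. The relevance score
# and token estimate are crude placeholders, not a recommended implementation.
def assemble_context(task: str, docs: list[str], budget_tokens: int = 4000) -> str:
    def score(doc: str) -> float:
        # Placeholder relevance: keyword overlap. Swap in embeddings or a
        # reranker -- this function is where the continuous iteration lives.
        task_words = set(task.lower().split())
        return len(task_words & set(doc.lower().split())) / (len(doc.split()) or 1)

    def tokens(text: str) -> int:
        return len(text) // 4  # rough heuristic: ~4 chars per token

    picked, used = [], 0
    for doc in sorted(docs, key=score, reverse=True):
        if used + tokens(doc) > budget_tokens:
            continue  # skip rather than truncate; partial docs confuse models
        picked.append(doc)
        used += tokens(doc)
    return "\n\n".join(picked)
```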
I've made the broader case that the harness itself is commoditized. The skill layer is where the differentiation lives. If you are not treating skills like software, you are not building skills.
Related Essays
The Harness Is Commoditized. Everything Else Is Not
The agent harness — Claude Code, OpenCode, Goose, Aider — is a commodity. Companies migrate between them freely. The defensible layers are context, orchestration, and tools.
You Need a Directory of Agents
Companies have directories of employees. They have no equivalent for the agents doing real work. Every agent should be inspectable, auditable, and correctable.
Users Should Iterate on Agents, Not Developers
Agents are always slightly wrong when first built. The people who know what is wrong are not developers — they are the users who interact with the agent every day.
Key takeaways
- Skills marketplaces full of markdown configs are selling wikis, not software. Real skills need versioning, tests, and deterministic checks.
- Without specific endpoints, table names, and schema validation, agents burn enormous effort on discovery instead of doing the work.
- Context engineering is the maintenance burden that never ends. Infrastructure plumbing is one-and-done; context is continuous iteration.
FAQ
What makes a markdown skill insufficient?
A markdown description tells the agent what exists but not how to operate on it reliably. Real skills need executable tools wired to specific endpoints, automated tests, schema validation, and memory of prior runs so the agent does not waste effort rediscovering the environment.
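One way to give a skill that memory of prior runs, sketched with an assumed journal path and record fields: append one line per invocation to a journal, and feed the most recent entries back into context at startup.

```python
# Sketch of run memory: log what each invocation saw and did, so the next
# run starts from known facts instead of rediscovering the environment.
# The journal path and record fields are illustrative, not a real format.
import json
import time
from pathlib import Path

JOURNAL = Path("memory/pipeline_metrics_runs.jsonl")  # hypothetical path

def record_run(query: str, row_count: int, ok: bool) -> None:
    JOURNAL.parent.mkdir(parents=True, exist_ok=True)
    with JOURNAL.open("a") as f:
        f.write(json.dumps({
            "ts": time.time(), "query": query,
            "row_count": row_count, "ok": ok,
        }) + "\n")

def recent_runs(n: int = 100) -> list[dict]:
    """Feed the last n runs back into the skill's context on startup."""
    if not JOURNAL.exists():
        return []
    return [json.loads(line) for line in JOURNAL.read_text().splitlines()[-n:]]
```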
Why is context engineering harder than infrastructure?
Infrastructure layers — runtimes, dev containers, CI/CD integrations — tend to be one-and-done efforts that rarely break once set up. Context layers — agents.md files, skill definitions, memory journals, deterministic check libraries — need continuous iteration. Longer context degrades performance, so the loop never closes.