2 min read · By Ry Walker

The Token Reckoning Is Coming

One uncomfortable truth before everyone gets too comfortable: by the end of 2026, organizations that have run a full year of P&L on their AI agent deployments will face a reckoning.

Right now, everyone is in experimentation mode. Engineers are burning ten to fifteen million tokens a day. Powerful models are being used for tasks that do not require them. Standup skills run daily and no one actually reads the output. The spend is justified because everything is new and the financial reward for being hands-on with AI is real.

This cannot be the steady state. The answer is not to cut tokens — it is to evaluate output. Did this agent run actually produce something valuable relative to what it cost? Are people using the output, or is it just noise in a Slack channel? Most organizations cannot answer either question because they never built the instrumentation. Token spend is visible. Output value is not.

The organizations that build evaluation into their agent infrastructure now will be the ones that can scale their AI deployments through the budget tightening that is coming. Everyone else hits a wall when the CFO starts asking questions and discovers there is no story to tell beyond "everyone uses it."

I've argued that the data layer underneath agents is the real defensible play, and evaluation belongs in that layer. Wire feedback loops into your agent runs from day one. Capture which outputs got used, which got ignored, which produced measurable downstream action. The teams that turn agent runs into a measurable business unit — not a line item — are the ones still operating at scale a year from now.
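
To make that concrete, here is a minimal sketch of what such a feedback loop might record, assuming each agent run can be tagged after the fact with what happened to its output. The names (`AgentRun`, `Outcome`, `summarize`), the agent labels, and every token and dollar figure are hypothetical and exist only for illustration. How you actually detect "used" versus "ignored" depends on your own tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    IGNORED = "ignored"    # nobody opened or referenced the output
    USED = "used"          # someone read or reused the output
    ACTIONED = "actioned"  # the output led to a measurable downstream action


@dataclass
class AgentRun:
    agent: str
    tokens: int
    cost_usd: float
    outcome: Outcome = Outcome.IGNORED  # assumed default until feedback arrives


def summarize(runs):
    """Group runs by agent and put spend next to how often the output was used."""
    report = {}
    for run in runs:
        row = report.setdefault(run.agent, {"runs": 0, "cost_usd": 0.0, "used": 0})
        row["runs"] += 1
        row["cost_usd"] += run.cost_usd
        if run.outcome is not Outcome.IGNORED:
            row["used"] += 1
    for row in report.values():
        row["use_rate"] = row["used"] / row["runs"]
        # Cost per used output; None when nothing was ever used.
        row["cost_per_used"] = row["cost_usd"] / row["used"] if row["used"] else None
    return report


# Made-up sample data: a daily standup summary nobody reads, and a PR reviewer people act on.
runs = [
    AgentRun("standup-summary", 40_000, 0.50),
    AgentRun("standup-summary", 42_000, 0.50),
    AgentRun("pr-review", 120_000, 1.50, Outcome.ACTIONED),
    AgentRun("pr-review", 110_000, 1.50, Outcome.USED),
]
report = summarize(runs)
```

With a table like this, the CFO conversation can move from "everyone uses it" to which agents earn their spend and which are producing noise.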

Key takeaways

  • Right now everyone is in experimentation mode, justifying spend because the financial reward for being hands-on with AI is real.
  • The answer is not to cut tokens — it is to evaluate output and ask whether the agent run actually produced value.
  • Organizations that build evaluation into their agent infrastructure now will scale. Everyone else hits a wall when the CFO starts asking questions.

FAQ

Why is a token reckoning coming?

By the end of 2026, organizations will have run a full year of P&L on their AI agent deployments. Engineers are burning ten to fifteen million tokens a day. Powerful models are being used for tasks that do not need them. Standup skills run daily and no one reads the output. That spend cannot be the steady state.

What is the right response — cut spend or measure outcomes?

Measure outcomes. The answer is not to cut tokens but to evaluate output. Did this agent run produce something valuable relative to what it cost? Are people using the output, or is it noise in a Slack channel? The teams that built evaluation in early are the ones that survive scrutiny.