Coinbase Forge

Coinbase's internal coding agent Forge (originally Claudebot, then Cloudbot) produces 5% of all merged PRs, has cut PR cycle time from ~150 hours to ~15 hours, and serves 1,000+ engineers via Slack, Linear, and MCP servers. It is one of the three systems whose shared architecture LangChain captured in its Open SWE framework.

Key takeaways

  • 5% of all merged pull requests come from Forge background agents (as of February 2026), with PR cycle time reduced from ~150 hours to ~15 hours (10x improvement)
  • May 2026: Coinbase disclosed Mux, an internal multi-agent orchestration layer with 600+ users that powered 5,068 merged PRs across 461 repos — Mux users merge 3.5x more PRs per engineer (39.6 vs 11.4)
  • Name evolved: Claudebot (original) to Cloudbot (multi-model rebrand) to Forge (current). Multi-model, explicitly not Claude-only.
  • LangChain's Open SWE framework was built to capture the architectural pattern that Stripe (Minions), Coinbase (Cloudbot), and Ramp (Inspect) independently converged on
  • Linear is the structured source of truth — Forge reads from Linear for context, auto-creates issues from Slack conversations, and assigns bugs to itself
  • 500+ PRs in 15 minutes during company-wide speedrun sessions. Crashed GitHub four or five times.

FAQ

What is Coinbase's internal coding agent called?

Forge. Originally called Claudebot (built on Claude), then Cloudbot (after going multi-model). The Linear case study and recent sources use the name Forge.

How does Forge work at Coinbase?

Engineers trigger Forge from Slack, GitHub, or Linear. Automated workflows pick up Slack conversations and create Linear issues. Bugs get auto-assigned to Forge, which accesses the codebase and Linear context, writes a fix, drafts a PR, creates a mobile build, and pushes it back to Slack for review.

What percentage of Coinbase PRs come from AI agents?

5% of all merged PRs as of February 2026, with PR cycle time reduced 10x from ~150 hours to ~15 hours.

Is Coinbase's coding agent built on Claude?

Not exclusively. The original version was Claude-based (hence Claudebot), but it evolved to be multi-model. Chintan Turakhia said it uses all sorts of underlying models and is not specific to Claude.

What is the Open SWE connection?

LangChain built Open SWE (March 2026) to capture the architecture that Stripe, Coinbase, and Ramp independently converged on for internal coding agents. Forge/Cloudbot is one of the three reference implementations.

What is Mux at Coinbase?

Mux is Coinbase's internal multi-agent orchestration tool, disclosed in a May 2026 engineering blog post. It started as one engineer's side project and grew to 600+ users, powering 5,068 merged PRs across 461 repositories. Mux users merge 3.5x more PRs per engineer than baseline (39.6 vs 11.4). It sits alongside Forge in Coinbase's internal agent stack, solving the concurrency problem of running multiple agents in parallel.

Executive Summary

Coinbase's internal coding agent — now called Forge, previously Cloudbot and originally Claudebot — produces 5% of all merged pull requests (as of February 2026) and reduced PR cycle time from ~150 hours to ~15 hours.[1] Built by just two engineers initially, it has evolved into a Slack-native, Linear-integrated system serving 1,000+ engineers, with MCP servers providing tool access. LangChain identified Forge (alongside Stripe's Minions and Ramp's Inspect) as one of three independent implementations of the same architectural pattern, which they open-sourced as Open SWE.[2] In May 2026 Coinbase disclosed a second layer of the stack: Mux, a multi-agent orchestration tool with 600+ internal users.[3]

| Attribute | Value |
| --- | --- |
| Company | Coinbase |
| Current name | Forge |
| Previous names | Cloudbot, Claudebot |
| Type | Internal tool |
| Adoption | 5% of merged PRs (as of Feb 2026) |
| PR cycle time | 150h → 15h (10x) |
| Initial team | 2 engineers |
| Interface | Slack, Linear, GitHub |

Name Evolution

The naming confusion is worth clarifying:

  • Claudebot — the original name, reflecting its Claude-based origins. Used in early internal references and Chintan's February 2026 tweet.[1]
  • Cloudbot — the name used by LangChain, DevOps.com, and most third-party coverage after the system went multi-model. Chintan clarified: "it's actually using all sorts of underlying models. It's not something that is specific to Claude."[4][2]
  • Forge — the current name, used in the Linear case study (the most detailed public source). Described as "a custom agentic harness his team built that can be executed from Slack, Github, and Linear."[5]

How It Works

The Linear case study provides the clearest picture of Forge's workflow:[5]

Bug-to-Fix Pipeline

Slack message flags a bug
    ↓
Automated workflow creates Linear issue
    ↓
Issue auto-labelled, sized, and assigned
    ↓
Bug assigned to Forge (automatic)
    ↓
Forge accesses codebase + Linear context
    ↓
Writes fix, drafts PR, creates mobile build
    ↓
Pushes back to Slack for human review
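
The pipeline above can be sketched as a chain of small functions. This is a minimal illustration under stated assumptions, not Coinbase's implementation: the `Issue` class, the function names, and the keyword and length heuristics are all hypothetical, and the Slack, Linear, and GitHub integrations are replaced by plain Python values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Issue:
    """Hypothetical stand-in for a Linear issue (not the Linear API schema)."""
    title: str
    labels: List[str] = field(default_factory=list)
    size: str = "unsized"
    assignee: Optional[str] = None
    history: List[str] = field(default_factory=list)


def create_issue(slack_message: str) -> Issue:
    """An automated workflow turns the Slack message into an issue."""
    issue = Issue(title=slack_message.strip())
    issue.history.append("issue created")
    return issue


def triage(issue: Issue) -> Issue:
    """Label, size, and assign. The keyword and length rules are assumptions."""
    if any(word in issue.title.lower() for word in ("bug", "crash", "error")):
        issue.labels.append("bug")
        issue.assignee = "forge"  # bugs go to the agent automatically
    issue.size = "small" if len(issue.title) < 80 else "medium"
    issue.history.append("labelled, sized, assigned")
    return issue


def run_agent(issue: Issue) -> Issue:
    """Placeholders for the agent's work; a real agent would call tools here."""
    for step in ("read codebase and issue context", "wrote fix",
                 "drafted PR", "created mobile build"):
        issue.history.append(step)
    return issue


def run_pipeline(slack_message: str) -> Issue:
    issue = triage(create_issue(slack_message))
    if issue.assignee == "forge":
        issue = run_agent(issue)
        issue.history.append("posted to Slack for human review")
    return issue


result = run_pipeline("Bug: checkout screen crashes on submit")
print(result.assignee, "|", result.history[-1])
```

In this sketch, anything that is not recognized as a bug stays unassigned, so a human still picks it up. The final step is always a hand-off for review rather than a merge.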

Question Handling

Forge also handles engineering questions. Rather than pulling a teammate out of deep work, engineers ask in Slack and the question is routed to an agent: Linear handles project status and timelines; Forge handles code-specific queries such as debugging services, analyzing failing endpoints, or investigating funnel dropoffs.[5]
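
The routing split described above can be illustrated with a simple keyword classifier. The source does not say how Coinbase routes questions, so the keyword lists, the `route_question` function, and the fallback to a human are all assumptions made for illustration.

```python
# Hypothetical keyword lists; a production router would more likely use a model.
LINEAR_TOPICS = ("status", "timeline", "deadline", "roadmap", "shipping")
FORGE_TOPICS = ("debug", "endpoint", "failing", "funnel", "stack trace", "service")


def route_question(text: str) -> str:
    """Return "forge", "linear", or "human" for a Slack question."""
    lowered = text.lower()
    forge_hits = sum(keyword in lowered for keyword in FORGE_TOPICS)
    linear_hits = sum(keyword in lowered for keyword in LINEAR_TOPICS)
    if forge_hits == 0 and linear_hits == 0:
        return "human"  # nothing matched, so leave it for a person
    return "forge" if forge_hits >= linear_hits else "linear"


print(route_question("Why is the /quotes endpoint failing?"))
print(route_question("What's the timeline for the wallet redesign?"))
```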

Linear as Source of Truth

This is a key architectural decision. Chintan told his team to treat Linear as the single source of truth for everything — product requirements, designs, bug reports, shipping status, team structure. The reason: agents need structured, accessible context to work autonomously. Scattered knowledge across Slack threads and people's heads is navigable by humans but a wall for agents.[5]

"I'm not designing things for humans anymore," Chintan says. "I'm designing things for agents. And they need different things."[5]


The "Delete Your IDE" Experiment

In January 2026, Chintan asked his entire engineering organization to delete their IDEs and write zero lines of code for two weeks. Every engineer at Base had to do their job without touching a code editor. The goal: figure out how work actually gets done when agents handle most of the coding.[5]

This forced the team to identify which workflows depended on human coding (and shouldn't) versus which genuinely required human judgment.


Speedrun Culture

The adoption mechanism is worth studying:[5][6]

  • Weekly "Speedruns": power users demo new agent workflows to the entire org, then everyone tries them live
  • First speedrun (early 2025): 120 engineers pushed 80 PRs in one sitting and crashed GitHub
  • By mid-year: CEO Brian Armstrong was attending, and sessions reached 500+ PRs in 15 minutes
  • The sessions crashed GitHub four or five times (which Chintan admits with pride)
  • Applied AI Award presented by CEO at quarterly town halls

The adoption wasn't driven by training programs or change management. Engineers didn't learn a new behavior — they asked questions in Slack like they always did. The only difference was who answered.[5]


Continuous Development Pattern

Engineers now operate in a cycle:[5]

  1. Wake up, review PRs agents generated overnight
  2. Spin off 10-15 new agents to work throughout the day
  3. Go heads-down on complex work themselves
  4. End of day: review, launch another batch overnight
  5. Repeat

Chintan tracks "autonomous operation time" as a key metric — how many minutes an agent can run without human intervention. That number keeps climbing.[5]
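
One way to compute a metric like this is to take the longest stretch of an agent session with no human intervention. The source does not give Coinbase's exact definition, so the function below, its name, and the "longest gap" interpretation are assumptions.

```python
from datetime import datetime
from typing import List


def longest_autonomous_minutes(start: datetime, end: datetime,
                               interventions: List[datetime]) -> float:
    """Longest gap, in minutes, between consecutive human interventions.

    The session start and end count as boundaries. Interventions outside
    the session window are ignored.
    """
    inside = sorted(t for t in interventions if start <= t <= end)
    checkpoints = [start, *inside, end]
    gaps = [later - earlier for earlier, later in zip(checkpoints, checkpoints[1:])]
    return max(gaps).total_seconds() / 60


session_start = datetime(2026, 1, 12, 9, 0)
session_end = datetime(2026, 1, 12, 11, 0)
human_touches = [datetime(2026, 1, 12, 9, 20), datetime(2026, 1, 12, 10, 30)]

longest = longest_autonomous_minutes(session_start, session_end, human_touches)
print(longest)  # gaps are 20, 70, and 30 minutes, so this prints 70.0
```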


Open SWE Connection

In March 2026, LangChain released Open SWE — an open-source framework capturing the architectural pattern that three companies built independently:[4][2]

  • Stripe — Minions
  • Coinbase — Cloudbot/Forge
  • Ramp — Inspect

The convergence: all three arrived at cloud sandboxes, Slack/Linear invocation, subagent orchestration, and automatic PR creation. Open SWE provides these components as a customizable framework built on LangGraph and Deep Agents.[7]

One additional detail from third-party analysis: Forge auto-merges its PRs when tests pass and automated review is positive.[4]


Mux: The Multi-Agent Orchestration Layer (May 2026)

In May 2026, Coinbase's engineering blog disclosed a second piece of the internal agent stack: Mux, a multi-agent orchestration tool that started as one engineer's side project and grew organically to 600+ users across every org.[3]

The framing: AI coding agents made individual engineers faster, but the workflow around them stayed sequential — coding had a concurrency problem. With Mux, one engineer runs three or four agents in parallel (one implementing an API, one writing integration tests, one fixing a bug, one refactoring a legacy module), reviewing each as it finishes.[3]
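
The concurrency idea can be sketched with Python's `asyncio`: start several agent runs at once and handle each result as soon as it is ready, instead of waiting for them one after another. This is an illustration of the pattern only, not Mux's design. `run_agent` is a hypothetical stand-in that sleeps instead of calling a model, and the durations are invented.

```python
import asyncio
from typing import Dict, List


async def run_agent(task: str, seconds: float) -> str:
    """Hypothetical agent run; the sleep stands in for real work."""
    await asyncio.sleep(seconds)
    return f"{task}: ready for review"


async def orchestrate(tasks: Dict[str, float]) -> List[str]:
    """Run all agents concurrently and collect results in completion order."""
    running = [asyncio.create_task(run_agent(name, secs))
               for name, secs in tasks.items()]
    reviewed = []
    for next_finished in asyncio.as_completed(running):
        result = await next_finished
        reviewed.append(result)  # the engineer reviews each result as it lands
    return reviewed


tasks = {
    "implement API": 0.15,
    "write integration tests": 0.05,
    "fix bug": 0.10,
    "refactor legacy module": 0.20,
}
results = asyncio.run(orchestrate(tasks))
for line in results:
    print(line)
```

Because the runs overlap, total wall-clock time is roughly that of the slowest task rather than the sum of all four.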

Disclosed metrics (as of May 2026):[3]

| Metric | Value |
| --- | --- |
| Internal users | 600+ |
| Merged PRs powered | 5,068 |
| Repositories | 461 (across 10 orgs) |
| Merged PRs per engineer | 39.6 vs 11.4 baseline (3.5x) |
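
The disclosed figures can be sanity-checked with a few lines of arithmetic. The ratio below reproduces the stated 3.5x after rounding. The PRs-per-repository average is derived here from the disclosed totals and is not a figure Coinbase published.

```python
mux_prs_per_engineer = 39.6
baseline_prs_per_engineer = 11.4
merged_prs = 5068
repositories = 461

ratio = mux_prs_per_engineer / baseline_prs_per_engineer
prs_per_repo = merged_prs / repositories  # derived average, not a disclosed metric

print(f"Mux vs baseline: {ratio:.2f}x")
print(f"Average merged PRs per repository: {prs_per_repo:.1f}")
```

The ratio works out to about 3.47, which rounds to the reported 3.5x. The average is about 11 merged PRs per repository.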

This is consistent with the trajectory the Linear case study described: Coinbase engineers are evolving from implementers into orchestrators of agent fleets, with Forge handling task-level automation and Mux handling parallelism.


What Developers Say

Practitioner discussion of Forge and Mux specifically is thin as of June 11, 2026. The Coinbase Mux engineering post was submitted to Hacker News in May 2026 but drew no comments,[8] and searches of HN and X surfaced no verbatim engineer testimonials about Forge under any of its three names — the substantive HN threads about Coinbase and AI date to the 2025 adoption-mandate controversy, before Forge was publicly named. All detailed accounts of the system come from Coinbase leadership (Chintan Turakhia) via the Linear case study and podcast appearances, not from rank-and-file engineers. The absence of independent practitioner voices is worth noting when weighing the disclosed metrics.


Strengths

  • Proven at scale — 5% of merged PRs across 1,000+ engineers (as of February 2026) is real production impact
  • Cultural adoption mastery — speedruns, CEO buy-in, public Slack channels, awards. This is a playbook for enterprise AI adoption.
  • Linear integration — structured source of truth for agents, not scattered knowledge
  • Full lifecycle — bug detection in Slack → issue creation → code fix → PR → mobile build → review, all automated
  • Continuous development — agents work overnight, humans review in the morning. Development never stops.
  • Architecture validated — three independent implementations converged on the same pattern, now open-sourced as Open SWE

Weaknesses

  • Internal only — not a product for sale, limited replicability without deep study
  • 5% is still early — Ramp claims 30%, Abnormal 13%. Room to grow.
  • Linear dependency — the "source of truth" approach requires committed buy-in to a specific tool
  • Cost not disclosed — running agents at this scale with multiple models isn't cheap
  • Name confusion persists — Claudebot/Cloudbot/Forge creates ambiguity in external references

Competitive Positioning

| Company | Agent Name | % of PRs | Key Metric |
| --- | --- | --- | --- |
| Ramp | Inspect | 30% | Highest disclosed adoption |
| Abnormal AI | Internal | 13% | Growing fast |
| Coinbase | Forge | 5% | 500+ PRs in 15 min speedruns |
| Stripe | Minions | N/A | 1,000+ PRs/week |

Relevance to Tembo

Forge validates several patterns central to Tembo's thesis:

  1. Slack-native invocation — meet developers where they are, don't make them learn a new interface
  2. Structured context — agents need a source of truth (Linear), not scattered conversations
  3. Continuous development — agents working overnight while humans sleep and review in the morning
  4. The orchestration convergence — three major companies independently built the same architecture. This is the canonical pattern.
  5. Cultural adoption matters as much as technical capability — speedruns, CEO buy-in, and public wins drove adoption more than the tooling itself

The Open SWE framework makes this architecture accessible to any team. The question for Tembo: how do you provide this as a managed service rather than asking every company to build it themselves?


Bottom Line

Forge remains the best-documented in-house coding agent case study: 5% of merged PRs (as of February 2026), a 10x PR cycle-time improvement, an adoption playbook (speedruns, CEO buy-in, Linear as source of truth), and an architecture that other teams can now adopt via Open SWE. The May 2026 Mux disclosure shows Coinbase has moved past single-agent automation to fleet orchestration: 600+ internal users running parallel agents, with Mux users merging 3.5x more PRs per engineer than the baseline. The caveat stands: every number comes from Coinbase itself, with no independent practitioner accounts yet on the public record.


Research by Ry Walker Research

Disclosure: Author is CEO of Tembo, which offers agent orchestration as an alternative to building in-house.