Stripe Minions | Ry Walker Research

Key takeaways

1,000+ pull requests merged per week with zero human-written code
Built on a fork of Block's open-source Goose agent
MCP server ('Toolshed') provides 400+ internal tools for context

FAQ

What is Stripe Minions?

Stripe Minions is Stripe's internal coding agent system. It writes code end-to-end and ships over 1,000 merged pull requests per week; humans review the code but write none of it.

How do Stripe Minions work?

Engineers invoke agents via Slack, the CLI, or a web UI. Agents run in isolated, pre-warmed 'devboxes,' access 400+ internal tools via MCP, and produce PR-ready code in at most two CI rounds.

What is Stripe Minions built on?

Minions are built on a fork of Block's open-source Goose coding agent, extended with deep internal tool integrations.

Executive Summary

Stripe Minions is the most detailed public case study of enterprise in-house coding agents. The system produces over 1,000 merged pull requests per week — with humans reviewing the code but writing none of it. Built on a fork of Block's open-source Goose agent, Minions integrate deeply with Stripe's existing developer infrastructure through a central MCP server called "Toolshed" with 400+ internal tools.

Attribute	Value
Company	Stripe
Type	Internal tool (not for sale)
Foundation	Goose fork (Block)
Public Documentation	February 2026
Headquarters	San Francisco, CA

Product Overview

Stripe Minions are fully unattended coding agents designed for "one-shot" tasks — given a ticket, bug report, or specification, they produce a complete pull request without human intervention during the coding process. The key innovation is deep integration with Stripe's existing developer tooling, allowing agents to work with the same context and tools as human engineers.

Key Capabilities

Capability	Description
End-to-end coding	Writes code from start to finish based on tickets/specs
Multi-surface invocation	Slack (primary), CLI, web UI, internal tool integrations
MCP context	400+ tools via central "Toolshed" server (docs, tickets, Sourcegraph)
Parallel execution	Multiple minions run simultaneously without git conflicts
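
To make the "central tool server" idea concrete, here is a minimal sketch of a registry that exposes many tools behind one lookup-and-call interface. It is plain Python and does not use the MCP protocol or any MCP SDK; the class names, tool names, and stub handlers are all hypothetical and are not Stripe's Toolshed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Tool:
    name: str
    description: str
    handler: Callable[[str], str]


@dataclass
class ToolRegistry:
    """Hypothetical stand-in for a central tool server (illustration only)."""

    _tools: Dict[str, Tool] = field(default_factory=dict)

    def register(self, tool: Tool) -> None:
        # Reject duplicates so every tool name resolves to exactly one handler.
        if tool.name in self._tools:
            raise ValueError(f"duplicate tool: {tool.name}")
        self._tools[tool.name] = tool

    def list_tools(self) -> List[str]:
        # Agents can discover what is available before deciding what to call.
        return sorted(self._tools)

    def call(self, name: str, argument: str) -> str:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name].handler(argument)


# Stub tools with made-up names; real handlers would query internal systems.
registry = ToolRegistry()
registry.register(Tool("docs.search", "Search internal docs", lambda q: f"docs results for {q!r}"))
registry.register(Tool("tickets.get", "Fetch a ticket", lambda t: f"ticket {t}: (stub body)"))
```

The point of the single registry is that an agent needs only one connection to discover and call every tool, rather than a separate integration per system.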

Product Surfaces

Surface	Description	Notes
Slack	Tag agent in any thread to invoke	Primary interface
CLI	Command-line for power users	Secondary
Web	Visibility and management	Administrative
Internal tools	Deep integration with ticketing, feature flags	Stripe-specific
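
One way to support several invocation surfaces is to normalize each into a single request type before the agent sees it. The sketch below assumes this design; the article does not describe how Stripe does it. The `TaskRequest` type and the event shape passed to `from_slack` are hypothetical and do not reflect Slack's real payload schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Surface(Enum):
    SLACK = "slack"
    CLI = "cli"
    WEB = "web"


@dataclass(frozen=True)
class TaskRequest:
    surface: Surface
    requester: str
    prompt: str


def from_slack(event: Dict[str, str]) -> TaskRequest:
    # Hypothetical event shape: {"user": ..., "text": ...}.
    return TaskRequest(Surface.SLACK, event["user"], event["text"].strip())


def from_cli(user: str, argv: List[str]) -> TaskRequest:
    # Join the command-line words back into a single prompt string.
    return TaskRequest(Surface.CLI, user, " ".join(argv).strip())
```

With this shape, everything downstream of invocation handles one type, and adding a new surface means writing one more adapter function.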

Technical Architecture

Minions run on pre-warmed "devboxes" — isolated development environments that spin up in 10 seconds. These environments are identical to what human engineers use, but isolated from production and the internet for security.
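
The article does not say how pre-warming is implemented. A common way to get fast start-up is a warm pool: provision environments ahead of time and hand out ready ones on request. The sketch below illustrates that general pattern under this assumption; `Devbox`, `WarmPool`, and the provisioning stub are hypothetical names.

```python
import itertools
import queue
from dataclasses import dataclass


@dataclass
class Devbox:
    box_id: int


class WarmPool:
    """Keeps a fixed number of ready environments so acquire() avoids a cold start."""

    def __init__(self, size: int) -> None:
        self._ids = itertools.count(1)
        self._ready: "queue.Queue[Devbox]" = queue.Queue()
        for _ in range(size):
            self._ready.put(self._provision())

    def _provision(self) -> Devbox:
        # Placeholder for the slow work (image pull, repo checkout, cache warm-up).
        return Devbox(next(self._ids))

    def acquire(self) -> Devbox:
        # Hand out an already-warm box; raises queue.Empty if the pool is drained.
        box = self._ready.get_nowait()
        # Refill inline for simplicity; a real pool would refill in the background.
        self._ready.put(self._provision())
        return box

    def ready_count(self) -> int:
        return self._ready.qsize()
```

The caller only pays the cost of taking a box from the queue; the expensive provisioning happens before the request arrives.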

Architecture Flow

Invocation (Slack/CLI/Web)
    ↓
Devbox (isolated sandbox, 10s spin-up)
    ↓
MCP Server (Toolshed: 400+ tools)
    ↓
Agent Loop (Goose fork)
    ↓
Local lint (under 5s) → CI (max 2 rounds) → Auto-fix
    ↓
Pull Request (human review required)

Key Technical Details

Aspect	Detail
Foundation	Fork of Block's Goose agent
Execution	Pre-warmed devboxes (10-second spin-up)
Context	MCP server with 400+ tools
CI Strategy	Local lint (under 5s), max 2 CI rounds, auto-apply fixes
Isolation	No production access, no internet
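
The CI strategy in the table amounts to a bounded retry loop: a cheap local lint first, then at most two CI rounds with automatic fixes in between. The sketch below shows that control flow only. The function names, the status strings, and the choice to escalate to a human once the budget is exhausted are assumptions; the article does not specify what happens after the second round.

```python
from typing import Callable, List

MAX_CI_ROUNDS = 2  # from the article: at most two CI rounds


def run_checks(
    lint: Callable[[], List[str]],
    ci: Callable[[], List[str]],
    apply_fixes: Callable[[List[str]], None],
    max_ci_rounds: int = MAX_CI_ROUNDS,
) -> str:
    # Cheap local lint runs first so obvious problems never reach CI.
    lint_errors = lint()
    if lint_errors:
        apply_fixes(lint_errors)
        if lint():
            return "needs_human"  # assumed escalation path

    for round_number in range(1, max_ci_rounds + 1):
        failures = ci()
        if not failures:
            return "ready_for_review"
        # Only attempt a fix if there is another CI round left to verify it.
        if round_number < max_ci_rounds:
            apply_fixes(failures)

    return "needs_human"  # assumed: budget exhausted, stop instead of looping
```

Capping the rounds bounds CI cost per task: a change that cannot be made green in two attempts stops consuming compute.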

Strengths

Proven scale — 1,000+ PRs merged weekly, validated in production at one of the world's most demanding codebases
Deep integration — MCP server provides agents the same context human engineers have (docs, tickets, build status, Sourcegraph search)
Fast iteration — 10-second devbox spin-up enables rapid feedback loops
Leverages existing infra — Uses same environments as human engineers, reducing agent-specific edge cases
Parallelization — Engineers run multiple Minions simultaneously without git worktree conflicts
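
The parallelization point comes down to giving each task its own working copy. As a rough illustration, the sketch below runs several tasks concurrently, each in a private temporary directory, so no two tasks touch the same files. It uses local directories and threads as a stand-in for isolated devboxes; the task names and the `change.txt` file are hypothetical.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, List


def run_task(task: str) -> Dict[str, str]:
    # Each task gets a private scratch directory, so no two tasks share a working tree.
    with tempfile.TemporaryDirectory(prefix="minion-") as workdir:
        output = Path(workdir) / "change.txt"
        output.write_text(f"patch for {task}\n")
        return {"task": task, "workdir": workdir, "result": output.read_text().strip()}


def run_in_parallel(tasks: List[str], workers: int = 4) -> List[Dict[str, str]]:
    # pool.map returns results in the same order as the input tasks.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_task, tasks))
```

Because nothing is shared between tasks, conflicts can only arise later, when the resulting pull requests are reviewed and merged.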

Cautions

Requires massive codebase investment — Stripe has hundreds of millions of lines of code and years of devex tooling; this isn't replicable overnight
Human review bottleneck — Agents write code but don't merge; humans must still review everything
Stack-specific patterns — Ruby/Sorbet environment; some patterns may not transfer to other stacks
Dedicated team required — "Leverage team" maintains the system full-time
Not for sale — This is internal tooling, not a product you can buy

Competitive Positioning

vs. Other In-House Agents

System	Differentiation
StrongDM Factory	Minions require human review; StrongDM eliminates it entirely
Ramp Inspect	Similar architecture; Minions have deeper MCP integration
Coinbase Claudebot	Minions built on Goose fork; Coinbase appears Anthropic-native

When to Reference Stripe Minions

Reference when: Building in-house agents at 1,000+ engineer scale
Consider alternatives when: Team under 100 engineers, no existing devex investment
Buy instead when: Need agents in weeks, not quarters (consider Tembo, Claude Code, Codex)

Ideal Customer Profile

This is internal tooling, not a product for sale. However, the architecture is worth studying if your organization is considering a similar build.

Good fit for a similar build:

Engineering organization with 1,000+ engineers
Existing devex team (3+ engineers) that can be redirected
Codebase exceeding 10M LOC with proprietary frameworks
Mature internal tooling (ticketing, CI, monitoring)

Poor fit:

Teams under 100 engineers (buy instead)
Standard tech stack without proprietary frameworks
No existing devex investment
Need results in weeks, not quarters
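
The two profiles above can be written down as a rough screening function. The thresholds come from the lists above. Treating "no existing devex investment" as zero devex engineers, and returning an "unclear" result for organizations between the two profiles, are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass
class Org:
    engineers: int
    devex_engineers: int
    lines_of_code: int
    has_mature_tooling: bool


def build_fit(org: Org) -> str:
    # Poor-fit signals: small team, or no devex investment (assumed: zero devex engineers).
    if org.engineers < 100 or org.devex_engineers == 0:
        return "poor fit: consider buying"
    # Good fit requires all four criteria at once.
    if (
        org.engineers >= 1000
        and org.devex_engineers >= 3
        and org.lines_of_code > 10_000_000
        and org.has_mature_tooling
    ):
        return "good fit for a similar build"
    # Assumed third bucket: the lists do not cover the middle ground.
    return "unclear: between the two profiles"
```

This is a screening aid rather than a decision rule: criteria such as proprietary frameworks and delivery timelines still need human judgment.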

Viability Assessment

Factor	Assessment
Documentation Quality	Excellent (detailed public blog post)
Replicability	Difficult (requires massive devex investment)
Open Source Foundation	Yes (Goose is open source)
Architecture Maturity	High (production-validated at scale)
Transferability	Medium (principles transfer; implementation is Stripe-specific)

Stripe's detailed documentation makes Minions the reference architecture for enterprise in-house coding agents. While the specific implementation requires Stripe-level investment, the patterns — MCP context, isolated sandboxes, CI integration, Slack invocation — are universally applicable.

Bottom Line

Stripe Minions represent the gold standard for in-house coding agents at elite engineering organizations. The key insight: agents need the same context and tools as human engineers, not a bolted-on integration.

Key metrics: 1,000+ PRs/week, 10-second devbox spin-up, 400+ MCP tools, max 2 CI rounds.

Architecture pattern: Slack invocation → isolated sandbox → MCP context → CI loop → human review → merge.

Recommended study for: Engineering leaders evaluating build vs. buy for coding agents, devex teams designing agent infrastructure.

Not recommended for: Small teams, organizations without existing devex investment, anyone expecting a product they can purchase.

Outlook: Stripe has defined the reference architecture. Expect commercial vendors (Tembo, Devin, etc.) to offer "Stripe Minions in a box" for organizations that can't build their own.

Research by Ry Walker Research

Disclosure: Author is CEO of Tembo, which offers agent orchestration as an alternative to building in-house.
