Key takeaways
- 1,300+ pull requests merged per week with zero human-written code (up from 1,000+ at first disclosure)
- Built on a fork of Block's open-source Goose agent
- MCP server ('Toolshed') provides nearly 500 internal tools for context
- 'Blueprints' orchestration primitive mixes deterministic workflow nodes with agent loops
FAQ
What is Stripe Minions?
Stripe's internal coding agent system that writes code end-to-end and ships over 1,300 merged pull requests per week, with human review but no human-written code.
How do Stripe Minions work?
Engineers invoke agents via Slack, CLI, or web. Agents run in isolated pre-warmed 'devboxes' (standardized AWS EC2 instances), access nearly 500 internal tools via the Toolshed MCP server, and produce PR-ready code within 2 CI rounds.
What is Stripe Minions built on?
Minions are built on a fork of Block's open-source Goose coding agent, extended with deep internal tool integrations.
Executive Summary
Stripe Minions is the most detailed public case study of enterprise in-house coding agents. As of February 2026, the system produces over 1,300 merged pull requests per week (up from 1,000+ at first disclosure) — with humans reviewing the code but writing none of it. Built on a fork of Block's open-source Goose agent, Minions integrate deeply with Stripe's existing developer infrastructure through a central MCP server called "Toolshed" with nearly 500 internal tools. Per InfoQ, the code Minions touch supports more than $1 trillion in annual payment volume.
| Attribute | Value |
|---|---|
| Company | Stripe |
| Type | Internal tool (not for sale) |
| Foundation | Goose fork (Block) |
| Public Documentation | February 2026 (Part 1: Feb 9; Part 2: Feb 19) |
| Headquarters | San Francisco, CA |
Product Overview
Stripe Minions are fully unattended coding agents designed for "one-shot" tasks — given a ticket, bug report, or specification, they produce a complete pull request without human intervention during the coding process. The key innovation is deep integration with Stripe's existing developer tooling, allowing agents to work with the same context and tools as human engineers.
Key Capabilities
| Capability | Description |
|---|---|
| End-to-end coding | Writes code from start to finish based on tickets/specs |
| Multi-surface invocation | Slack (primary), CLI, web UI, internal tool integrations |
| MCP context | Nearly 500 tools via central "Toolshed" server (docs, tickets, Sourcegraph) |
| Parallel execution | Multiple minions run simultaneously without git conflicts |
| Blueprints | Orchestration primitive: deterministic workflow nodes wrapped around agent loops |
| Shared rule files | Cursor-format rules synced across Cursor, Claude Code, and Minions |
Product Surfaces
| Surface | Description | Notes |
|---|---|---|
| Slack | Tag agent in any thread to invoke | Primary interface |
| CLI | Command-line for power users | Secondary |
| Web | Visibility and management | Administrative |
| Internal tools | Deep integration with ticketing, feature flags | Stripe-specific |
Technical Architecture
Minions run on pre-warmed "devboxes" — standardized AWS EC2 instances that provision in 10 seconds. These environments are identical to what human engineers use, but isolated from production and the internet for security.
Architecture Flow
Invocation (Slack/CLI/Web)
↓
Devbox (isolated EC2 sandbox, 10s spin-up)
↓
MCP Server (Toolshed: ~500 tools)
↓
Agent Loop (Goose fork)
↓
Local lint (under 5s) → CI (max 2 rounds) → Auto-fix
↓
Pull Request (human review required)
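The lint-then-CI step in the flow above can be read as a small bounded loop. The following is a minimal sketch, not Stripe's implementation. The two-round cap comes from the article. The `CheckResult` type, the three callbacks, and the `"hand_off"` outcome after the budget is spent are hypothetical names and assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_CI_ROUNDS = 2  # cap stated in the article; everything else here is illustrative


@dataclass
class CheckResult:
    passed: bool
    failures: List[str] = field(default_factory=list)


def run_feedback_loop(
    local_lint: Callable[[], CheckResult],
    run_ci: Callable[[], CheckResult],
    auto_fix: Callable[[List[str]], None],
) -> str:
    """Return "open_pr" when CI passes, or "hand_off" once the CI budget is spent."""
    # Fast local check first, so obvious problems never cost a CI round.
    lint = local_lint()
    if not lint.passed:
        auto_fix(lint.failures)

    for round_number in range(1, MAX_CI_ROUNDS + 1):
        result = run_ci()
        if result.passed:
            return "open_pr"
        # Only attempt a fix if another CI round remains to verify it.
        if round_number < MAX_CI_ROUNDS:
            auto_fix(result.failures)

    # Assumption: when the budget runs out, the task goes back to a human.
    return "hand_off"
```

Capping the loop keeps an unattended run from burning CI capacity on a change it cannot fix, which is one plausible reason for a hard limit of this kind.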
Key Technical Details
| Aspect | Detail |
|---|---|
| Foundation | Fork of Block's Goose agent, tailored for unattended runs |
| Execution | Pre-warmed devboxes — standardized AWS EC2 (10-second spin-up) |
| Context | Toolshed MCP server with nearly 500 tools |
| Orchestration | "Blueprints" — deterministic workflow nodes combined with agent loops |
| CI Strategy | Local lint (under 5s), max 2 CI rounds, auto-apply fixes |
| Rules | Cursor rule format synced across Cursor, Claude Code, and Minions |
| Isolation | No production access, no internet |
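The "Blueprints" row describes deterministic workflow nodes combined with agent loops. One way to picture that idea is a fixed sequence of steps in which only some steps delegate to an agent. This is a hedged sketch under stated assumptions: the `Node` type, `run_blueprint`, the node functions, and the `minion/` branch naming are all hypothetical, and the agent step is a stand-in that calls no model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]


@dataclass
class Node:
    name: str
    kind: str  # "deterministic" or "agent"
    run: Callable[[State], State]


def run_blueprint(nodes: List[Node], state: State) -> Tuple[State, List[str]]:
    """Run nodes in a fixed order and record which kind of node handled each step."""
    trace: List[str] = []
    for node in nodes:
        state = node.run(dict(state))  # pass a copy so nodes do not share mutable state
        trace.append(f"{node.kind}:{node.name}")
    return state, trace


def checkout_branch(state: State) -> State:
    # Deterministic: the same ticket always yields the same branch name.
    state["branch"] = f"minion/{state['ticket']}"
    return state


def implement_change(state: State) -> State:
    # Stand-in for an agent loop; a real system would call a model here.
    state["diff"] = "<diff produced by the agent>"
    return state


def run_linter(state: State) -> State:
    # Deterministic: runs regardless of what the agent step decided.
    state["lint_ok"] = True
    return state


blueprint = [
    Node("checkout_branch", "deterministic", checkout_branch),
    Node("implement_change", "agent", implement_change),
    Node("run_linter", "deterministic", run_linter),
]
```

Calling `run_blueprint(blueprint, {"ticket": "TICKET-1"})` returns the final state plus a trace of the steps. The design point is that steps such as branching and linting always run and always run the same way, while open-ended work is confined to the agent nodes.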
Strengths
- Proven scale — 1,300+ PRs merged weekly (as of February 2026), validated in production at one of the world's most demanding codebases
- Deep integration — MCP server provides agents the same context human engineers have (docs, tickets, build status, Sourcegraph search)
- Fast iteration — 10-second devbox spin-up enables rapid feedback loops
- Leverages existing infra — Uses same environments as human engineers, reducing agent-specific edge cases
- Parallelization — Engineers run multiple Minions simultaneously without git worktree conflicts
Cautions
- Requires massive codebase investment — Stripe has hundreds of millions of lines of code and years of devex tooling; this isn't replicable overnight
- Human review bottleneck — Agents write code but don't merge; humans must still review everything
- Stack-specific patterns — Ruby/Sorbet environment; some patterns may not transfer to other stacks
- Dedicated team required — a dedicated internal team maintains the system full-time (team composition not publicly disclosed as of February 2026)
- Not for sale — This is internal tooling, not a product you can buy
What Developers Say
Practitioner reaction on the Hacker News thread for Part 2 (131 points) centers on review quality and the lack of concrete examples:
"I am wondering if a lot of the human review is substantive or rubber stamping - as we see with long Pull Requests from humans. I know I would half-ass a review of a PR containing lots of robot code." — 3rodents
"Code reviews are also an educational moment for seniors teaching juniors as well as an opportunity for people who know a system to point out otherwise undocumented constraints of the system. If people slack on reviews with the agent it means these other externalities suffer." — fnord123
"The glass-half-full here is it's an incredible signal that one of the largest financial gateways in the world is able to do this with current capabilities. Personally, this is exciting." — trevorhinesley
"Where is the detail? Examples? Something concrete? ... Lots of generic statements everyone knows." — gas9S9zw3P9c
The recurring skeptical theme: Stripe has not published example PRs, so outsiders cannot judge whether the 1,300/week figure represents substantive changes or low-stakes mechanical work. No Stripe engineers responded in-thread as of June 2026.
Competitive Positioning
vs. Other In-House Agents
| System | Differentiation |
|---|---|
| StrongDM Factory | Minions require human review; StrongDM eliminates it entirely |
| Ramp Inspect | Similar architecture; Minions have deeper MCP integration |
| Coinbase Claudebot | Minions built on Goose fork; Coinbase appears Anthropic-native |
When to Reference Stripe Minions
- Reference when: Building in-house agents at 1,000+ engineer scale
- Consider alternatives when: Team under 100 engineers, no existing devex investment
- Buy instead when: Need agents in weeks, not quarters (consider Tembo, Claude Code, Codex)
Ideal Customer Profile
This is internal tooling, not a product for sale. The architecture is still worth studying if you are considering a similar in-house build.
Good fit for a similar build:
- Engineering organization with 1,000+ engineers
- Existing devex team (3+ engineers) that can be redirected
- Codebase exceeding 10M LOC with proprietary frameworks
- Mature internal tooling (ticketing, CI, monitoring)
Poor fit:
- Teams under 100 engineers (buy instead)
- Standard tech stack without proprietary frameworks
- No existing devex investment
- Need results in weeks, not quarters
Viability Assessment
| Factor | Assessment |
|---|---|
| Documentation Quality | Excellent (detailed two-part public blog series) |
| Replicability | Difficult (requires massive devex investment) |
| Open Source Foundation | Yes (Goose is open source) |
| Architecture Maturity | High (production-validated at scale) |
| Transferability | Medium (principles transfer; implementation is Stripe-specific) |
Stripe's detailed documentation makes Minions the reference architecture for enterprise in-house coding agents. While the specific implementation requires Stripe-level investment, the underlying patterns (MCP context, isolated sandboxes, CI integration, Slack invocation) are broadly applicable to other engineering organizations.
Bottom Line
Stripe Minions represent the gold standard for in-house coding agents at elite engineering organizations. The key insight: agents need the same context and tools as human engineers, not a bolted-on integration.
Key metrics: 1,300+ PRs/week (as of February 2026), 10-second devbox spin-up, nearly 500 MCP tools, max 2 CI rounds.
Architecture pattern: Slack invocation → isolated sandbox → MCP context → CI loop → human review → merge.
Recommended study for: Engineering leaders evaluating build vs. buy for coding agents, devex teams designing agent infrastructure.
Not recommended for: Small teams, organizations without existing devex investment, anyone expecting a product they can purchase.
Outlook: Stripe has defined the reference architecture and is iterating publicly — Part 2 (February 19, 2026) added Blueprints orchestration and cross-tool rule syncing. Expect commercial vendors (Tembo, Devin, etc.) to offer "Stripe Minions in a box" for organizations that can't build their own.
Research by Ry Walker Research
Disclosure: Author is CEO of Tembo, which offers agent orchestration as an alternative to building in-house.
Sources
- [1] Minions: Stripe's one-shot, end-to-end coding agents
- [2] Minions: Stripe's one-shot, end-to-end coding agents — Part 2
- [3] Block Goose Agent (GitHub)
- [4] Stripe Engineers Deploy Minions, Autonomous Agents Producing Thousands of Pull Requests Weekly (InfoQ)
- [5] Minions Part 2 discussion (Hacker News)
- [6] Stripe Minions analysis (X/@dejavucoder)