
Babysitter

Babysitter is an orchestration framework for Claude Code that enables deterministic, event-sourced workflow management with quality gates, human-in-the-loop approval, and resumable runs.

Key takeaways

  • Deterministic, event-sourced orchestration framework for Claude Code with resumable runs and complete audit trails
  • Quality convergence — agents iterate until quality targets are met, not just run once and hope
  • Human-in-the-loop breakpoints for structured approval gates during complex workflows
  • 2,000+ pre-built process definitions covering common development workflows

FAQ

What is Babysitter?

Babysitter is an orchestration framework for Claude Code that enables deterministic, event-sourced workflow management with quality gates, human approval checkpoints, and automatic iteration until quality targets are met.

How does Babysitter work?

Babysitter runs an iterate-execute-record loop: advance process, get pending effects, execute tasks, post results, repeat until complete. Everything is event-sourced in .a5c/runs/ for full resumability and audit trails.

How much does Babysitter cost?

Babysitter is free and open-source under the MIT license. It requires Claude Code as the underlying agent.

Who competes with Babysitter?

The closest competitors are Claude-Flow for swarm orchestration, wshobson/agents for plugin-based multi-agent coordination, BMAD Method for full agile lifecycle methodology, and Superpowers for skills-based methodology enforcement.

Executive Summary

Babysitter is an orchestration framework for Claude Code that brings deterministic, event-sourced workflow management to AI coding agents. Unlike frameworks that provide static skills or methodology guidance, Babysitter actively manages multi-step workflows with quality gates, human approval breakpoints, and automatic iteration until quality targets are met. [1]

| Attribute | Value |
| --- | --- |
| Company | A5C AI |
| Founded | 2026 |
| Funding | Not disclosed |
| License | MIT |
| GitHub Stars | 317 |

Product Overview

Babysitter sits between methodology frameworks (which tell agents what to do) and runtime orchestrators (which manage how agents execute). It provides an event-sourced execution loop where every step is recorded, resumable, and auditable. [1]

The core innovation is quality convergence: instead of running a workflow once, Babysitter iterates until quality targets are met, with agent scoring and parallel execution support. Human-in-the-loop breakpoints provide structured approval gates — not just "approve/reject" but context-rich checkpoints.
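As an illustration of quality convergence, the following is a minimal JavaScript sketch under stated assumptions. It is not Babysitter's API: `runUntilConverged`, `produce`, `score`, `targetScore`, and `maxIterations` are hypothetical names, and the scoring function stands in for whatever agent scoring a real process would use.

```javascript
// Hypothetical sketch of a quality-convergence loop (not Babysitter's actual API).
// produce(feedback) returns a candidate; score(candidate) returns a number in [0, 1].
function runUntilConverged({ produce, score, targetScore, maxIterations }) {
  const history = [];
  let feedback = null;
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const candidate = produce(feedback);
    const quality = score(candidate);
    history.push({ iteration, quality });
    if (quality >= targetScore) {
      return { converged: true, candidate, history };
    }
    // Feed the last attempt and its score back so the next attempt can improve on it.
    feedback = { previous: candidate, quality };
  }
  // The iteration cap stops the loop if the target is never reached.
  return { converged: false, candidate: null, history };
}

// Toy usage: each attempt passes one more check out of four.
let passing = 0;
const result = runUntilConverged({
  produce: () => ({ passingChecks: ++passing }),
  score: (candidate) => candidate.passingChecks / 4,
  targetScore: 0.75,
  maxIterations: 5,
});
console.log(result.converged, result.history.length);
```

The iteration cap is a design choice in this sketch: without it, a target that is never reached would loop indefinitely. A human breakpoint could plausibly sit where the loop returns, so that a reviewer sees the candidate together with its score history.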

Key Capabilities

| Capability | Description |
| --- | --- |
| Event-Sourced Execution | Complete journal of all events in .a5c/runs/ |
| Quality Convergence | Iterate until quality targets met, not one-shot |
| Human Breakpoints | Structured approval gates with full context |
| Parallel Execution | Run tasks concurrently with dependency management |
| Deterministic Replay | Resume or replay any run from any point |
| Process Library | 2,000+ pre-built process definitions |
| Agent Scoring | Evaluate agent output quality programmatically |
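Parallel execution with dependency management can be pictured as grouping tasks into "waves", where every task in a wave has all of its dependencies satisfied by earlier waves. The JavaScript sketch below is an assumption about how such scheduling might look, not Babysitter's scheduler: the task shape (`id`, `deps`) and the functions `planWaves` and `runWaves` are hypothetical.

```javascript
// Hypothetical sketch: group tasks into waves that can run concurrently.
// The { id, deps } task shape is an assumption, not Babysitter's process format.
function planWaves(tasks) {
  const done = new Set();
  let remaining = [...tasks];
  const waves = [];
  while (remaining.length > 0) {
    // A task is ready once every dependency finished in an earlier wave.
    const ready = remaining.filter((t) => t.deps.every((d) => done.has(d)));
    if (ready.length === 0) {
      const stuck = remaining.map((t) => t.id).join(", ");
      throw new Error("Cycle or unknown dependency among: " + stuck);
    }
    ready.forEach((t) => done.add(t.id));
    remaining = remaining.filter((t) => !done.has(t.id));
    waves.push(ready.map((t) => t.id));
  }
  return waves;
}

// Run each wave concurrently with Promise.all; waves themselves run in order.
async function runWaves(tasks, execute) {
  const byId = new Map(tasks.map((t) => [t.id, t]));
  const results = {};
  for (const wave of planWaves(tasks)) {
    const outputs = await Promise.all(
      wave.map((id) => execute(byId.get(id), results))
    );
    wave.forEach((id, i) => {
      results[id] = outputs[i];
    });
  }
  return results;
}

const tasks = [
  { id: "lint", deps: [] },
  { id: "unit-tests", deps: [] },
  { id: "review", deps: ["lint", "unit-tests"] },
];
// waves: [["lint", "unit-tests"], ["review"]]
console.log(planWaves(tasks));
```

In this sketch, "lint" and "unit-tests" share a wave because neither depends on the other, while "review" waits for both. Failing fast on a cycle or unknown dependency avoids a silent hang.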

Installation

Babysitter installs as a Claude Code plugin: [2]

npm install -g @a5c-ai/babysitter@latest @a5c-ai/babysitter-sdk@latest @a5c-ai/babysitter-breakpoints@latest
claude plugin marketplace add a5c-ai/babysitter
claude plugin install --scope user babysitter@a5c.ai
claude plugin enable --scope user babysitter@a5c.ai

Invoke via /babysitter:call or natural language in Claude Code.


Technical Architecture

Babysitter runs an iterate-execute-record loop:

  1. Advance process — Determine next steps from process definition
  2. Get pending effects — Identify tasks that need execution
  3. Execute tasks — Run tasks (potentially in parallel) via Claude Code subagents
  4. Post results — Record outcomes to event journal
  5. Repeat — Continue until all tasks complete or quality gates pass
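The five steps above can be sketched in a few lines of JavaScript. This is an illustration under assumptions rather than Babysitter's SDK: `runLoop`, `advance`, `executeTask`, and the `EFFECT_RESOLVED` event type are hypothetical names, and the journal is kept in memory instead of being written to disk.

```javascript
// Hypothetical sketch of the iterate-execute-record loop; function and event
// names are illustrative, not Babysitter's SDK.
function runLoop(advance, executeTask, journal = []) {
  for (;;) {
    // Steps 1 and 2: advance the process and collect the effects still pending.
    const pending = advance(journal);
    if (pending.length === 0) return journal; // nothing pending: the run is complete
    for (const effect of pending) {
      const result = executeTask(effect); // Step 3: execute the task
      // Step 4: post the result by appending an event to the journal.
      journal.push({ type: "EFFECT_RESOLVED", effectId: effect.id, result });
    }
    // Step 5: repeat with the updated journal.
  }
}

// Toy process: three steps in order; the next step is derived from the journal alone.
const steps = ["plan", "implement", "verify"];
const advance = (journal) => {
  const resolved = new Set(journal.map((e) => e.effectId));
  const next = steps.find((s) => !resolved.has(s));
  return next ? [{ id: next }] : [];
};

const executed = [];
const journal = runLoop(advance, (effect) => {
  executed.push(effect.id);
  return `${effect.id} ok`;
});
console.log(journal.map((e) => e.effectId));

// Resuming: a journal that already records "plan" continues from "implement".
const resumed = runLoop(advance, (effect) => `${effect.id} ok`, [journal[0]]);
console.log(resumed.length);
```

Because `advance` derives the next step purely from the journal, handing the loop a partial journal resumes the run without repeating recorded work. That property is what makes an event-sourced run resumable.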

Key Technical Details

| Aspect | Detail |
| --- | --- |
| Runtime | Node.js 20+ (22.x LTS recommended) |
| Agent | Claude Code (required) |
| State | Event-sourced journal in .a5c/runs/<runId>/ |
| Execution | Sequential and parallel task execution |
| Language | JavaScript |
| Open Source | Yes (MIT) |
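To show why an event-sourced journal makes replay possible, here is a small JavaScript sketch that rebuilds run state by folding over recorded events. The event types and state fields are hypothetical. The actual journal format under .a5c/runs/<runId>/ is not described in this profile, so none of the names below should be read as Babysitter's schema.

```javascript
// Hypothetical event shapes; the real journal format is not documented here.
function replay(events, upTo = events.length) {
  const state = { status: "pending", completedTasks: [], approvals: 0 };
  // Apply events in order; stopping early yields the state at that point in the run.
  for (const event of events.slice(0, upTo)) {
    switch (event.type) {
      case "RUN_STARTED":
        state.status = "running";
        break;
      case "TASK_COMPLETED":
        state.completedTasks.push(event.taskId);
        break;
      case "BREAKPOINT_APPROVED":
        state.approvals += 1;
        break;
      case "RUN_COMPLETED":
        state.status = "completed";
        break;
    }
  }
  return state;
}

const journal = [
  { type: "RUN_STARTED" },
  { type: "TASK_COMPLETED", taskId: "plan" },
  { type: "BREAKPOINT_APPROVED" },
  { type: "TASK_COMPLETED", taskId: "implement" },
  { type: "RUN_COMPLETED" },
];

console.log(replay(journal, 3)); // state just after the approval
console.log(replay(journal)); // final state
```

State is never stored directly in this sketch. It is always derived from the events, so replaying the same journal gives the same state every time, and truncating the journal gives the state at any earlier point.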

Strengths

  • Deterministic workflows — Event-sourced execution means runs are reproducible and resumable
  • Quality convergence — Iterates until quality targets are met, unlike one-shot frameworks
  • Human-in-the-loop — Structured breakpoints with rich context for meaningful approval gates
  • Audit trail — Complete journal of every event, task, and decision
  • Large process library — 2,000+ pre-built processes cover common development workflows
  • Composable — Works with existing Claude Code subagents, skills, and tools
  • Active development — Most recent push on Feb 23 at the time of writing; rapidly evolving

Cautions

  • Claude Code only — Locked to a single agent runtime; no Codex, OpenCode, or Gemini support
  • Early stage — 317 stars, 7 weeks old; limited production validation
  • Complexity — Event-sourced architecture adds overhead vs. simpler skill-based approaches
  • Process library quality — 2,000+ processes is a bold claim for a 7-week-old project; quality unverified
  • Unknown company — A5C AI has no disclosed funding, team, or track record
  • Node.js dependency — Requires global npm install and Claude Code plugin system

Pricing & Licensing

| Tier | Price | Includes |
| --- | --- | --- |
| Open Source | Free | Full framework, MIT license |

Hidden costs: Requires paid access to Claude Code, through an Anthropic subscription or metered API usage.


Competitive Positioning

Direct Competitors

| Competitor | Differentiation |
| --- | --- |
| Claude-Flow | Claude-Flow does swarm orchestration with consensus protocols; Babysitter does deterministic event-sourced workflows with quality convergence |
| wshobson/agents | wshobson/agents is plugin-based multi-agent coordination; Babysitter is process-driven single-agent orchestration |
| BMAD Method | BMAD provides agile lifecycle methodology; Babysitter provides runtime workflow execution with quality gates |
| Superpowers | Superpowers enforces methodology via skills; Babysitter enforces via deterministic process execution |

When to Choose Babysitter Over Alternatives

  • Choose Babysitter when: You need deterministic, resumable workflows with quality convergence and audit trails in Claude Code
  • Choose Claude-Flow when: You need distributed multi-agent coordination with formal consensus
  • Choose wshobson/agents when: You want composable plugin-based agents with broad tool coverage
  • Choose Superpowers when: You want methodology enforcement without runtime orchestration overhead

Ideal Customer Profile

Best fit:

  • Teams using Claude Code who need structured, multi-step development workflows
  • Organizations requiring audit trails and approval gates for AI-generated code
  • Developers working on complex features that benefit from iterative quality improvement
  • Teams that want "iterate until right" instead of "run once and fix manually"

Poor fit:

  • Teams not using Claude Code (no support for other agent runtimes)
  • Developers wanting lightweight skills without orchestration overhead
  • Organizations needing multi-model or multi-agent flexibility
  • Quick one-off tasks where orchestration is overkill

Viability Assessment

| Factor | Assessment |
| --- | --- |
| Financial Health | Unknown — no disclosed funding |
| Market Position | Niche — Claude Code-specific orchestration |
| Innovation Pace | Rapid — active daily commits, 7 weeks old |
| Community/Ecosystem | Nascent — 317 stars, 13 forks |
| Long-term Outlook | Uncertain — depends on Claude Code ecosystem growth and A5C AI viability |

The event-sourced, quality-convergence approach is genuinely novel in the skills framework space. Most competitors provide either static methodology (Superpowers, BMAD Method) or distributed coordination (Claude-Flow, wshobson/agents). Babysitter's niche, deterministic workflow execution with iterative quality improvement, is underserved. The risks are Claude Code lock-in and the unknown viability of A5C AI.


Bottom Line

Babysitter brings software engineering rigor — event sourcing, deterministic replay, quality convergence — to AI agent orchestration. The approach is more sophisticated than static skills frameworks and more focused than distributed multi-agent platforms.

Recommended for: Claude Code users who need structured, auditable, iterative workflows for complex development tasks.

Not recommended for: Teams wanting agent flexibility, lightweight skill catalogs, or multi-model support.

Outlook: If Babysitter proves the quality-convergence model works in practice, expect the pattern to be adopted by larger orchestration platforms. The Claude Code lock-in limits addressable market, but the core innovation — iterate until quality targets are met — is platform-agnostic in principle.


Research by Ry Walker Research • methodology