Key takeaways
- First-party sandboxing has arrived: Anthropic's open-source sandbox-runtime (4,388 stars) powers Claude Code's /sandbox tool and Codex ships Landlock + seccomp sandboxing by default — eating the niche the standalone wrappers were built to fill
- Survivors differentiate on what first-party tools won't do: cross-agent support and credential brokering — nono (2,643 stars, from Sigstore creator Luke Hinds) keeps API keys outside the sandbox entirely via a credential-injection proxy
- Three of the five original members are dormant or quiet: landrun stalled in October 2025 (frozen at Landlock ABI v5 while the kernel reached v9), yolo-cage has had no commits since February 2026, and shai is down to maintenance releases at 40 stars
- yolobox is the healthy original — 603 stars, weekly releases (v0.18.4, June 2026), and a pivot from simple safety net to local parallel-agent workbench
- The ~27-tool gold rush cataloged in the wincent "new wave of AI agent sandboxes" HN thread is consolidating — around OS primitives (Landlock, Seatbelt, bubblewrap) rather than containers
FAQ
What's the best local sandbox for AI coding agents?
If you use Claude Code, enable the built-in /sandbox — it's powered by Anthropic's open-source sandbox-runtime. For multi-agent setups, nono adds kernel enforcement plus credential proxying on macOS and Linux, fence adds command deny rules and SSH filtering, and yolobox remains the simplest container-based option.
Do I need a local sandbox if my agent has built-in safety?
Less than before. Claude Code's /sandbox and Codex's default Landlock + seccomp sandbox now provide real OS-level isolation, not just permission prompts. Standalone tools still earn their keep if you run multiple agents (one policy across all of them), want credentials brokered outside the sandbox, or need policy features like command deny rules the first-party sandboxes lack.
How do local sandboxes differ from cloud sandboxes like E2B?
Local sandboxes run on your machine and protect your filesystem. Cloud sandboxes (E2B, Daytona, Sprites) provide remote isolated environments via API. Local is for individual developers; cloud is for production agent infrastructure.
Which sandbox works on macOS?
anthropic-sandbox-runtime, nono, and fence all use Apple's Seatbelt natively on macOS. yolobox (Docker/Podman/Apple Containers) and shai (Docker) work via containers. litterbox and landrun are Linux-only, and yolo-cage's macOS path (QEMU) is experimental and unmaintained.
Executive Summary
AI coding agents are most productive when you let them run without permission prompts — but one bad command and your home directory is gone. A category of local sandboxing tools emerged to solve this: run your agent in full-auto mode while keeping your machine safe.
Since this comparison's March 2026 snapshot, the story has changed. First-party sandboxing arrived. Anthropic open-sourced sandbox-runtime — the isolation layer behind Claude Code's /sandbox tool, which cut permission prompts by 84% in internal testing — and Codex ships a Landlock + seccomp sandbox by default. The agent vendors are eating the niche the standalone wrappers were built to fill, and the original wave shows it: three of the five tools profiled in March are now dormant or quiet.
This is distinct from cloud sandbox platforms (E2B, Daytona, Sprites) which provide remote isolated environments via API. Local sandboxes run on your machine, wrapping your existing agent CLI in a protective layer.
8 tools compared: anthropic-sandbox-runtime, nono, landrun, fence, yolobox, litterbox, yolo-cage, and shai — plus Codex's built-in sandbox.
Key findings:
- The first-party entrant leads: Anthropic's sandbox-runtime (4,388 stars) is the reference implementation the rest of the category now defines itself against
- The strongest independent is nono (2,643 stars) — kernel enforcement on both macOS and Linux, plus a credential-injection proxy that keeps API keys outside the sandbox entirely, from Sigstore creator Luke Hinds
- The market consolidated around OS primitives — Landlock, Seatbelt, bubblewrap — over containers; every healthy new entrant is container-free
- 3 of the 5 original members are dormant or quiet: landrun (stalled October 2025), yolo-cage (stalled February 2026), shai (maintenance mode, 40 stars)
- yolobox is the healthy original — 603 stars, weekly releases, and a pivot toward being a local parallel-agent workbench
- Survivors differentiate on cross-agent support and credential brokering — the two things first-party sandboxes structurally won't provide
The Problem
Every major AI coding agent now ships with a "yolo mode":
| Agent | Full-Auto Flag |
|---|---|
| Claude Code | --dangerously-skip-permissions |
| Codex | --ask-for-approval never |
| Gemini CLI | --yolo |
| Copilot | --yolo |
These modes are where agents are most productive — no interruptions, no decision fatigue, no waiting. But they're also where agents are most dangerous. A misinterpreted prompt can rm -rf ~, overwrite your SSH keys, push to main, or exfiltrate secrets through a dependency install.
Local sandboxes exist to decouple agent autonomy from machine safety. Let the agent go wild inside the sandbox. Keep your actual system untouchable.
Status Check
The defining update of this refresh: the original wave is splitting into survivors and casualties.
| Tool | Status (June 11, 2026) | Evidence |
|---|---|---|
| anthropic-sandbox-runtime | 🟢 Active (beta) | v0.0.54 (June 4), repo pushed June 11; still labeled "Beta Research Preview," with 118 open issues and community reports of slow PR responses |
| nono | 🟢 Healthy | v0.62.0 (June 7), same-day commits, weekly releases |
| fence | 🟢 Healthy | ~60 releases since December 2025, pushed June 11 |
| yolobox | 🟢 Healthy | v0.18.4 (June 9), weekly releases, 0 open issues |
| litterbox | 🟢 Active | v0.6.0 (May 27), pushed June 10 — small but steady |
| shai | 🟡 Quiet | Maintenance-grade monthly releases (v0.0.11, June 1); 40 stars, ~455 monthly npm downloads |
| yolo-cage | 🔴 Stalled | No commits since February 1, 2026 — two days after launch |
| landrun | 🔴 Stalled | No commits since October 1, 2025; no release since v0.1.14 (April 2025); frozen at Landlock ABI v5 while the kernel reached v9 |
A dormant security tool is worse than a dormant utility: blocklists, secret patterns, and kernel ABI support all rot. landrun still works — the enforcement lives in the kernel — but it exposes none of the Landlock v6–v9 capabilities (scoped signals, audit logging, Unix socket controls) the kernel has since shipped.
Comparison Matrix
| Tool | Language | Stars | Isolation | macOS | Network Control | Status |
|---|---|---|---|---|---|---|
| anthropic-sandbox-runtime | TypeScript | 4,388 | Seatbelt (macOS), bubblewrap (Linux) | ✅ | Domain-allowlist proxy | 🟢 Active (beta) |
| nono | Rust | 2,643 | Landlock (Linux), Seatbelt (macOS) | ✅ | Allowlist proxy + credential injection | 🟢 Healthy |
| landrun | Go | 2,217 | Landlock LSM | ❌ Linux only | TCP bind/connect rules | 🔴 Stalled (Oct 2025) |
| fence | Go | 794 | Seatbelt (macOS), bubblewrap (Linux) | ✅ | Default-deny + domain allowlist proxy | 🟢 Healthy |
| yolobox | Go | 603 | Docker/Podman/Apple Containers | ✅ | All-or-nothing | 🟢 Healthy |
| litterbox | Rust | 115 | Podman + Landlock | ❌ Linux only | Limited | 🟢 Active |
| yolo-cage | Python | 110 | Vagrant VM + K8s | ⚠️ Experimental | Egress proxy + secret scanning | 🔴 Stalled (Feb 2026) |
| shai | Go | 40 | Docker | ✅ | Allowlist per resource set | 🟡 Quiet |
| Codex built-in | — | — | Landlock + seccomp (Linux), Seatbelt (macOS) | ✅ | Workspace-scoped | 🟢 First-party default |
Two Architectures
The March version of this report framed the market as permissive vs restrictive. That axis still exists, but the more important split is now containers vs OS primitives — and the market picked a winner.
Container-based (yolobox, shai, litterbox, yolo-cage): wrap the agent in Docker/Podman/VM isolation. Broader containment, but at the cost of a runtime dependency, image management, and (for everything except yolo-cage's VM) a shared-kernel trust boundary.
Container-free OS primitives (anthropic-sandbox-runtime, nono, fence, landrun, both built-in sandboxes): apply Seatbelt, Landlock, bubblewrap, or seccomp policies directly to the process. No daemon, no images, near-zero overhead, restrictions inherited by every child process — and on macOS, nothing to install.
Every healthy new entrant since late 2025 is container-free, and both agent vendors chose OS primitives for their first-party sandboxes. The permissive/restrictive philosophy debate hasn't gone away — yolobox is still "let it rip inside the box" while nono and fence are default-deny — but the architectural question is settled.
Product Profiles
anthropic-sandbox-runtime (srt)
"Enforcing filesystem and network restrictions on arbitrary processes at the OS level, without requiring a container."
The first-party entrant that pressures the whole category. srt is the open-sourced isolation layer behind Claude Code's /sandbox sandboxed Bash tool: Seatbelt profiles on macOS, bubblewrap on Linux, plus an HTTP/SOCKS5 proxy enforcing a domain allowlist on every child process. Anthropic reports an 84% reduction in permission prompts in internal testing.
- Isolation: Seatbelt (macOS), bubblewrap + network namespaces (Linux/WSL2) — no containers
- Defaults: reads allowed (deny rules available), writes denied, network denied
- Traction: 4,388 stars, 320 forks in under eight months; Apache-2.0
- Caveats: "Beta Research Preview" in the anthropic-experimental org; TLS-blind proxy (domain fronting, DNS exfiltration demonstrated); credentials readable by default without explicit denyRead rules; 118 open issues
- Best for: Claude Code users (enable /sandbox) and teams wanting the reference architecture as a library
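The defaults listed above (reads allowed unless denied, writes denied unless allowed) can be modeled in a few lines. This is a minimal Python sketch of that decision logic only: `FsPolicy`, `deny_read`, and `allow_write` are hypothetical names, not srt's configuration schema, and real enforcement happens in Seatbelt or bubblewrap rather than in application code.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


def _is_under(path: str, root: str) -> bool:
    """True if path equals root or sits below it (lexical check, no symlink resolution)."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


@dataclass
class FsPolicy:
    # Hypothetical field names, for illustration only.
    deny_read: list = field(default_factory=list)
    allow_write: list = field(default_factory=list)

    def can_read(self, path: str) -> bool:
        # Reads are allowed by default; only explicit deny rules block them.
        return not any(_is_under(path, root) for root in self.deny_read)

    def can_write(self, path: str) -> bool:
        # Writes are denied by default; only explicit allow rules permit them.
        return any(_is_under(path, root) for root in self.allow_write)


policy = FsPolicy(deny_read=["/home/dev/.ssh"], allow_write=["/home/dev/project"])
```

With an empty policy, `FsPolicy().can_read("/home/dev/.aws/credentials")` is true, which is the credentials caveat in miniature: secret files stay readable until a deny rule names them.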
nono
"Kernel-enforced capability sandbox for AI agents."
The strongest independent — and the fastest-growing tool in the niche, overtaking landrun's star count in a fraction of the time. Built by Luke Hinds (creator of Sigstore, co-founder of Stacklok) under his new company Always Further. Kernel enforcement via Landlock on Linux and Seatbelt on macOS, irreversible once applied and inherited by all child processes.
- Isolation: Landlock (Linux), Seatbelt (macOS), WSL2; native Windows planned
- Credential injection proxy: API keys never enter the sandboxed process — a trusted proxy injects them into outbound requests from the OS keystore or 1Password
- Extras: filesystem snapshots/rollback, cryptographic audit logs, Sigstore attestation of CLAUDE.md/SKILLS.md, profile registry (nono pull) for Claude Code, Codex, OpenCode and more
- Traction: 2,643 stars, 184 forks since January 31, 2026; v0.62.0 (June 7); Apache-2.0
- Caveats: pre-1.0 with 161 open issues; community discussion thinner than the star count suggests; enterprise pricing undefined
- Best for: Multi-agent users on macOS/Linux where API-key exfiltration is the top threat
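The credential-injection idea is easier to see in code. The sketch below is an assumption-laden illustration in Python, not nono's implementation: the sandboxed agent only ever holds a dummy placeholder, and a trusted proxy outside the sandbox swaps in the real key just before forwarding. `PLACEHOLDER`, `inject_credentials`, and the `keystore` callable are all hypothetical names.

```python
from typing import Callable, Dict, Mapping, Optional

# Hypothetical dummy value; the only "key" the sandboxed agent ever sees.
PLACEHOLDER = "SANDBOX_PLACEHOLDER"


def inject_credentials(
    host: str,
    headers: Mapping[str, str],
    keystore: Callable[[str], Optional[str]],
) -> Dict[str, str]:
    """Return the headers to forward upstream, with the placeholder replaced.

    Runs in the trusted proxy, outside the sandbox. `keystore` stands in for
    an OS keystore or password-manager lookup keyed by destination host.
    """
    out = dict(headers)
    secret = keystore(host)
    if secret is None:
        # No credential registered for this host: forward unchanged.
        return out
    for name, value in headers.items():
        if PLACEHOLDER in value:
            out[name] = value.replace(PLACEHOLDER, secret)
    return out
```

Because the lookup is keyed by destination host, a request the agent sends to an unexpected host carries only the placeholder, so there is nothing useful to exfiltrate.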
fence
"Sandbox CLI commands with network/filesystem restrictions."
A Tusk spinout (now its own fencesandbox GitHub org) that openly credits Anthropic's sandbox-runtime as its inspiration — and adds the policy features srt lacks: command deny rules (rm -rf /, git push), SSH command filtering, inbound port exposure, and per-agent templates for Claude Code, Codex, Amp, Gemini CLI, Copilot, OpenCode, and Factory Droid.
- Isolation: Seatbelt (macOS), bubblewrap + socat (Linux) — container-free
- Defaults: outbound network blocked, filesystem writes denied; JSONC config (fence.jsonc) with template inheritance
- Traction: 794 stars and ~60 tagged releases in under six months (created December 18, 2025); Apache-2.0
- Caveats: explicitly not a hostile-code boundary per its own security docs; proxy-based allowlisting fails on non-proxy-aware network stacks; no resource limits
- Best for: Developers who want one prefix command (fence -t code -- claude) putting default-deny policy around any agent or install script
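Command deny rules come down to matching a command line against a blocklist before it runs. Here is a minimal Python sketch of token-prefix matching, assuming a hypothetical `DENY_RULES` list; it is not fence's rule syntax or matching algorithm.

```python
import shlex

# Hypothetical deny list: each rule is a sequence of leading tokens.
DENY_RULES = [
    ["rm", "-rf", "/"],
    ["git", "push"],
]


def is_denied(command: str, rules=DENY_RULES) -> bool:
    """True if the command's leading tokens exactly match any deny rule."""
    tokens = shlex.split(command)
    return any(tokens[: len(rule)] == rule for rule in rules)
```

Under these rules `is_denied("git push origin main")` is true and `is_denied("git status")` is false. The sketch is deliberately naive: `sh -c 'git push'` slips through because the denied tokens are nested inside one shell argument, which is one reason command filtering works better as a guardrail than as a hostile-code boundary.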
yolobox
"Let your AI go full send. Your home directory stays home."
The healthiest of the original five. One command (yolobox claude) drops you into a Docker container with your project mounted, all agent CLIs pre-installed and pre-aliased to skip permissions, full sudo inside. Since March it has been pivoting from safety net toward local parallel-agent workbench — fork isolation, clipboard bridge, runtime context manifest, .localhost reverse-proxy integration.
- Isolation: Docker/Podman/Apple Containers; home directory not mounted
- Traction: 603 stars (up from 536 in March), v0.18.4 (June 9, 2026), still shipping roughly weekly, 0 open issues
- Network: full access by default, no_network for air-gapped mode; still no fine-grained allowlists
- Best for: Developers who want maximum agent productivity with a simple safety net, and increasingly, parallel local agents
litterbox
"Somewhat isolated development environments."
Still the only tool sandboxing your entire GUI dev environment: Wayland socket forwarding runs editors and agents inside a Podman container, and a custom SSH agent prompts before every key signing. Small but genuinely active — v0.6.0 shipped May 27, 2026, and the repo was pushed June 10.
- Isolation: Podman containers + Landlock (Linux only)
- SSH agent: per-key exposure with confirmation pop-ups before every signing operation
- Traction: 115 stars (up from 66 in March) — slow, steady, early
- Honest disclaimers: documents its own limits (shared kernel, clipboard access, audio risk) better than most tools document features
- Best for: Linux developers who want their whole dev environment sandboxed, not just agents
shai
"Sandboxing shell for AI coding agents."
The "cellular development" pioneer — read-only workspace by default, opt-in write per subdirectory, and Resource Sets (named bundles of allowed HTTP destinations, mounts, ports, env vars) committed to the repo as team policy. The ideas remain genuinely differentiated; the traction hasn't followed.
- Isolation: Docker containers, non-root, read-only mounts
- Status: alive but quiet — maintenance-grade monthly releases (v0.0.11, June 1, 2026), 40 stars essentially unchanged since March, ~455 monthly npm downloads
- Best for: Security-conscious teams who want per-component access control and accept real abandonment risk
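A Resource Set, as described above, is a named bundle of allowances that a run opts into. The Python sketch below shows one way such bundles could compose by union. The `ResourceSet` shape, the `effective` helper, and the example registry are assumptions for illustration and do not reflect shai's actual config format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSet:
    # Hypothetical shape mirroring the four kinds of allowance named above.
    http: frozenset = frozenset()
    mounts: frozenset = frozenset()
    ports: frozenset = frozenset()
    env: frozenset = frozenset()


def effective(selected, registry) -> ResourceSet:
    """Union the named resource sets a run has opted into."""
    http, mounts, ports, env = set(), set(), set(), set()
    for name in selected:
        rs = registry[name]  # unknown names raise KeyError: fail closed
        http |= rs.http
        mounts |= rs.mounts
        ports |= rs.ports
        env |= rs.env
    return ResourceSet(frozenset(http), frozenset(mounts), frozenset(ports), frozenset(env))


# Example registry with made-up set names and values.
REGISTRY = {
    "npm": ResourceSet(http=frozenset({"registry.npmjs.org"})),
    "dev-db": ResourceSet(ports=frozenset({5432}), env=frozenset({"DATABASE_URL"})),
}
```

Selecting nothing yields an empty `ResourceSet`, which matches the deny-by-default posture: access exists only where a named set grants it.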
yolo-cage
"AI coding agents that can't exfiltrate secrets or merge their own PRs."
The most security-focused architecture of the original wave: an egress proxy scans all HTTP for secrets (sk-ant-*, AKIA*, ghp_*), alongside branch isolation, blocked gh pr merge/gh repo delete, and TruffleHog pre-push scans. But development stopped two days after launch: no commits since February 1, 2026.
- Isolation: Vagrant VM + MicroK8s pods (8GB RAM, 4 CPUs)
- Status: stalled at 110 stars; not archived, but the blocklists and secret patterns are rotting in place
- Best for: Reading the architecture. The egress-proxy-plus-dispatcher design is worth studying; adopting it is not advisable
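For readers studying the design, the secret-scanning half of the egress proxy can be sketched as a handful of regular expressions run over each outbound body. In the Python sketch below, the three prefixes come from the description above; the character classes and lengths are assumptions for illustration, not yolo-cage's actual patterns.

```python
import re

# Prefixes are from the text; lengths and character classes are assumed.
SECRET_PATTERNS = {
    "anthropic": re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}"),
    "aws_access_key": re.compile(r"AKIA[A-Z0-9]{16}"),
    "github_pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
}


def find_secrets(body: str) -> list:
    """Return the names of every pattern that matches the outbound body."""
    return [name for name, pat in SECRET_PATTERNS.items() if pat.search(body)]


def should_block(body: str) -> bool:
    """An egress proxy built this way would drop any request with a match."""
    return bool(find_secrets(body))
```

A pattern list like this is exactly what goes stale in an unmaintained tool: a new key format from any provider passes through unmatched until someone adds a rule.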
landrun
"Sandbox any Linux process using Landlock. No root. No containers."
The tool that proved kernel primitives were the right approach — then stopped. landrun wraps any Linux process in Landlock LSM filesystem and TCP rules with near-zero overhead, and at 2,217 stars it validated the entire container-free category. But there have been no commits since October 1, 2025 and no release since v0.1.14 (April 2025), leaving it frozen at Landlock ABI v5 while the kernel advanced to v9.
- Isolation: Landlock LSM (kernel-level, no containers), Linux 5.13+
- Status: stalled; what it does it still does well (enforcement is in the kernel), but nobody is extending it
- Best for: Linux users who want the smallest auditable binary and accept an unmaintained dependency — nono now covers the same ground, actively, on two platforms
Codex Built-in Sandbox
"OS-enforced sandbox that limits what Codex can touch."
OpenAI's first-party answer: Codex CLI runs commands inside a Landlock + seccomp sandbox by default on Linux and Seatbelt on macOS, scoped to the workspace with configurable approval policies. Not a standalone tool — but together with Claude Code's /sandbox, it means both major agents now ship the protection this category was invented to bolt on.
Isolation Technologies Compared
| Technology | Used By | Container Required | Root Required | macOS | Kernel Sharing |
|---|---|---|---|---|---|
| Seatbelt (sandbox-exec) | srt, nono, fence, Codex built-in | No | No | ✅ (macOS-only) | Yes (same process) |
| Landlock LSM | landrun, nono, litterbox, Codex built-in | No | No | ❌ | Yes (same process) |
| bubblewrap | srt, fence | No | No | ❌ | Yes |
| seccomp | Codex built-in | No | No | ❌ | Yes |
| Docker | yolobox, shai | Yes | No (daemon does) | ✅ | Yes |
| Podman | litterbox | Yes | No (rootless) | ❌ | Yes |
| Vagrant/VM | yolo-cage | Yes (VM) | No | ⚠️ | No |
Seatbelt is what makes the new generation cross-platform: srt, nono, and fence all use Apple's built-in sandbox-exec on macOS — nothing to install. The shared risk: Apple has long marked sandbox-exec deprecated.
Landlock is the kernel-native Linux option — irreversible, inherited by children, unprivileged. landrun pioneered it; nono carries it forward actively.
Proxy-based network filtering (srt, nono, fence) is the common new pattern: the OS blocks direct outbound, local HTTP/SOCKS proxies allow listed domains through. It's also the common weakness — srt's docs acknowledge domain fronting and DNS tunneling bypasses.
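The allowlist decision inside such a proxy is a hostname comparison. Below is a minimal Python sketch, assuming a hypothetical `*.example.com` wildcard convention; none of the three tools' actual matching rules are reproduced here.

```python
def host_allowed(host: str, allowlist: list) -> bool:
    """Match a hostname against exact entries and '*.domain' wildcard entries."""
    host = host.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower()
        if entry.startswith("*."):
            # Compare against ".domain" so "evilgithub.com" cannot match
            # "*.github.com"; the bare domain itself is not covered either.
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False
```

The check only sees the hostname the client asks for, not the encrypted traffic that follows. That is why a TLS-blind proxy can be bypassed by domain fronting: the request names an allowed host while the payload is routed elsewhere.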
Containers and VMs still provide the broadest containment (separate filesystem, PID namespace, resource limits) — the case for yolobox's approach — at the cost of a runtime dependency.
Differentiators That Matter Now
With OS-level isolation becoming table stakes, the surviving tools compete on what's layered around it:
| Differentiator | Who Has It | Why It Matters |
|---|---|---|
| Cross-agent support | nono, fence, yolobox | First-party sandboxes serve one agent; most power users run several |
| Credential brokering | nono (injection proxy) | Keys outside the sandbox can't be exfiltrated — structurally stronger than path blocking |
| Command deny rules | fence | Block git push or rm -rf / by policy, not hope |
| SSH protection | fence (command filtering), litterbox (per-key signing prompts) | Git operations without exposing the whole agent socket |
| Parallel-agent workflow | yolobox (fork isolation), nono (multiplexing) | The emerging use case: many agents, one machine |
| Attestation & audit | nono (Sigstore signing of CLAUDE.md, audit logs) | Prompt-injection supply chain is the next threat surface |
| Team-sharable policy | shai (.shai/config.yaml), fence (fence.jsonc), nono (profile registry) | Security policy that travels with the repo |
When to Use What
By Situation
| Situation | Best Tool | Why |
|---|---|---|
| You use Claude Code | Built-in /sandbox (srt) | Already installed; 84% fewer prompts |
| You use Codex | Built-in sandbox | Landlock + seccomp on by default |
| Multiple agents, credential paranoia | nono | One policy across agents; keys never enter the sandbox |
| Wrap anything in default-deny, zero setup | fence | One prefix command, command deny rules, agent templates |
| Maximum agent productivity, simple safety | yolobox | Full sudo in a container, home dir not mounted |
| Entire IDE sandboxed (Linux) | litterbox | Wayland forwarding + confirming SSH agent |
| Per-component team policy | shai | Resource Sets in version control — if you accept the project risk |
By Platform
| Platform | Options |
|---|---|
| macOS | srt, nono, fence (all Seatbelt, container-free); yolobox, shai (containers) |
| Linux | Everything |
| Windows | WSL2 only (srt, nono, fence via WSL); native support is nobody's story yet |
The Built-in Sandbox Question, Answered
The March version of this report asked whether built-in agent sandboxes would make standalone tools unnecessary, and concluded "not yet." Three months later, the answer is: mostly, for single-agent users.
- Claude Code now ships /sandbox: OS-enforced Seatbelt/bubblewrap isolation with proxy-based network allowlisting, generally available, with managed-settings enforcement for organizations, alongside its permission system
- Codex runs Landlock + seccomp sandboxing by default on Linux and Seatbelt on macOS
What's left for standalone tools is real but narrower:
- Cross-agent coverage — first-party sandboxes serve their own agent. If you run Claude Code, Codex, and Gemini CLI, only a third-party tool gives you one policy across all of them.
- Credential brokering — no first-party sandbox keeps API keys outside the agent process; nono's injection proxy does.
- Policy depth — srt has no command deny rules, no SSH filtering, and read-denylist-only file policy; fence and forks exist precisely to fill those gaps.
- Maintenance trust — srt is a "Beta Research Preview" with a known pattern of slow PR responses; some teams will prefer a tool whose only job is sandboxing.
The predicted endgame — built-ins get good enough for most users, standalone tools survive for power users — is no longer a prediction. It's the current market structure.
Market Context
The wincent "Ask HN: The new wave of AI agent sandboxes?" thread cataloged roughly 27 sandbox tools launched within a year — a genuine gold rush. The consolidation since is visible in this report's own membership: of the five tools profiled in March 2026, two are stalled, one is in maintenance mode, and the survivors are the ones that found a differentiated reason to exist beyond "isolation" — which the agent vendors now ship themselves.
The local sandbox category remains distinct from cloud sandboxes (E2B, Daytona, Sprites, Modal):
| | Local Sandboxes | Cloud Sandboxes |
|---|---|---|
| Runs on | Your machine | Remote servers |
| Protects | Your filesystem | Multi-tenant isolation |
| Use case | Individual dev | Production agent infra |
| Billing | Free (open source) | Usage-based |
| Networking | Host network | Isolated network |
| Persistence | Your disk | Ephemeral or managed |
Most developers will use both: local sandbox for development, cloud sandbox for production.
Bottom Line
| Tool | Stars | Status | Best For |
|---|---|---|---|
| anthropic-sandbox-runtime | 4,388 | 🟢 Active (beta) | Claude Code users; the reference architecture |
| nono | 2,643 | 🟢 Healthy | Cross-agent kernel enforcement + credential brokering |
| landrun | 2,217 | 🔴 Stalled (Oct 2025) | Reading; nono absorbed its niche |
| fence | 794 | 🟢 Healthy | Default-deny wrapper with command/SSH policy |
| yolobox | 603 | 🟢 Healthy | Simplest full-auto experience; parallel-agent workbench |
| litterbox | 115 | 🟢 Active | Linux GUI dev environment isolation |
| yolo-cage | 110 | 🔴 Stalled (Feb 2026) | Studying the egress-proxy architecture |
| shai | 40 | 🟡 Quiet | Per-component team policy, with abandonment risk |
| Codex built-in | — | 🟢 First-party | Codex users — it's already on |
The local sandbox space is no longer early and fragmented; it's consolidating, fast. First-party sandboxing from Anthropic and OpenAI made OS-level isolation the default expectation, most of the original wrapper generation is dormant or quiet, and the tools still growing (nono, fence, yolobox) each picked a job the built-ins won't do: cross-agent policy, credential brokering, parallel-agent workflows.
The agents didn't get safe enough to make sandboxes unnecessary. The sandboxes got absorbed into the agents — and what survives outside them is the part the agent vendors can't or won't build.
Sources
- [1] yolobox GitHub Repository
- [2] yolo-cage GitHub Repository
- [3] shai GitHub Repository
- [4] litterbox GitHub Repository
- [5] landrun GitHub Repository
- [6] Codex Agent Approvals & Security
- [7] Claude Code Permissions Documentation
- [8] Ask HN: The new wave of AI agent sandboxes?
- [9] Show HN: Yolobox
- [10] Show HN: yolo-cage
- [11] nono GitHub Repository
- [12] Fence GitHub Repository
- [13] Anthropic Sandbox Runtime GitHub Repository
- [14] Anthropic Engineering: Making Claude Code more secure and autonomous with sandboxing