Key takeaways
- Amazon's entry in the AI agent sandbox category: fully managed, microVM-isolated code execution (Python, JavaScript, TypeScript) inside Bedrock AgentCore, generally available since October 13, 2025
- Enterprise posture is the differentiator — sessions extendable to 8 hours, 5 GB file transfer via S3, CloudTrail audit trails, VPC/PrivateLink support, IAM-native access control
- March 2026 security research (BeyondTrust, Sonrai) showed DNS-based exfiltration escaping "sandbox" network mode and credential-extraction paths via the microVM metadata service — AWS called the DNS behavior intended functionality and pointed customers to VPC mode and DNS firewalls
- Consumption pricing at $0.0895/vCPU-hour and $0.00945/GB-hour, billed per second with idle and I/O wait free — no per-sandbox or per-seat fees
FAQ
What is AWS AgentCore Code Interpreter?
A fully managed AWS service that lets AI agents write and execute Python, JavaScript, and TypeScript in isolated microVM sandbox environments, with session state, S3 file access, and CloudTrail logging, as part of the Amazon Bedrock AgentCore platform.
How much does AgentCore Code Interpreter cost?
Consumption-based: $0.0895 per vCPU-hour and $0.00945 per GB-hour of memory, billed per second with a 1-second minimum; I/O wait and idle time are free. AWS's own example prices 30,000 monthly executions at about $109.
Is AgentCore Code Interpreter secure?
Each session runs in a dedicated microVM with compute-based isolation, but 2026 research showed DNS queries escape the default "sandbox" network mode and metadata-service credentials can be extracted; AWS recommends VPC mode, DNS firewalls, and least-privilege execution roles.
How is AgentCore Code Interpreter different from E2B?
E2B is a sandbox-first startup with ~150ms Firecracker cold starts and an open-source core; AgentCore Code Interpreter is a closed AWS service that trades startup-speed benchmarks for IAM, VPC, CloudTrail, and the rest of the AWS enterprise envelope.
Executive Summary
AgentCore Code Interpreter is Amazon's answer to the AI agent sandbox question: a fully managed service inside Amazon Bedrock AgentCore that lets agents write, execute, and debug Python, JavaScript, and TypeScript in isolated environments, with persistent session state for multi-step workflows, file upload/download against S3, and CloudTrail audit trails built in.[1][2] Launched in preview in July 2025, the broader AgentCore platform reached general availability on October 13, 2025, adding VPC, AWS PrivateLink, and CloudFormation support — the enterprise envelope no sandbox startup can match.[3]
It is also the category's first major public security stress-test. In March 2026, BeyondTrust published research showing DNS queries escape the default "sandbox" network mode — enough to build command-and-control channels and exfiltrate data despite "no network access" — and Sonrai Security demonstrated execution-role credential extraction via the microVM metadata service.[4][5] The episode was logged in the OECD AI Incidents Monitor, and AWS's response — calling the DNS behavior intended functionality and steering customers to VPC mode — is now required reading for anyone buying isolation claims in this category.[6][7]
| Attribute | Value |
|---|---|
| Company | Amazon Web Services |
| Launched | Preview July 2025; GA October 13, 2025[3] |
| Platform | Component of Amazon Bedrock AgentCore (Runtime, Gateway, Identity, Browser, Memory, Observability)[3] |
| Adoption signal | AgentCore SDK passed 1M+ downloads by GA[3] |
| Open Source | No (service is closed; LangChain integration packages are MIT)[8] |
Product Overview
Code Interpreter is consumed as an API: an agent (or the framework wrapping it) starts a session, streams code into a pre-built runtime, and reads execution results back as streams. Pre-installed libraries cover common data work, and sessions hold state across calls so agents can iterate — write, run, read the error, fix, rerun.[2] Default execution time is 15 minutes, extendable to 8 hours for long-running jobs.[2]
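A minimal Python sketch of that start, run, read, stop loop follows. It assumes the boto3 `bedrock-agentcore` client with the operations `start_code_interpreter_session`, `invoke_code_interpreter`, and `stop_code_interpreter_session`, plus the `aws.codeinterpreter.v1` identifier. None of these names appear in the sources cited here, so check them against the current AWS API reference before relying on them. The helper `run_snippet` and the session name are hypothetical.

```python
DEFAULT_TIMEOUT_S = 15 * 60       # 15-minute default execution window
MAX_TIMEOUT_S = 8 * 60 * 60       # 8-hour ceiling for extended sessions


def run_snippet(client, code, timeout_s=DEFAULT_TIMEOUT_S,
                interpreter_id="aws.codeinterpreter.v1"):
    """Start a session, execute one Python snippet, and always stop the session.

    `client` is assumed to behave like boto3.client("bedrock-agentcore").
    """
    if not 0 < timeout_s <= MAX_TIMEOUT_S:
        raise ValueError(f"timeout_s must be between 1 and {MAX_TIMEOUT_S}")

    session = client.start_code_interpreter_session(
        codeInterpreterIdentifier=interpreter_id,
        name="report-sketch",              # hypothetical session name
        sessionTimeoutSeconds=timeout_s,
    )
    session_id = session["sessionId"]
    try:
        response = client.invoke_code_interpreter(
            codeInterpreterIdentifier=interpreter_id,
            sessionId=session_id,
            name="executeCode",
            arguments={"language": "python", "code": code},
        )
        # Results arrive as a stream of events; keep only the result payloads.
        return [event["result"] for event in response["stream"] if "result" in event]
    finally:
        # Stop the session even if execution raised, so it does not linger.
        client.stop_code_interpreter_session(
            codeInterpreterIdentifier=interpreter_id,
            sessionId=session_id,
        )
```

With real credentials, the call would look something like `run_snippet(boto3.client("bedrock-agentcore", region_name="us-west-2"), "print(2 + 2)")`, where the region is also an assumption. Passing the client in rather than creating it inside the helper keeps the sketch testable against a stub.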
Key Capabilities
| Capability | Description |
|---|---|
| Languages | Python, JavaScript, TypeScript pre-built runtimes with common libraries[2] |
| Sessions | Persistent state for multi-step workflows; 15-minute default, extendable to 8 hours[2] |
| File handling | Inline uploads to 100 MB; S3 transfers via terminal commands to 5 GB[2] |
| Network modes | Sandbox (isolated), public (controlled connectivity), and VPC[1][3] |
| Auditability | CloudTrail integration for compliance audit trails[2] |
| Framework integrations | Strands, LangChain, LangGraph, CrewAI native support[1] |
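The file-handling row implies a simple routing decision on the caller's side. The sketch below is one way to express it, assuming decimal units (1 MB = 10^6 bytes), which the cited documentation does not specify here. The function name and return labels are hypothetical.

```python
INLINE_LIMIT_BYTES = 100 * 1000**2   # 100 MB inline upload ceiling (decimal MB assumed)
S3_LIMIT_BYTES = 5 * 1000**3         # 5 GB S3 transfer ceiling (decimal GB assumed)


def choose_transfer_path(size_bytes: int) -> str:
    """Pick a transfer route for a file based on the documented size limits."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= INLINE_LIMIT_BYTES:
        return "inline"                      # send with the API call
    if size_bytes <= S3_LIMIT_BYTES:
        return "s3"                          # stage in S3, pull via terminal command
    return "exceeds_documented_limits"       # split the file or rethink the job


print(choose_transfer_path(25 * 1000**2))    # inline
print(choose_transfer_path(2 * 1000**3))     # s3
```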
The LangChain ecosystem ships a dedicated Deep Agents backend: langchain-agentcore-codeinterpreter provides AgentCoreSandbox, a SandboxBackendProtocol implementation that wraps Code Interpreter's microVM for Deep Agents command execution and file operations.[8]
Technical Architecture
Each session runs in a dedicated containerized environment with what AWS calls compute-based session isolation — full workload separation between sessions; the LangChain integration describes the substrate as "a secure, isolated MicroVM environment."[1][8] Sonrai's research confirmed the microVM detail from the inside: the sandbox exposes a MicroVM Metadata Service at 169.254.169.254, mirroring the EC2 IMDS path for execution-role credentials.[5]
Key Technical Details
| Aspect | Detail |
|---|---|
| Deployment | Fully managed AWS service; no infrastructure to run[1] |
| Isolation | Per-session microVM, compute-based session isolation[1][5] |
| Identity | IAM execution roles; credentials served via metadata service[5] |
| Integrations | S3, CloudTrail, AgentCore Gateway/Identity/Observability; Strands, LangChain, LangGraph, CrewAI[1] |
| Open Source | Closed service; MIT-licensed LangChain adapter[8] |
Strengths
- The AWS enterprise envelope — IAM, VPC, PrivateLink, CloudFormation, resource tagging, and CloudTrail audit trails arrived at GA; no sandbox startup offers this compliance surface.[3]
- Long-running sessions — 8-hour extendable execution windows suit data jobs that ephemeral-sandbox competitors time out on.[2]
- Gigabyte-scale data paths — referencing files in S3 enables processing at gigabyte scale without API limits; 5 GB transfers via terminal commands.[2]
- Idle time is free — billing counts actual CPU consumption per second; I/O wait and idle time cost nothing absent background processes, a meaningful saving for agent workloads that mostly wait on the model.[9]
- Framework-neutral — native Strands, LangChain, LangGraph, and CrewAI support, plus a purpose-built Deep Agents sandbox backend on PyPI.[1][8]
Cautions
- "Sandbox" network mode leaked DNS — BeyondTrust's Kinnaird McQuade showed outbound DNS queries escape the no-network configuration, enabling threat actors "to establish command-and-control channels and data exfiltration over DNS in certain scenarios, bypassing the expected network isolation controls" — bidirectional C2, interactive reverse shells, and S3 exfiltration where IAM allowed it. Disclosed to AWS September 2025, published March 17, 2026, rated CVSS 7.5 with no CVE assigned.[4][7]
- AWS called it intended functionality, not a defect — the official remedy is architectural: migrate to VPC mode for complete isolation and add a DNS firewall. Buyers should treat "sandbox" mode's isolation claims accordingly.[7][6]
- Credential extraction was a string filter away — Sonrai showed the metadata-service block only filtered literal strings like ://169.254.169.254; splitting the request across commands bypassed it, yielding execution-role credentials usable outside the sandbox against the AWS control plane. AWS framed the behavior as shared-responsibility and published credentials-management guidance.[5]
- Bedrock-shaped onboarding — Code Interpreter lives inside the AgentCore platform; teams not already on AWS inherit IAM roles, execution policies, and AgentCore concepts before the first sandbox runs.[2]
- No published cold-start numbers — AWS publishes no sandbox creation latency, while competitors compete openly on milliseconds; performance-sensitive buyers must benchmark themselves.[2]
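The metadata-service finding in the list above rests on a general weakness of literal-substring filters: they inspect the text of a command, not the address it resolves to once it runs. The toy model below illustrates that idea only. The filter function, the variable names, and the use of `string.Template` to stand in for shell variable expansion are all assumptions for illustration, not a reproduction of AWS's filter or of Sonrai's exact technique.

```python
from string import Template

BLOCKED_LITERAL = "://169.254.169.254"   # literal pattern cited in the research


def naive_filter_allows(text: str) -> bool:
    """Hypothetical filter: reject only when the blocked literal appears verbatim."""
    return BLOCKED_LITERAL not in text


# A direct reference contains the literal and is rejected.
direct_url = "http://169.254.169.254/"

# The same address assembled from two variables never contains the literal
# in the text the filter inspects...
templated_url = "http://$a.$b/"

# ...yet it expands to the identical address at execution time.
resolved_url = Template(templated_url).substitute(a="169.254", b="169.254")

print(naive_filter_allows(direct_url))      # False
print(naive_filter_allows(templated_url))   # True
print(resolved_url == direct_url)           # True
```

This is why the durable controls sit below the string level: network-layer blocking and execution roles scoped tightly enough that leaked credentials are worth little.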
What Developers Say
Community discussion targets the AgentCore platform broadly more than Code Interpreter specifically — there is no substantive Hacker News thread dedicated to the Code Interpreter component as of June 2026.[10] Platform-level sentiment splits:
"agentcore makes running strands frameworks pretty easy and relatively inexpensive" — an HN commenter[10]
"AgentCore runtime sucks and is expensive... nobody is solving for self-hosted managed infra for agents" — an HN commenter[10]
"an invisible token limit kicked in... the end to end ceremony of using the Bedrock Agent Core Starter toolkit... such an ordeal" — an HN commenter on AgentCore onboarding[10]
"Strands... and Agent Core... sometimes they even feel at odds with each other" — an HN commenter on AWS's overlapping agent stack[10]
The deepest community-adjacent commentary is the security research itself — BeyondTrust and Sonrai are, in effect, the category's first independent red team, and their published findings are more load-bearing than any forum thread.[4][5]
Pricing & Licensing
| Tier | Price | Includes |
|---|---|---|
| Consumption (only tier) | $0.0895 per vCPU-hour + $0.00945 per GB-hour memory | Per-second billing, 1-second minimum, 128 MB memory floor; I/O wait and idle time free[9] |
AWS's worked example: 10,000 monthly requests at 3 executions each (30,000 executions), 2-minute sessions with 60% I/O wait, 2 vCPU active and 4 GB memory, comes to roughly $109.40/month.[9] Network data transfer bills at standard EC2 rates.[9]
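The arithmetic behind that figure can be reproduced in a few lines. The sketch assumes vCPU is billed only for the non-waiting 40% of each session while memory is billed for the full 2 minutes. That split is an inference that matches the published total, not a statement of AWS's exact billing formula. The function and parameter names are hypothetical.

```python
VCPU_HOUR_USD = 0.0895    # per vCPU-hour
GB_HOUR_USD = 0.00945     # per GB-hour of memory


def monthly_cost(executions, session_seconds, io_wait_fraction, vcpus, memory_gb):
    """Estimate monthly spend: CPU on active time only, memory on full session time."""
    active_seconds = session_seconds * (1 - io_wait_fraction)
    vcpu_hours = executions * active_seconds * vcpus / 3600
    gb_hours = executions * session_seconds * memory_gb / 3600
    return vcpu_hours * VCPU_HOUR_USD + gb_hours * GB_HOUR_USD


# 30,000 executions, 120-second sessions, 60% I/O wait, 2 vCPU, 4 GB
total = monthly_cost(30_000, 120, 0.60, 2, 4)
print(f"${total:.2f}")    # prints $109.40
```

Under these assumptions the bill splits into 800 vCPU-hours ($71.60) and 4,000 GB-hours ($37.80), so memory accounts for roughly a third of the total even though its hourly rate is far lower.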
Licensing model: Proprietary managed service; no upfront commitments or minimum fees. The langchain-agentcore-codeinterpreter adapter is MIT.[1][8]
Hidden costs: Data transfer at EC2 rates, the surrounding AgentCore components (Runtime, Gateway, Memory each bill separately), and the engineering cost of VPC mode plus DNS firewalls if you take the security research seriously.[9][7]
Competitive Positioning
Direct Competitors
| Competitor | Differentiation |
|---|---|
| E2B | Sandbox-first startup, open-source core, ~150ms Firecracker starts; Code Interpreter counters with IAM/VPC/CloudTrail and 8-hour sessions |
| Modal | gVisor serverless compute with GPUs and a Python-native developer experience; Code Interpreter is narrower but native to the Bedrock agent stack |
| Northflank | Full platform (microVM/gVisor) spanning sandboxes and production workloads, BYOC across clouds; Code Interpreter is AWS-only and agent-tool-shaped |
| Daytona / Runloop | Persistent dev-environment sandboxes; Code Interpreter sessions are bounded at 8 hours and not designed as durable workspaces |
When to Choose AgentCore Code Interpreter Over Alternatives
- Choose Code Interpreter when: the org is already on AWS, compliance demands CloudTrail/VPC/IAM-native controls, or agents need 8-hour executions against gigabyte-scale S3 data.
- Choose E2B when: cold-start latency, open source, and multi-cloud neutrality matter more than the AWS envelope.
- Choose Modal when: workloads need GPUs or general serverless compute beyond agent tool-calls.
- Choose Northflank when: sandboxes should live next to production services on one platform, possibly in your own cloud account.
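Cold-start latency is the one criterion in this list that has to be measured rather than read from documentation, since AWS publishes no sandbox creation figures. A minimal timing harness might look like the sketch below. The `create_session` callable is a hypothetical stand-in for whatever call creates a sandbox on the platform under test, and the percentile uses a simple nearest-rank method.

```python
import math
import statistics
import time


def measure_cold_starts(create_session, runs=20):
    """Time repeated session creations and summarize latency in milliseconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    samples = []
    for _ in range(runs):
        started = time.perf_counter()
        create_session()                     # hypothetical: creates one sandbox
        samples.append((time.perf_counter() - started) * 1000)
    samples.sort()
    p95_index = math.ceil(0.95 * len(samples)) - 1   # nearest-rank 95th percentile
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


# Stand-in workload so the sketch runs without any cloud account.
print(measure_cold_starts(lambda: time.sleep(0.001), runs=5))
```

Running the same harness against each candidate from the same region and network path keeps the comparison fair. Sessions created during the test should also be stopped so they do not accrue charges.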
Ideal Customer Profile
Best fit:
- Enterprises already standardized on AWS and Bedrock that need agent code execution inside existing IAM/VPC governance
- Teams running Strands, LangChain/LangGraph, CrewAI, or Deep Agents who want a managed sandbox backend without new vendors
- Workloads with long executions (up to 8 hours) over large S3 datasets
- Compliance-driven buyers who need CloudTrail audit trails on every execution
Poor fit:
- Latency-sensitive products that need published sub-second cold starts (E2B, Zeroboot-class options)
- Multi-cloud or cloud-neutral architectures
- Teams wanting open-source, self-hostable sandboxes they can audit
- Anyone relying on default "sandbox" network mode as a hard security boundary without VPC mode and DNS controls
Viability Assessment
| Factor | Assessment |
|---|---|
| Financial Health | Amazon — no vendor-viability risk in the conventional sense |
| Market Position | The default sandbox for AWS-committed enterprises; 1M+ AgentCore SDK downloads by GA[3] |
| Innovation Pace | Preview to GA in ~3 months; AgentCore added Policy and Evaluations previews by December 2025[3] |
| Community/Ecosystem | Strands/LangChain/LangGraph/CrewAI integrations plus a Deep Agents backend; thin independent community discussion[8][10] |
| Long-term Outlook | Secure as an AWS platform component; the open question is trust in isolation claims, not survival |
Viability here is not the question — Amazon will run this service indefinitely. The question is whether the security posture matches the marketing: the 2026 research showed the gap between "isolated sandbox" and what the default configuration actually guaranteed, and AWS's "intended functionality" response shifts the hardening burden onto customers.[4][5]
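Part of that customer-side hardening is keeping the execution role narrow, so that credentials extracted from a session can do little. The sketch below builds a standard IAM policy document granting read access to a single S3 prefix and includes a small check for broad grants. The bucket, prefix, and function names are hypothetical, and a real role would need whatever additional permissions the workload requires.

```python
import json


def least_privilege_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy allowing only s3:GetObject under one bucket prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadInputObjectsOnly",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            }
        ],
    }


def has_wildcard_grants(policy: dict) -> bool:
    """Flag Allow statements with a wildcard action or an unrestricted resource."""
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        resources = statement.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        if any("*" in action for action in actions) or "*" in resources:
            return True
    return False


policy = least_privilege_policy("example-agent-inputs", "reports/")  # hypothetical names
print(json.dumps(policy, indent=2))
print(has_wildcard_grants(policy))   # False
```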
Bottom Line
AgentCore Code Interpreter is the enterprise gravity-well of the AI agent sandbox category: nothing else pairs managed Python/JS/TS execution with IAM, VPC, PrivateLink, CloudTrail, 8-hour sessions, and idle-time-free per-second billing. But it is also the category's cautionary tale — the first managed agent sandbox to be publicly stress-tested, and the default "sandbox" network mode failed that test on DNS exfiltration and metadata-credential paths. Run it in VPC mode with DNS filtering and least-privilege execution roles, and it is a strong default for AWS shops; run it on defaults and you are trusting a boundary that independent researchers have already walked through.
Recommended for: AWS-committed enterprises needing compliant, auditable, long-running agent code execution wired into Bedrock, Strands, or the LangChain ecosystem.
Not recommended for: Cloud-neutral teams, latency-benchmark buyers, open-source-required environments, or anyone unwilling to do the VPC/DNS hardening the 2026 research made table stakes.
Outlook: Amazon's distribution makes this the volume leader for enterprise agent sandboxes; the BeyondTrust/Sonrai episode will be remembered as the moment the category's isolation claims started getting audited.
Research by Ry Walker Research
Sources
- [1] AWS ML Blog: Introducing the Amazon Bedrock AgentCore Code Interpreter
- [2] AWS Docs: Execute code using Amazon Bedrock AgentCore Code Interpreter
- [3] AWS What's New: Amazon Bedrock AgentCore is now generally available
- [4] BeyondTrust: Pwning AI Code Interpreters in AWS Bedrock AgentCore
- [5] Sonrai Security: Sandboxed to Compromised — Credential Exfiltration Paths in AWS Code Interpreters
- [6] OECD AI Incidents Monitor: AWS Bedrock AgentCore sandbox bypass (2026-03-16)
- [7] The Hacker News: AI Flaws in Amazon Bedrock, LangSmith, and SGLang
- [8] PyPI: langchain-agentcore-codeinterpreter (Deep Agents sandbox backend)
- [9] Amazon Bedrock AgentCore Pricing
- [10] AgentCore mentions on Hacker News (Algolia search)