AWS AgentCore Code Interpreter | Ry Walker Research

Key takeaways

Amazon's entry in the AI agent sandbox category: fully managed, microVM-isolated code execution (Python, JavaScript, TypeScript) inside Bedrock AgentCore, generally available since October 13, 2025
Enterprise posture is the differentiator — sessions extendable to 8 hours, 5 GB file transfer via S3, CloudTrail audit trails, VPC/PrivateLink support, IAM-native access control
March 2026 security research (BeyondTrust, Sonrai) showed DNS-based exfiltration escaping "sandbox" network mode and credential-extraction paths via the microVM metadata service — AWS called the DNS behavior intended functionality and pointed customers to VPC mode and DNS firewalls
Consumption pricing at $0.0895/vCPU-hour and $0.00945/GB-hour, billed per second with idle and I/O wait free — no per-sandbox or per-seat fees

FAQ

What is AWS AgentCore Code Interpreter?

A fully managed AWS service that lets AI agents write and execute Python, JavaScript, and TypeScript in isolated microVM sandbox environments, with session state, S3 file access, and CloudTrail logging, as part of the Amazon Bedrock AgentCore platform.

How much does AgentCore Code Interpreter cost?

Consumption-based: $0.0895 per vCPU-hour and $0.00945 per GB-hour of memory, billed per second with a 1-second minimum; I/O wait and idle time are free. AWS's own example prices 30,000 monthly executions at about $109.

Is AgentCore Code Interpreter secure?

Each session runs in a dedicated microVM with compute-based isolation, but March 2026 research by BeyondTrust and Sonrai Security showed that DNS queries escape the default "sandbox" network mode and that metadata-service credentials can be extracted; AWS recommends VPC mode, DNS firewalls, and least-privilege execution roles.

How is AgentCore Code Interpreter different from E2B?

E2B is a sandbox-first startup with ~150ms Firecracker cold starts and an open-source core; AgentCore Code Interpreter is a closed AWS service that trades startup-speed benchmarks for IAM, VPC, CloudTrail, and the rest of the AWS enterprise envelope.

Executive Summary

AgentCore Code Interpreter is Amazon's answer to the AI agent sandbox question: a fully managed service inside Amazon Bedrock AgentCore that lets agents write, execute, and debug Python, JavaScript, and TypeScript in isolated environments, with persistent session state for multi-step workflows, file upload/download against S3, and CloudTrail audit trails built in.^[1]^[2] Launched in preview in July 2025, the broader AgentCore platform reached general availability on October 13, 2025, adding VPC, AWS PrivateLink, and CloudFormation support — the enterprise envelope no sandbox startup can match.^[3]

It is also the category's first major public security stress-test. In March 2026, BeyondTrust published research showing DNS queries escape the default "sandbox" network mode — enough to build command-and-control channels and exfiltrate data despite "no network access" — and Sonrai Security demonstrated execution-role credential extraction via the microVM metadata service.^[4]^[5] The episode was logged in the OECD AI Incidents Monitor, and AWS's response — calling the DNS behavior intended functionality and steering customers to VPC mode — is now required reading for anyone buying isolation claims in this category.^[6]^[7]
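To make the DNS finding concrete: DNS tunneling generally works by packing data into the labels of the hostnames being queried, so defenders often look for unusually long or random-looking labels. The sketch below is an illustrative heuristic only. The function names and both thresholds are assumptions chosen for the example; it does not describe BeyondTrust's technique or any AWS DNS firewall rule.

```python
import math
from collections import Counter

MAX_LABEL_LEN = 40       # assumed threshold, not an AWS or BeyondTrust value
MIN_ENTROPY_BITS = 3.5   # assumed threshold for "looks encoded"
MIN_LEN_FOR_ENTROPY = 20  # skip the entropy check on short labels


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character for a non-empty string."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_dns_tunnel(qname: str) -> bool:
    """Flag a query name whose labels are very long or look like encoded data."""
    labels = [label for label in qname.rstrip(".").split(".") if label]
    for label in labels:
        if len(label) > MAX_LABEL_LEN:
            return True
        if len(label) >= MIN_LEN_FOR_ENTROPY and shannon_entropy(label) >= MIN_ENTROPY_BITS:
            return True
    return False
```

A heuristic like this is a review aid for DNS logs, not a boundary; the architectural remedy AWS points to is still VPC mode plus a DNS firewall.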

Attribute	Value
Company	Amazon Web Services
Launched	Preview July 2025; GA October 13, 2025^[3]
Platform	Component of Amazon Bedrock AgentCore (Runtime, Gateway, Identity, Browser, Memory, Observability)^[3]
Adoption signal	AgentCore SDK passed 1M+ downloads by GA^[3]
Open Source	No (service is closed; LangChain integration packages are MIT)^[8]

Product Overview

Code Interpreter is consumed as an API: an agent (or the framework wrapping it) starts a session, streams code into a pre-built runtime, and reads execution results back as streams. Pre-installed libraries cover common data work, and sessions hold state across calls so agents can iterate — write, run, read the error, fix, rerun.^[2] Default execution time is 15 minutes, extendable to 8 hours for long-running jobs.^[2]
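The start, execute, read loop described above can be sketched in Python. This is a sketch under stated assumptions, not AWS's reference code: the operation names (`start_code_interpreter_session`, `invoke_code_interpreter`, `stop_code_interpreter_session`), the `aws.codeinterpreter.v1` identifier, and the response shape follow the boto3 `bedrock-agentcore` client as understood at the time of writing and should be verified against current AWS documentation. The helper name `run_snippet` and the session name are hypothetical.

```python
from typing import Any

DEFAULT_TIMEOUT_S = 15 * 60    # 15-minute default cited above
MAX_TIMEOUT_S = 8 * 60 * 60    # 8-hour ceiling cited above


def run_snippet(client: Any, code: str, timeout_s: int = DEFAULT_TIMEOUT_S,
                interpreter_id: str = "aws.codeinterpreter.v1") -> list:
    """Start a session, run one snippet, collect streamed results, stop the session."""
    if not 0 < timeout_s <= MAX_TIMEOUT_S:
        raise ValueError(f"timeout_s must be in 1..{MAX_TIMEOUT_S}")
    # Assumed call shape; check parameter names against the boto3 docs.
    session = client.start_code_interpreter_session(
        codeInterpreterIdentifier=interpreter_id,
        name="sketch-session",
        sessionTimeoutSeconds=timeout_s,
    )
    session_id = session["sessionId"]
    try:
        response = client.invoke_code_interpreter(
            codeInterpreterIdentifier=interpreter_id,
            sessionId=session_id,
            name="executeCode",
            arguments={"language": "python", "code": code},
        )
        # Results arrive as an event stream; keep only the result payloads.
        return [event["result"] for event in response["stream"] if "result" in event]
    finally:
        # Always release the session so it does not linger until timeout.
        client.stop_code_interpreter_session(
            codeInterpreterIdentifier=interpreter_id,
            sessionId=session_id,
        )
```

With real credentials, `client` would be something like `boto3.client("bedrock-agentcore", region_name=...)`. For the iterate-and-fix workflow the text describes, the session would be kept open across several `invoke_code_interpreter` calls rather than stopped after each one.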

Key Capabilities

Capability	Description
Languages	Python, JavaScript, TypeScript pre-built runtimes with common libraries^[2]
Sessions	Persistent state for multi-step workflows; 15-minute default, extendable to 8 hours^[2]
File handling	Inline uploads to 100 MB; S3 transfers via terminal commands to 5 GB^[2]
Network modes	Sandbox (no general network access, though DNS queries were shown to escape it^[4]), public (controlled connectivity), and VPC^[1]^[3]
Auditability	CloudTrail integration for compliance audit trails^[2]
Framework integrations	Strands, LangChain, LangGraph, CrewAI native support^[1]
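The two file-handling limits in the table imply a simple routing decision for an agent framework. The sketch below is an assumption-laden illustration: the source does not say whether the limits are decimal or binary units (decimal is assumed here), and the function name and return labels are hypothetical.

```python
INLINE_LIMIT_BYTES = 100 * 1000**2   # 100 MB inline cap (decimal MB assumed)
S3_LIMIT_BYTES = 5 * 1000**3         # 5 GB S3 transfer cap (decimal GB assumed)


def choose_transfer_path(size_bytes: int) -> str:
    """Pick how a file of the given size would reach the sandbox."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= INLINE_LIMIT_BYTES:
        return "inline"      # small enough to upload with the API call
    if size_bytes <= S3_LIMIT_BYTES:
        return "s3"          # stage in S3 and copy via terminal commands
    return "too_large"       # exceeds both documented limits
```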

The LangChain ecosystem ships a dedicated Deep Agents backend: langchain-agentcore-codeinterpreter provides AgentCoreSandbox, a SandboxBackendProtocol implementation that wraps Code Interpreter's microVM for Deep Agents command execution and file operations.^[8]

Technical Architecture

Each session runs in a dedicated containerized environment with what AWS calls compute-based session isolation — full workload separation between sessions; the LangChain integration describes the substrate as "a secure, isolated MicroVM environment."^[1]^[8] Sonrai's research confirmed the microVM detail from the inside: the sandbox exposes a MicroVM Metadata Service at 169.254.169.254, mirroring the EC2 IMDS path for execution-role credentials.^[5]

Key Technical Details

Aspect	Detail
Deployment	Fully managed AWS service; no infrastructure to run^[1]
Isolation	Per-session microVM, compute-based session isolation^[1]^[5]
Identity	IAM execution roles; credentials served via metadata service^[5]
Integrations	S3, CloudTrail, AgentCore Gateway/Identity/Observability; Strands, LangChain, LangGraph, CrewAI^[1]
Open Source	Closed service; MIT-licensed LangChain adapter^[8]

Strengths

The AWS enterprise envelope — IAM-native access control and CloudTrail audit trails, plus the VPC, PrivateLink, CloudFormation, and resource-tagging support that arrived at GA; no sandbox startup offers this compliance surface.^[3]
Long-running sessions — 8-hour extendable execution windows suit data jobs that ephemeral-sandbox competitors time out on.^[2]
Gigabyte-scale data paths — referencing files in S3 enables processing at gigabyte scale without API limits; 5 GB transfers via terminal commands.^[2]
Idle time is free — CPU is billed per second on actual consumption, so I/O wait and idle time incur no CPU charge absent background processes (memory is metered separately per GB-hour), a meaningful saving for agent workloads that mostly wait on the model.^[9]
Framework-neutral — native Strands, LangChain, LangGraph, and CrewAI support, plus a purpose-built Deep Agents sandbox backend on PyPI.^[1]^[8]

Cautions

"Sandbox" network mode leaked DNS — BeyondTrust's Kinnaird McQuade showed outbound DNS queries escape the no-network configuration, enabling threat actors "to establish command-and-control channels and data exfiltration over DNS in certain scenarios, bypassing the expected network isolation controls" — bidirectional C2, interactive reverse shells, and S3 exfiltration where IAM allowed it. Disclosed to AWS September 2025, published March 17, 2026, rated CVSS 7.5 with no CVE assigned.^[4]^[7]
AWS called it intended functionality, not a defect — the official remedy is architectural: migrate to VPC mode for complete isolation and add a DNS firewall. Buyers should treat "sandbox" mode's isolation claims accordingly.^[7]^[6]
Credential extraction was a string filter away — Sonrai showed the metadata-service block filtered only literal strings such as ://169.254.169.254; splitting the request across commands bypassed it, yielding execution-role credentials usable outside the sandbox against the AWS control plane. AWS framed the behavior as a shared-responsibility matter and published credentials-management guidance.^[5]
Bedrock-shaped onboarding — Code Interpreter lives inside the AgentCore platform; teams not already on AWS inherit IAM roles, execution policies, and AgentCore concepts before the first sandbox runs.^[2]
No published cold-start numbers — AWS publishes no sandbox-creation latency figures, while competitors advertise theirs in milliseconds; performance-sensitive buyers must run their own benchmarks.^[2]
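The credential-extraction caution turns on a general weakness of literal-string blocklists, which a toy model makes visible. The filter below is hypothetical and only modeled on Sonrai's description; it is not AWS's implementation, and the command strings are illustrative.

```python
# Hypothetical blocklist modeled on the literal described in the research.
BLOCKED_SUBSTRINGS = ("://169.254.169.254",)


def naive_filter_allows(command: str) -> bool:
    """Return True when no blocked literal appears in the command text."""
    return not any(bad in command for bad in BLOCKED_SUBSTRINGS)


# A direct request contains the blocked literal and is rejected.
direct = "curl http://169.254.169.254/"

# The same destination assembled from two shell variables: the blocked
# literal never appears in any single command the filter inspects.
split = ['A="http://169.254."', 'B="169.254/"', 'curl "$A$B"']

print(naive_filter_allows(direct))                   # False
print(all(naive_filter_allows(c) for c in split))    # True
```

Text matching inspects what a command looks like, not where its traffic goes, which is why the durable controls are network-level isolation and tightly scoped execution roles.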

What Developers Say

Community discussion targets the AgentCore platform broadly more than Code Interpreter specifically — there is no substantive Hacker News thread dedicated to the Code Interpreter component as of June 2026.^[10] Platform-level sentiment splits:

"agentcore makes running strands frameworks pretty easy and relatively inexpensive" — an HN commenter^[10]

"AgentCore runtime sucks and is expensive... nobody is solving for self-hosted managed infra for agents" — an HN commenter^[10]

"an invisible token limit kicked in... the end to end ceremony of using the Bedrock Agent Core Starter toolkit... such an ordeal" — an HN commenter on AgentCore onboarding^[10]

"Strands... and Agent Core... sometimes they even feel at odds with each other" — an HN commenter on AWS's overlapping agent stack^[10]

The deepest community-adjacent commentary is the security research itself — BeyondTrust and Sonrai are, in effect, the category's first independent red team, and their published findings are more load-bearing than any forum thread.^[4]^[5]

Pricing & Licensing

Tier	Price	Includes
Consumption (only tier)	$0.0895 per vCPU-hour + $0.00945 per GB-hour memory	Per-second billing, 1-second minimum, 128 MB memory floor; I/O wait and idle time free^[9]

AWS's worked example: 10,000 monthly requests at 3 executions each (30,000 executions), 2-minute sessions with 60% I/O wait, 2 vCPU active and 4 GB memory, comes to roughly $109.40/month.^[9] Network data transfer bills at standard EC2 rates.^[9]
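The arithmetic behind that figure can be reproduced in a few lines. The sketch assumes CPU is charged only for the non-waiting share of each session while memory is charged for the full session; that assumption is an inference, but it yields exactly the $109.40 quoted above. It ignores the 1-second minimum, the 128 MB floor, and data transfer. The function name is hypothetical.

```python
VCPU_HOUR_USD = 0.0895    # per vCPU-hour, from the pricing table
GB_HOUR_USD = 0.00945     # per GB-hour of memory, from the pricing table


def monthly_cost(executions: int, session_seconds: float, io_wait_fraction: float,
                 vcpus: float, memory_gb: float) -> tuple:
    """Return (cpu_usd, memory_usd, total_usd) for a month of executions."""
    # Assumption: CPU accrues only while the session is not in I/O wait.
    active_seconds = session_seconds * (1 - io_wait_fraction)
    cpu_hours = executions * active_seconds * vcpus / 3600
    # Assumption: memory accrues for the whole session duration.
    memory_gb_hours = executions * session_seconds * memory_gb / 3600
    cpu_usd = cpu_hours * VCPU_HOUR_USD
    memory_usd = memory_gb_hours * GB_HOUR_USD
    return cpu_usd, memory_usd, cpu_usd + memory_usd


cpu, mem, total = monthly_cost(30_000, 120, 0.60, 2, 4)
print(f"CPU ${cpu:.2f} + memory ${mem:.2f} = ${total:.2f}")
# CPU $71.60 + memory $37.80 = $109.40
```

The split shows that roughly a third of the bill in this scenario is memory, which keeps accruing while the CPU sits idle.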

Licensing model: Proprietary managed service; no upfront commitments or minimum fees. The langchain-agentcore-codeinterpreter adapter is MIT.^[1]^[8]

Hidden costs: Data transfer at EC2 rates, the surrounding AgentCore components (Runtime, Gateway, Memory each bill separately), and the engineering cost of VPC mode plus DNS firewalls if you take the security research seriously.^[9]^[7]

Competitive Positioning

Direct Competitors

Competitor	Differentiation
E2B	Sandbox-first startup, open-source core, ~150ms Firecracker starts; Code Interpreter counters with IAM/VPC/CloudTrail and 8-hour sessions
Modal	gVisor serverless compute with GPUs and a Python-native developer experience; Code Interpreter is narrower but native to the Bedrock agent stack
Northflank	Full platform (microVM/gVisor) spanning sandboxes and production workloads, BYOC across clouds; Code Interpreter is AWS-only and agent-tool-shaped
Daytona / Runloop	Persistent dev-environment sandboxes; Code Interpreter sessions are bounded at 8 hours and not designed as durable workspaces

When to Choose AgentCore Code Interpreter Over Alternatives

Choose Code Interpreter when: the org is already on AWS, compliance demands CloudTrail/VPC/IAM-native controls, or agents need 8-hour executions against gigabyte-scale S3 data.
Choose E2B when: cold-start latency, open source, and multi-cloud neutrality matter more than the AWS envelope.
Choose Modal when: workloads need GPUs or general serverless compute beyond agent tool-calls.
Choose Northflank when: sandboxes should live next to production services on one platform, possibly in your own cloud account.

Ideal Customer Profile

Best fit:

Enterprises already standardized on AWS and Bedrock that need agent code execution inside existing IAM/VPC governance
Teams running Strands, LangChain/LangGraph, CrewAI, or Deep Agents who want a managed sandbox backend without new vendors
Workloads with long executions (up to 8 hours) over large S3 datasets
Compliance-driven buyers who need CloudTrail audit trails on every execution

Poor fit:

Latency-sensitive products that need published sub-second cold starts (E2B, Zeroboot-class options)
Multi-cloud or cloud-neutral architectures
Teams wanting open-source, self-hostable sandboxes they can audit
Anyone relying on default "sandbox" network mode as a hard security boundary without VPC mode and DNS controls

Viability Assessment

Factor	Assessment
Financial Health	Amazon — no vendor-viability risk in the conventional sense
Market Position	The default sandbox for AWS-committed enterprises; 1M+ AgentCore SDK downloads by GA^[3]
Innovation Pace	Preview to GA in ~3 months; AgentCore added Policy and Evaluations previews by December 2025^[3]
Community/Ecosystem	Strands/LangChain/LangGraph/CrewAI integrations plus a Deep Agents backend; thin independent community discussion^[8]^[10]
Long-term Outlook	Secure as an AWS platform component; the open question is trust in isolation claims, not survival

Viability here is not the question — Amazon will run this service indefinitely. The question is whether the security posture matches the marketing: the 2026 research showed the gap between "isolated sandbox" and what the default configuration actually guaranteed, and AWS's "intended functionality" response shifts the hardening burden onto customers.^[4]^[5]

Bottom Line

AgentCore Code Interpreter is the enterprise gravity-well of the AI agent sandbox category: nothing else pairs managed Python/JS/TS execution with IAM, VPC, PrivateLink, CloudTrail, 8-hour sessions, and idle-time-free per-second billing. But it is also the category's cautionary tale — the first managed agent sandbox to be publicly stress-tested, and the default "sandbox" network mode failed that test on DNS exfiltration and metadata-credential paths. Run it in VPC mode with DNS filtering and least-privilege execution roles, and it is a strong default for AWS shops; run it on defaults and you are trusting a boundary that independent researchers have already walked through.
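As an illustration of what "least-privilege execution role" can mean in practice, the sketch below builds an IAM policy document that allows read-only access to a single S3 prefix and nothing else. The bucket name, prefix, and function name are hypothetical, and this is not AWS's published guidance; the permissions a real role needs depend on the workload.

```python
import json


def build_execution_policy(bucket: str, prefix: str) -> dict:
    """Read-only access to one S3 prefix; no other permissions are granted."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadInputObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            },
            {
                "Sid": "ListInputPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                # Restrict listing to keys under the chosen prefix.
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


# Hypothetical bucket and prefix, for illustration only.
print(json.dumps(build_execution_policy("example-agent-data", "inputs"), indent=2))
```

Because credentials extracted from the metadata service work outside the sandbox, the scope of this policy is the effective blast radius of a compromised session.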

Recommended for: AWS-committed enterprises needing compliant, auditable, long-running agent code execution wired into Bedrock, Strands, or the LangChain ecosystem.

Not recommended for: Cloud-neutral teams, latency-benchmark buyers, open-source-required environments, or anyone unwilling to do the VPC/DNS hardening the 2026 research made table stakes.

Outlook: Amazon's distribution positions this to become the volume leader for enterprise agent sandboxes; the BeyondTrust/Sonrai episode will likely be remembered as the moment the category's isolation claims started getting audited.

Research by Ry Walker Research

Sources