Key takeaways
- Containers share the host kernel, so a single kernel-level escape can compromise the host and every other container on it. These tools add VM-level isolation while keeping a container-like developer experience.
- Three architectural approaches: microVMs (Firecracker, Cloud Hypervisor), userspace kernels (gVisor), and unikernels (Unikraft) — each with different tradeoffs on compatibility, performance, and security
- AI agent sandboxing is the new killer use case — running untrusted LLM-generated code requires hard VM boundaries, not shared-kernel containers
- Developer experience ranges from low-level VMM APIs (Firecracker) to Docker-like CLIs (Hypeman) to Kubernetes-native runtimes (Kata Containers)
FAQ
Why aren't containers enough for isolation?
Containers share the host kernel. A kernel exploit in one container can compromise the entire host and all other containers on it. VMs provide a hardware-enforced boundary — each workload gets its own kernel, so a compromise is contained.
What is the difference between a microVM and a unikernel?
MicroVMs (Firecracker, Cloud Hypervisor) run a full Linux kernel in a lightweight VM with a minimal device model. Unikernels (Unikraft) compile the application and only the needed OS components into a single binary — smaller footprint but less compatibility.
Which tool should I use for AI agent sandboxing?
For the quickest start, Hypeman gives you Docker-like UX with VM isolation. For Kubernetes environments, Kata Containers. For maximum density at scale, Firecracker. For defense-in-depth without VMs, gVisor.
How fast do microVMs boot?
Firecracker boots in ~125ms. Cloud Hypervisor is similar. Hypeman adds standby/restore for near-instant resume from snapshots. Traditional VMs take seconds to minutes.
Executive Summary
Containers revolutionized deployment but they share a fatal flaw: the host kernel is a shared attack surface. One container escape compromises everything. As workloads get more adversarial — especially AI agents executing untrusted code — the industry needs VM-level isolation with container-level developer experience.
This comparison covers six tools solving that problem, each with a different architectural approach:
| Tool | Stars | Language | License | Approach | DX Level |
|---|---|---|---|---|---|
| Firecracker | 33K | Rust | Apache-2.0 | MicroVM VMM | Low (API) |
| gVisor | 18K | Go | Apache-2.0 | Userspace kernel | Medium (OCI) |
| Kata Containers | 7.6K | Rust | Apache-2.0 | OCI runtime + VM | Medium (K8s) |
| Cloud Hypervisor | 5.4K | Rust | Apache-2.0/BSD | Modern VMM | Low (API) |
| Unikraft | ~4K | C | BSD-3 | Unikernel | Low (custom build) |
| Hypeman | 62 | Go | MIT | Multi-hypervisor CLI | High (Docker-like) |
Key finding: The market is splitting by use case. Firecracker dominates serverless (AWS Lambda/Fargate). Kata Containers owns Kubernetes. gVisor provides defense-in-depth without full VMs. And newcomers like Hypeman are targeting the developer experience gap — making VM isolation as easy as docker run.
The Problem: Shared Kernel = Shared Risk
Standard containers (Docker, containerd) use Linux namespaces and cgroups for isolation. This is process-level isolation, not machine-level. Every container on a host shares the same kernel, and the Linux kernel has ~30 million lines of code with a large syscall surface.
What goes wrong:
- Kernel exploits (CVE-2024-1086, CVE-2022-0185) can escape containers entirely
- Privileged containers or misconfigurations weaken isolation further
- seccomp and AppArmor reduce attack surface but don't eliminate it
Why it matters now: AI agents running LLM-generated code are the new threat model. You cannot trust code written by a model — it might install packages, open network connections, or exploit kernel bugs. VM boundaries provide hardware-enforced isolation that containers cannot.
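The shared-kernel point is easy to check by hand. The sketch below is a minimal illustration, assuming a Linux system with /proc mounted; `kernel_release` and `namespace_ids` are hypothetical helper names, not part of any tool covered here. Run it on the host and again inside a standard container: the kernel release should be identical, while some namespace identifiers typically differ. Inside a microVM guest, the reported kernel is the guest's own.

```python
import os
import platform


def kernel_release() -> str:
    """Return the release string of the kernel this process is running on."""
    return platform.release()


def namespace_ids(pid: str = "self") -> dict:
    """Map namespace name -> identifier for a process, read from /proc.

    Returns an empty dict where /proc/<pid>/ns is unavailable (e.g. on macOS).
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return ids
    for name in names:
        try:
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # some entries may be unreadable without privileges
    return ids


if __name__ == "__main__":
    print("kernel:", kernel_release())
    for name, ident in namespace_ids().items():
        print(f"{name:>20} -> {ident}")
```

Matching kernel releases with differing namespace identifiers is what process-level isolation looks like from the inside: a separate view of the system, served by the same kernel.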
Architectural Approaches
MicroVMs (Firecracker, Cloud Hypervisor)
MicroVMs strip traditional VMs down to the essentials. Firecracker emulates only a handful of devices (virtio-net, virtio-block, virtio-vsock, a serial console, and a minimal keyboard controller, versus hundreds in QEMU), boots in ~125ms, and uses ~5MB of memory overhead per VM. Cloud Hypervisor takes a similar approach with broader hardware support (Intel, ARM, and experimental RISC-V).
Tradeoffs: Full Linux compatibility inside the VM. Hardware-enforced isolation. But you need to build your own orchestration — these are VMMs, not platforms.
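To make "VMMs, not platforms" concrete, here is a rough sketch of what driving Firecracker looks like: a sequence of JSON requests to its API over a Unix domain socket. The endpoint paths and field names follow Firecracker's getting-started documentation as best I know it, so check them against the version you run. The kernel and rootfs paths, the socket path, and the helper names (`boot_requests`, `send_all`, `UnixHTTPConnection`) are assumptions made for illustration.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path: str):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)


def boot_requests(kernel_path, rootfs_path, vcpus=1, mem_mib=128):
    """Return the ordered (method, path, body) calls that configure and start one microVM."""
    return [
        ("PUT", "/machine-config", {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        ("PUT", "/boot-source", {
            "kernel_image_path": kernel_path,
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        }),
        ("PUT", "/drives/rootfs", {
            "drive_id": "rootfs",
            "path_to_host": rootfs_path,
            "is_root_device": True,
            "is_read_only": False,
        }),
        ("PUT", "/actions", {"action_type": "InstanceStart"}),
    ]


def send_all(socket_path, requests):
    """Send each request in order; raise on the first non-2xx response."""
    for method, path, body in requests:
        conn = UnixHTTPConnection(socket_path)
        try:
            conn.request(method, path, body=json.dumps(body),
                         headers={"Content-Type": "application/json"})
            resp = conn.getresponse()
            payload = resp.read().decode()
            if not 200 <= resp.status < 300:
                raise RuntimeError(f"{method} {path} failed: {resp.status} {payload}")
        finally:
            conn.close()
```

With a Firecracker process already listening on a socket, `send_all("/tmp/firecracker.socket", boot_requests("vmlinux", "rootfs.ext4"))` would boot one guest. Networking, image preparation, lifecycle management, and cleanup are all left to the caller, which is the orchestration gap the tradeoff describes.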
Userspace Kernel (gVisor)
gVisor intercepts application syscalls and re-implements them in a userspace kernel called Sentry. It doesn't run a full VM — instead, it acts as a compatibility layer that limits the application's access to the host kernel.
Tradeoffs: Lower overhead than full VMs. Easy to deploy as an OCI runtime. But not all syscalls are implemented, so some applications fail or need workarounds. Not a true VM boundary: an attacker who exploits a bug in Sentry is running in a host process and can go on to attack the host kernel, rather than being held behind a hypervisor.
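The interception idea can be modeled in a few lines. This is a deliberately simplified toy, not gVisor's code: `ToySentry` and the call names `write_file` and `read_file` are invented for illustration and are not real Linux syscalls. It shows the two behaviors that matter for the tradeoffs: calls the userspace kernel implements are served from its own state, and calls it does not implement fail with ENOSYS instead of reaching the host.

```python
import errno


class ToySentry:
    """Toy model of a userspace kernel: it answers a small set of calls itself
    and rejects everything else, so the host kernel never sees the request."""

    def __init__(self):
        self._files = {}  # in-memory stand-in for a filesystem: path -> bytes
        self._handlers = {
            "getpid": self._getpid,
            "write_file": self._write_file,
            "read_file": self._read_file,
        }

    def syscall(self, name, *args):
        """Dispatch an intercepted call; unimplemented calls return -ENOSYS."""
        handler = self._handlers.get(name)
        if handler is None:
            return -errno.ENOSYS
        return handler(*args)

    def _getpid(self):
        return 1  # the sandboxed process sees its own PID space

    def _write_file(self, path, data):
        self._files[path] = bytes(data)
        return len(data)

    def _read_file(self, path):
        if path not in self._files:
            return -errno.ENOENT
        return self._files[path]


if __name__ == "__main__":
    sentry = ToySentry()
    print(sentry.syscall("write_file", "/tmp/a", b"hi"))
    print(sentry.syscall("read_file", "/tmp/a"))
    print(sentry.syscall("unknown_call"))  # negative errno: not implemented
```

The compatibility gap in the tradeoffs corresponds to the `-ENOSYS` branch: every call missing from the handler table is an application behavior that may break inside the sandbox.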
Unikernels (Unikraft)
Unikraft compiles only the OS components an application needs into a single bootable image. No shell, no unnecessary drivers, no multi-user support. The result is tiny images (sometimes less than 1MB) with minimal attack surface.
Tradeoffs: Extremely small footprint and fast boot. But building unikernel images requires specialized tooling, debugging is harder, and you lose general-purpose Linux compatibility.
Multi-Hypervisor Orchestration (Hypeman)
Hypeman takes a different approach — instead of being a VMM itself, it sits above multiple hypervisors (Firecracker, Cloud Hypervisor, QEMU, Apple Virtualization.framework) and provides a Docker-like CLI. You get VM isolation with hypeman pull and hypeman run.
Tradeoffs: The best developer experience of the six tools. But it's early (62 stars) and adds a layer of abstraction over the underlying VMMs.
Detailed Comparison
Performance
| Metric | Firecracker | gVisor | Kata | Cloud Hypervisor | Hypeman | Unikraft |
|---|---|---|---|---|---|---|
| Boot time | ~125ms | N/A (process) | ~500ms | ~100ms | Varies by backend | ~1ms |
| Memory overhead | ~5MB | ~50MB | ~40MB | ~5MB | Varies by backend | |
| Density (per host) | Thousands | Hundreds | Hundreds | Thousands | Hundreds | Thousands |
| Syscall compat | Full Linux | ~70-80% | Full Linux | Full Linux | Full Linux | App-specific |
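The density row follows largely from the memory-overhead row. As a back-of-the-envelope sketch, assuming memory is the only limit (CPU, I/O, and network are ignored) and using the approximate overhead figures from the table, the number of sandboxes per host can be estimated as usable RAM divided by workload plus runtime overhead. The host size, the reserved amount, and the function name are assumptions chosen for the example.

```python
# Approximate per-sandbox memory overhead in MB, copied from the table above.
OVERHEAD_MB = {
    "Firecracker": 5,
    "gVisor": 50,
    "Kata": 40,
    "Cloud Hypervisor": 5,
}


def max_sandboxes(host_ram_gb, workload_mb, tool, reserved_gb=4):
    """Estimate how many sandboxes fit when each costs workload + runtime overhead.

    reserved_gb is memory held back for the host OS (an assumed figure).
    """
    usable_mb = (host_ram_gb - reserved_gb) * 1024
    per_sandbox_mb = workload_mb + OVERHEAD_MB[tool]
    return int(usable_mb // per_sandbox_mb)


if __name__ == "__main__":
    for tool in OVERHEAD_MB:
        print(tool, max_sandboxes(host_ram_gb=64, workload_mb=32, tool=tool))
```

For a 64 GB host running 32 MB workloads, this gives 1660 sandboxes for Firecracker or Cloud Hypervisor, 853 for Kata, and 749 for gVisor. The gap narrows as workloads grow, because a fixed 5 to 50 MB overhead matters less next to a multi-gigabyte guest.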
Security Model
| Feature | Firecracker | gVisor | Kata | Cloud Hypervisor | Hypeman | Unikraft |
|---|---|---|---|---|---|---|
| Isolation type | Hardware VM | Userspace kernel | Hardware VM | Hardware VM | Hardware VM | Hardware VM |
| Kernel shared? | No | Partially | No | No | No | No |
| seccomp | Yes (jailer) | Built-in | Yes | Yes | Depends on backend | N/A |
| Attack surface | Minimal VMM | Sentry (~20K LoC) | VMM + agent | Minimal VMM | Backend VMM | Minimal unikernel |
Developer Experience
| Feature | Firecracker | gVisor | Kata | Cloud Hypervisor | Hypeman | Unikraft |
|---|---|---|---|---|---|---|
| CLI UX | REST API | runsc (OCI) | K8s runtime class | REST API | Docker-like | kraft CLI |
| K8s integration | DIY | RuntimeClass | Native | DIY | No | KraftCloud |
| OCI images | No (rootfs) | Yes | Yes | No (rootfs) | Yes | No (unikernel) |
| Snapshotting | Yes | Yes (checkpoint/restore) | No | Yes | Yes (standby) | No |
| macOS support | No | No | No | No | Yes (Apple Vz) | No |
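The RuntimeClass integration listed in the table amounts to two small Kubernetes objects: a RuntimeClass that names a handler configured on the nodes, and a pod that opts in through `runtimeClassName`. The sketch below builds both as plain dictionaries and prints them as a JSON List. The handler name `kata` is an assumption, since it must match whatever your container runtime configuration defines (gVisor setups commonly use `runsc`), and the pod name and image are placeholders.

```python
import json


def runtime_class(name="kata", handler="kata"):
    """RuntimeClass mapping a cluster-visible name to a node-level runtime handler."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,
    }


def sandboxed_pod(pod_name, image, runtime_class_name="kata"):
    """Pod that asks to run under the given RuntimeClass instead of the default runtime."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": "main", "image": image}],
        },
    }


def manifest_list(*objects):
    """Wrap objects in a v1 List so they can be applied together."""
    return {"apiVersion": "v1", "kind": "List", "items": list(objects)}


if __name__ == "__main__":
    bundle = manifest_list(
        runtime_class(),
        sandboxed_pod("untrusted-job", "python:3.12-slim"),
    )
    print(json.dumps(bundle, indent=2))
```

The pod spec is otherwise unchanged from a regular one, which is the sense in which these runtimes are a drop-in for existing Kubernetes workflows.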
The AI Agent Angle
AI agent sandboxing is the fastest-growing use case for container-to-VM runtimes. The threat model is clear: LLMs generate code, agents execute it, and you have no idea what that code will do.
What agents need:
- Fast boot — agents spin up and tear down environments constantly
- Full Linux compat — agents install packages, run builds, use system tools
- Network isolation — contain outbound connections from untrusted code
- Snapshotting — save state for checkpointing and replay
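These requirements can be applied to the comparison tables mechanically. The sketch below transcribes the boot-time, Linux-compatibility, and snapshotting values for the hardware-VM tools from the tables in this comparison and filters on them. Network isolation is left out because the tables do not score it, and the `Runtime` type and `candidates` function are hypothetical names for illustration. A boot time of `None` means the table says "varies by backend", so such tools are kept for manual benchmarking rather than excluded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Runtime:
    name: str
    boot_ms: Optional[float]  # None where the table says "varies by backend"
    full_linux: bool
    snapshots: bool


# Values transcribed from the comparison tables; hardware-VM tools only.
RUNTIMES = [
    Runtime("Firecracker", 125, True, True),
    Runtime("Kata", 500, True, False),
    Runtime("Cloud Hypervisor", 100, True, True),
    Runtime("Hypeman", None, True, True),
    Runtime("Unikraft", 1, False, False),
]


def candidates(max_boot_ms, need_full_linux=True, need_snapshots=True):
    """Return names of runtimes that satisfy the given agent requirements."""
    result = []
    for rt in RUNTIMES:
        if rt.boot_ms is not None and rt.boot_ms > max_boot_ms:
            continue
        if need_full_linux and not rt.full_linux:
            continue
        if need_snapshots and not rt.snapshots:
            continue
        result.append(rt.name)
    return result


if __name__ == "__main__":
    print(candidates(max_boot_ms=200))
```

With a 200ms boot budget and all requirements on, the filter leaves Firecracker, Cloud Hypervisor, and Hypeman, which lines up with the platforms listed next.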
Who's winning this use case:
- Firecracker powers E2B's sandbox platform (200M+ sandboxes)
- Hypeman powers Kernel's browser isolation for AI agents
- gVisor powers Google Cloud Run and GKE Sandbox
- Kata Containers is used in multi-tenant Kubernetes clusters
The pattern emerging: higher-level platforms (E2B, Kernel, Sprites) build on lower-level runtimes (Firecracker, Cloud Hypervisor) and expose developer-friendly APIs. The VMM layer is becoming infrastructure — the value moves up the stack.
Choosing the Right Tool
You want maximum density at scale → Firecracker. It's battle-tested at AWS scale (Lambda, Fargate) and boots in ~125ms with ~5MB overhead.
You want Kubernetes-native VM isolation → Kata Containers. Drop-in OCI runtime that runs containers in VMs without changing your K8s workflow.
You want defense-in-depth without full VMs → gVisor. Lower overhead, easier deployment, but not a true hardware boundary.
You want Docker-like simplicity with VM isolation → Hypeman. The best developer experience of the six and multi-hypervisor, but early-stage.
You want minimal attack surface → Unikraft. Smallest images, fastest boot, but requires specialized build tooling.
You want a modern VMM to build on → Cloud Hypervisor. Clean Rust codebase, active Intel/ARM community.
See Also
- Hypeman — Docker-like UX for running containers in VMs
- AI Agent Sandboxes Compared — higher-level sandbox platforms built on these runtimes
- E2B — ephemeral sandbox platform (uses Firecracker)
Research by Ry Walker Research