Every AI agent that generates and runs code is executing instructions that no human has reviewed. A single prompt injection can turn a helpful coding assistant into an attacker running shell commands with your credentials. Sandboxing is the control that prevents this: isolate the execution environment so that even fully compromised code cannot reach your host system, your network, or your data.
Three technologies dominate AI agent sandboxing in 2026: MicroVMs (Firecracker, Kata Containers), gVisor (Google’s user-space kernel), and WebAssembly. Each offers a different balance of isolation strength, startup speed, and compatibility. Docker containers, the default for most developers, are explicitly not a security boundary for untrusted code.
This guide compares the three sandboxing technologies against the container baseline with real numbers, names the platforms built on each, and covers what OWASP and NVIDIA’s AI Red Team recommend for production agent deployments.
Why Containers Are Not Enough
Most AI agents today run generated code inside Docker containers. It feels safe: the process is isolated from the host filesystem, it has its own network namespace, and cgroups cap resource usage. But containers share the host kernel. A kernel exploit in container-executed code gives the attacker root on the host machine.
This is not theoretical. Trend Micro researchers demonstrated multiple weaknesses in ChatGPT’s Docker-based code interpreter sandbox. By uploading a malicious Excel file, they injected persistent background processes that monitored the /mnt/data directory for user-uploaded documents and replaced hyperlinks with phishing URLs. Dynamic code obfuscation via base64 encoding evaded detection. OpenAI patched the specific flaw in December 2024, but the architectural weakness remains: containers trust the host kernel.
As Rivet’s reverse-engineering analysis puts it: “most of the industry agrees that containers are a bad practice for untrusted code execution.”
The OWASP Top 10 for Agentic Applications, released December 2025, codified this in ASI05 (Unexpected Code Execution): “Software-only sandboxing is insufficient; all code generated by an LLM must be executed in a secure, isolated sandbox environment with no access to the underlying host system.” That means hardware-enforced isolation or, at minimum, a user-space kernel that intercepts syscalls before they reach the host.
MicroVMs: The Hardware-Enforced Option
MicroVMs give each workload its own lightweight virtual machine with a dedicated kernel. The host kernel is never exposed to guest code. AWS built Firecracker for exactly this purpose: it powers Lambda and Fargate, isolating millions of untrusted workloads across trillions of requests per month.
The numbers are compelling. Firecracker boots to user-space code in roughly 125ms with less than 5 MiB of memory overhead per VM. A single host can spawn up to 150 VMs per second. Kata Containers offers similar hardware-enforced isolation with Kubernetes-native orchestration at roughly 200ms boot time.
Who Uses MicroVMs for AI Agents
E2B built their entire platform on Firecracker. Each sandbox is a MicroVM with roughly 150ms cold start and configurable sessions up to 24 hours. E2B uses a snapshot/restore model: Dockerfiles build a pre-configured VM image, then new sandboxes restore from that snapshot. The project has 8,900+ GitHub stars and native SDKs for Python and TypeScript.
Docker Sandboxes (Docker Desktop 4.58, January 2026) moved from containers to MicroVMs for AI agent isolation. Rivet’s reverse engineering revealed the architecture: a sandboxd daemon exposes a /vm API over a Unix socket at ~/.docker/sandboxes/sandboxd.sock. Each sandbox gets its own isolated Docker daemon (not sharing /var/run/docker.sock), with network traffic routed through a filtering proxy that performs MITM TLS inspection. Docker Sandboxes support Claude Code, Codex CLI, Copilot CLI, Gemini CLI, and Kiro.
Fly.io Machines wraps Firecracker in a REST API for ephemeral machines with sub-second spin-up. You create a machine, run your agent’s code, and destroy it in a single API call.
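The create-run-destroy lifecycle can be sketched as a request payload for the Fly Machines REST API. The endpoint URL, app name, and field names below are assumptions based on Fly’s public API shape, not details from this article; the HTTP call itself is left as a comment so the sketch stays self-contained.

```python
import json

# Hypothetical app name; the API base follows Fly's documented URL shape.
APP_NAME = "agent-sandbox"
API_BASE = f"https://api.machines.dev/v1/apps/{APP_NAME}/machines"

def create_machine_payload(image: str, command: list[str]) -> dict:
    """Build the request body for an ephemeral machine that
    destroys itself as soon as the command exits."""
    return {
        "config": {
            "image": image,
            "init": {"cmd": command},
            "auto_destroy": True,  # machine is deleted on exit
            "guest": {"cpus": 1, "memory_mb": 256, "cpu_kind": "shared"},
        }
    }

payload = create_machine_payload("python:3.12-slim",
                                 ["python", "-c", "print('hi')"])
body = json.dumps(payload)

# Sending the request is sketched rather than executed:
# import os, urllib.request
# req = urllib.request.Request(API_BASE, data=body.encode(), headers={
#     "Authorization": f"Bearer {os.environ['FLY_API_TOKEN']}",
#     "Content-Type": "application/json"})
# urllib.request.urlopen(req)
print(body)
```

The `auto_destroy` flag is what makes the sandbox ephemeral: no explicit cleanup call is needed if the agent process crashes.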
The trade-off with MicroVMs is compatibility. They require KVM support (Linux only for self-hosted; Docker Desktop uses Apple Virtualization.framework on macOS and Hyper-V on Windows). Image management lacks Docker’s shared layers, so disk usage grows linearly with VM count. But for untrusted code from AI agents, the security model is unmatched.
gVisor: The Syscall Firewall
Google’s gVisor takes a different approach. Instead of virtualizing hardware, it interposes a user-space kernel (called “Sentry”) between the guest process and the host kernel. Every system call from the sandboxed process is intercepted, validated, and either handled in user space or proxied to the host through a minimal, vetted interface.
Startup time is 50-100ms. For CPU-bound workloads, overhead is near zero. I/O-heavy workloads see 10-30% overhead, though recent optimizations have narrowed this gap: a rootfs overlay halved sandboxing overhead for builds, and Directfs reduced workload time by 12%.
gVisor does not provide hardware-enforced isolation. A sufficiently sophisticated kernel exploit could theoretically escape Sentry. But it drastically reduces the kernel attack surface: instead of exposing hundreds of syscalls, gVisor implements only the subset it can verify. Google trusts it enough to run Cloud Run and App Engine on gVisor for multi-tenant isolation.
Who Uses gVisor for AI Agents
Modal runs AI agent workloads in gVisor containers with sub-second cold starts and autoscaling up to 20,000+ concurrent containers. Lovable uses Modal at massive scale for executing LLM-generated code. Modal’s pricing starts at $0.047/vCPU-hour and $0.008/GB-hour.
Northflank offers both gVisor and MicroVM (Kata Containers) isolation, processing over 2 million isolated workloads per month. Their platform lets you choose the isolation level per workload, with gVisor pricing at $0.01667/vCPU-hour.
For teams that need better performance than MicroVMs and stronger isolation than containers, gVisor sits in a useful middle ground. The compatibility is good for most workloads: if your code runs in a standard Linux container, it will likely run under gVisor without modification.
WebAssembly: The Fastest, Most Restricted Option
WebAssembly (WASM) sandboxes code at the instruction level. Programs compile to a binary format that runs in a memory-safe virtual machine with automatic bounds checking on every memory access. There is no syscall interface at all: the sandbox exposes only the capabilities you explicitly grant.
Startup is roughly 10ms. Memory overhead is minimal. Execution speed approaches native for compiled languages like Rust, Go, and C.
The catch: most AI-generated code is Python. Running Python in WASM requires Pyodide (CPython compiled to WebAssembly), which works for many use cases but has limitations on C extensions, networking, and filesystem access. NVIDIA’s AI Red Team recommends WASM/Pyodide for browser-based execution of LLM-generated code, where it shifts compute to the client side and prevents server compromise entirely.
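WASM’s deny-by-default capability model can be illustrated in plain Python. This is an analogy, not actual WASM or WASI code: the sandboxed caller receives only the capabilities explicitly granted to it and has no ambient access to anything else.

```python
class CapabilityError(Exception):
    pass

def make_sandbox(granted: dict):
    """Return a runner exposing only explicitly granted capabilities,
    mimicking WASI's deny-by-default model (illustrative analogy only)."""
    def call(capability: str, *args):
        if capability not in granted:
            raise CapabilityError(f"capability not granted: {capability}")
        return granted[capability](*args)
    return call

# Grant read access to a single in-memory "file" and nothing else.
files = {"/workspace/input.txt": "hello"}
sandbox = make_sandbox({"read_file": lambda path: files[path]})

print(sandbox("read_file", "/workspace/input.txt"))   # granted, succeeds
try:
    sandbox("write_file", "/workspace/out.txt", "x")  # never granted
except CapabilityError as e:
    print(e)
```

There is no “default” filesystem or network the sandboxed code can fall back to; anything not in `granted` simply does not exist from its point of view.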
Who Uses WASM for AI Agents
Cloudflare Workers uses V8 isolates combined with WASM for edge execution. Their Sandbox SDK provides a purpose-built environment for running untrusted code with roughly 1ms startup for V8 isolates.
LangChain Sandbox uses Pyodide and Deno for sandboxed Python execution. The project explicitly states it is “not recommended for production use cases,” which tells you where WASM maturity stands for general-purpose agent workloads.
Hugging Face smolagents ships a WasmExecutor that runs agent-generated Python in Pyodide within a Deno runtime. This is the most accessible option for experimentation.
WASM is the right choice when you control the code format (compiled languages, well-defined Python subsets) and need maximum startup speed. It is the wrong choice when agents generate arbitrary Python that depends on C extensions, network access, or filesystem operations.
The Comparison: Which Sandbox for Which Agent
| Technology | Isolation Level | Startup | Memory Overhead | Python Support | Best For |
|---|---|---|---|---|---|
| Firecracker MicroVM | Hardware-enforced (own kernel) | ~125ms | <5 MiB | Full Linux | Untrusted code, multi-tenant |
| Kata Containers | Hardware-enforced | ~200ms | Higher | Full Linux | Production Kubernetes |
| gVisor | Syscall interception | 50-100ms | Medium | Full Linux | Compute-heavy SaaS |
| WASM/Pyodide | Runtime memory safety | ~10ms | Very low | Limited (no C ext) | Browser, well-defined tasks |
| V8 Isolates | Runtime isolation | ~1ms | Very low | No (JS/TS only) | Edge functions |
| Docker containers | Namespace-based (shared kernel) | 10-50ms | Very low | Full Linux | Trusted code only |
The decision depends on your threat model. If agents run user-submitted or prompt-injected code (which is every agent with a code generation tool), MicroVMs or gVisor are your options. If agents only execute pre-audited code from your own repository, containers may suffice. If you need sub-10ms startup for simple computations, WASM works when you can live within its constraints.
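That decision logic can be condensed into a small helper. The thresholds and return values are this sketch’s own, taken from the comparison table above, not a standard API:

```python
def choose_sandbox(untrusted: bool, needs_full_python: bool,
                   max_startup_ms: int) -> str:
    """Pick a sandbox technology per the comparison table.
    Purely illustrative; tune the thresholds to your threat model."""
    if not untrusted:
        return "docker"   # pre-audited code from your own repository
    if not needs_full_python and max_startup_ms < 50:
        return "wasm"     # compiled code or a constrained Python subset
    if max_startup_ms < 125:
        return "gvisor"   # syscall interception, 50-100ms starts
    return "microvm"      # hardware-enforced, ~125ms Firecracker boots

print(choose_sandbox(untrusted=True, needs_full_python=True,
                     max_startup_ms=500))  # → microvm
```

Note the ordering: trust level is checked before performance, because no startup budget justifies running prompt-injectable code in a plain container.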
NVIDIA’s AI Red Team makes a critical point that simplifies the decision: “Performance overhead from virtualization is typically modest compared to LLM latency.” Your agent spends 500ms-5s waiting for the model to respond. Adding 125ms for a Firecracker boot is noise.
Beyond Isolation: Network and Filesystem Controls
A sandbox that isolates code execution but allows unrestricted network access is only half a sandbox. A compromised agent can still exfiltrate data, establish reverse shells, or communicate with command-and-control servers.
NVIDIA’s practical security guidance recommends three additional controls:
Network egress restrictions. Route all traffic through an HTTP proxy with domain allowlists. Block arbitrary outbound connections, restrict DNS resolution, and deny non-HTTP protocols. Docker Sandboxes implement this with a filtering proxy at host.docker.internal:3128. Anthropic’s Claude Code routes all network traffic through a unix domain socket proxy that enforces domain allowlists.
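The core allowlist check such a proxy performs can be sketched in a few lines. The domains are examples only; a real proxy also handles ports, redirects, and CONNECT tunneling.

```python
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"api.anthropic.com", "pypi.org", "files.pythonhosted.org"}

def egress_allowed(url: str) -> bool:
    """Allow only HTTPS requests to exactly-matched or subdomain-matched
    allowlisted hosts; everything else is denied by default."""
    parts = urlparse(url)
    if parts.scheme != "https":
        return False  # deny plaintext HTTP and non-HTTP protocols
    host = parts.hostname or ""
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)

print(egress_allowed("https://pypi.org/simple/requests/"))  # True
print(egress_allowed("http://pypi.org/"))                   # False: plaintext
print(egress_allowed("https://evil.example/exfil"))         # False: not listed
```

The suffix check requires a leading dot (`.pypi.org`), so a lookalike host such as `evil-pypi.org` does not slip through.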
Filesystem write protection. Prevent writes outside the active workspace. NVIDIA specifically calls out shell initialization files (~/.zshrc, ~/.bashrc), Git configuration (~/.gitconfig), and binary directories (~/.local/bin) as targets that must be protected at the OS level, not the application level.
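The containment rule that the OS-level policy must enforce can be illustrated with pathlib. This application-level check is only a sketch of the policy, not a substitute for bind mounts or Seatbelt profiles; the workspace path is a placeholder.

```python
from pathlib import Path

WORKSPACE = Path("/tmp/agent-workspace").resolve()
PROTECTED = [(Path.home() / p).resolve() for p in
             (".zshrc", ".bashrc", ".gitconfig", ".local/bin")]

def write_allowed(target: str) -> bool:
    """A write is allowed only inside the workspace; resolve() collapses
    ../ traversal and symlinks before the comparison."""
    path = Path(target).resolve()
    if any(path == p or p in path.parents for p in PROTECTED):
        return False  # shell init files, Git config, binary dirs
    return path == WORKSPACE or WORKSPACE in path.parents

print(write_allowed("/tmp/agent-workspace/output.csv"))        # True
print(write_allowed("/tmp/agent-workspace/../../etc/passwd"))  # False
```

The `resolve()` call is the important part: comparing raw strings would let `../` traversal or a symlink inside the workspace escape the boundary.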
Configuration file lockdown. Agent configuration files (.cursorrules, CLAUDE.md, copilot-instructions.md, MCP server startup scripts) must be read-only to the sandboxed process. The IDEsaster disclosure in December 2025 found 30+ vulnerabilities across AI coding tools (Cursor, Windsurf, Kiro, GitHub Copilot, Zed, Roo Code, Junie, Cline) where prompt injection through these files led to code execution. Twenty-four CVEs were assigned.
Anthropic’s approach to Claude Code sandboxing combines these layers: bubblewrap on Linux and Apple’s Seatbelt framework on macOS provide filesystem isolation, while a network proxy enforces domain restrictions. The result: 84% fewer permission prompts in internal usage, because the sandbox handles security enforcement that users previously had to approve manually.
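On Linux, the bubblewrap layer of such a setup can be sketched by building a `bwrap` invocation from Python. The flag set below is a plausible minimal profile, not Anthropic’s actual configuration; running it requires bubblewrap installed, so the command is constructed and printed but not executed here.

```python
import shlex

def bwrap_command(workspace: str, argv: list[str]) -> list[str]:
    """Build a bubblewrap command: read-only root, one writable
    workspace, fresh /tmp, minimal /dev and /proc, no network."""
    return [
        "bwrap",
        "--ro-bind", "/", "/",            # whole filesystem read-only
        "--bind", workspace, workspace,   # only the workspace is writable
        "--tmpfs", "/tmp",                # fresh, private /tmp
        "--dev", "/dev",
        "--proc", "/proc",
        "--unshare-net",                  # no network unless proxied in
        "--die-with-parent",
        *argv,
    ]

cmd = bwrap_command("/tmp/agent-workspace", ["python3", "agent_task.py"])
print(shlex.join(cmd))
# To actually run it (requires bubblewrap):
# import subprocess; subprocess.run(cmd, check=True)
```

Pairing `--unshare-net` with a proxied-in socket is how the network allowlist layer composes with the filesystem layer.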
Designing Your Sandbox Architecture
The OWASP Agentic Top 10 introduces a principle that should guide your sandbox design: “Least-Agency.” Agents should receive the minimum level of autonomy required, with autonomy being “a feature to be earned, not a default setting.”
Applied to sandboxing, this means:
1. Ephemeral sandboxes. Create a fresh environment for each task. Destroy it when the task completes. This prevents credential accumulation, persistent backdoors, and state pollution between runs. NVIDIA recommends that “approvals should never be cached or persisted” across sandbox instances.
2. Tiered authorization. Not every agent action needs the same isolation level. NVIDIA proposes four tiers: enterprise-level blocks (non-negotiable restrictions), workspace-internal operations (auto-approved), allowlisted external operations (pre-approved domains and APIs), and default-deny for everything else.
3. Extend sandboxing to all spawned processes. Hooks, MCP server initialization, plugin loading, and skill execution all run code. If you sandbox only the main agent process but leave hooks unsandboxed, you have a bypass. CVE-2025-61260 in OpenAI’s Codex CLI exploited exactly this: MCP server entries executed at startup without user permission.
4. Monitor, log, and alert. Sandbox violations (blocked network requests, denied file writes, killed processes) are detection signals. Log them. Alert on patterns. A sandboxed agent that repeatedly tries to access /etc/shadow is telling you something.
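Tiered authorization with default-deny (principle 2) and violation logging (principle 4) compose naturally. The tier names below mirror NVIDIA’s four tiers; the specific rules are illustrative placeholders.

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("sandbox")

BLOCKED = {"read:/etc/shadow", "exec:curl"}       # enterprise-level blocks
WORKSPACE_PREFIX = "write:/workspace/"            # auto-approved
ALLOWLISTED = {"net:api.anthropic.com", "net:pypi.org"}  # pre-approved

def authorize(action: str) -> bool:
    """Return True if the action may proceed; anything not explicitly
    matched by a tier falls through to default-deny and is logged."""
    if action in BLOCKED:
        log.warning("blocked (enterprise tier): %s", action)
        return False
    if action.startswith(WORKSPACE_PREFIX):
        return True
    if action in ALLOWLISTED:
        return True
    log.warning("default-deny: %s", action)       # detection signal
    return False

print(authorize("write:/workspace/out.txt"))  # True
print(authorize("read:/etc/shadow"))          # False, logged
print(authorize("net:evil.example"))          # False, logged
```

The log lines are the monitoring hook: repeated default-deny entries for the same resource are exactly the alert pattern described above.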
Amazon Bedrock AgentCore packages these practices into a managed service with native support for LangChain, LangGraph, CrewAI, and Strands. If you do not want to build sandbox infrastructure yourself, managed services handle the isolation layer so you can focus on the agent logic.
Frequently Asked Questions
Why do AI agents need sandboxing?
AI agents generate and execute code that no human has reviewed. A prompt injection attack can cause the agent to run arbitrary commands with whatever permissions the agent process has. Sandboxing isolates the execution environment so that even fully compromised code cannot access the host system, network, or data. OWASP’s Top 10 for Agentic Applications (ASI05) explicitly requires sandboxed execution for all LLM-generated code.
What is the most secure way to sandbox AI agent code execution?
MicroVMs (Firecracker, Kata Containers) provide the strongest isolation because each workload gets its own kernel with hardware-enforced boundaries via KVM. Firecracker boots in roughly 125ms with less than 5 MiB overhead. gVisor offers strong isolation through syscall interception without the overhead of full virtualization. Standard Docker containers share the host kernel and are not considered a security boundary for untrusted code.
Can WebAssembly sandbox AI agent Python code?
Yes, using Pyodide (CPython compiled to WASM). WASM offers the fastest startup (~10ms) and strong memory safety. However, Pyodide has limitations: many C extensions do not work, networking is restricted, and filesystem access is limited. WASM is best for well-defined computation tasks, not general-purpose Python execution. LangChain’s WASM sandbox explicitly warns it is not recommended for production use.
What sandbox does Docker use for AI coding agents?
Docker Desktop 4.58 (January 2026) introduced Docker Sandboxes using MicroVMs, not standard containers. Each sandbox runs its own isolated Docker daemon inside a lightweight VM, with network traffic routed through a filtering proxy that performs TLS inspection. Docker Sandboxes support Claude Code, Codex CLI, Copilot CLI, Gemini CLI, and Kiro.
How does OWASP classify AI agent code execution risks?
The OWASP Top 10 for Agentic Applications (December 2025) classifies unexpected code execution as ASI05. It states that software-only sandboxing is insufficient and that all LLM-generated code must run in an isolated sandbox with no host system access. The framework also introduces the Least-Agency principle: agents should have the minimum autonomy required, with autonomy being a feature to be earned, not a default.
