SmolVM gives your AI agents a safe place to run code. Instead of executing LLM-generated commands directly on your machine, SmolVM spins up an isolated microVM in milliseconds, runs the code inside it, and tears it down when finished. This guide shows you how to plug SmolVM into popular agent frameworks as a tool.
Install the agent framework extras before following these examples:
pip install "smolvm[examples]"
This adds PydanticAI, OpenAI Agents SDK, LangChain, and Playwright as dependencies. See installation for details.

Why SmolVM for AI agents?

When AI agents generate and execute code, you need strong isolation to prevent:
  • Host compromise - Malicious code escaping to your system
  • Data exfiltration - Unauthorized access to sensitive files
  • Resource abuse - Uncontrolled CPU/memory/network usage
  • Persistent side effects - State pollution across tasks
SmolVM uses hardware virtualization (KVM-backed microVMs) rather than containers, providing a significantly smaller attack surface.

Hardware isolation

KVM-based virtualization provides stronger isolation than containers. Escape requires a hypervisor exploit, not just a kernel vulnerability.

Controlled networking

Fine-grained control over guest internet access. Restrict or monitor all network traffic.

Ephemeral environments

Spin up a fresh VM for every task and destroy it immediately afterward. No state persists between tasks.

Resource limits

Strict CPU and memory limits prevent resource exhaustion attacks.

Agentor

Agentor is Celesto’s own agent framework. It has built-in SmolVM support through SmolVMRuntime, so your agent’s shell commands run inside a sandbox automatically — no glue code needed. Pass SmolVMRuntime as the executor for ShellTool and Agentor routes every command through a dedicated microVM:
from agentor import Agentor
from agentor.runtime import SmolVMRuntime
from agentor.tools import ShellTool


def main() -> None:
    runtime = SmolVMRuntime(mem_size_mib=1024, disk_size_mib=2048)

    try:
        agent = Agentor(
            name="SmolVM Shell Agent",
            model="gpt-5",
            tools=[ShellTool(executor=runtime)],
            instructions="Use shell commands to inspect files inside the SmolVM sandbox.",
        )

        result = agent.run(
            "Install uv and use the Python interpreter to print 'Hello, World!'. Return both outputs."
        )
        print(result)
    finally:
        runtime.close()


if __name__ == "__main__":
    main()
SmolVMRuntime manages the VM lifecycle for you. When the agent calls the shell tool, the command runs inside the microVM instead of on your host. Call runtime.close() when you are done to tear down the sandbox.
See the full working example in main.py. Install with pip install agentor smolvm.

When to use Agentor vs. other frameworks

Use Agentor when you want the simplest path to a sandboxed agent — one import for the runtime, one for the tool, and you are done. If you already use PydanticAI, OpenAI Agents, or LangChain, keep reading for framework-specific patterns below.

PydanticAI

Register SmolVM as a PydanticAI tool so the agent can run shell commands inside an ephemeral sandbox. Each call spins up a fresh VM, runs the command, and tears it down automatically.
from smolvm import SmolVM
from pydantic_ai import Agent

def run_in_smolvm(command: str, timeout: int = 30) -> str:
    """Run a shell command inside an ephemeral SmolVM sandbox.

    Args:
        command: Shell command to execute inside the sandbox guest.
        timeout: Maximum number of seconds to wait for the command.
    """
    with SmolVM() as vm:
        result = vm.run(command, timeout=timeout)
        return (
            f"exit_code: {result.exit_code}\n"
            f"stdout:\n{result.stdout.strip() or '<empty>'}\n"
            f"stderr:\n{result.stderr.strip() or '<empty>'}"
        )

agent = Agent(
    "openai:gpt-4.1",
    instructions=(
        "You are a coding assistant with access to a secure SmolVM sandbox. "
        "For shell or Python inspection requests, call run_in_smolvm exactly "
        "once and then summarize the result."
    ),
)
agent.tool_plain(docstring_format="google", require_parameter_descriptions=True)(
    run_in_smolvm
)

result = agent.run_sync("Run `uname -a && python3 --version` in the sandbox.")
print(result.output)
See the full working example in examples/agent_tools/pydanticai_tool.py.
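The tool functions in this guide all build the same exit_code/stdout/stderr string. As a minimal sketch, that formatting could live in one helper that also truncates long output so a noisy command cannot flood the model's context. CommandResult below is a hypothetical stand-in with the three fields the examples read, not SmolVM's own result class, and format_result, _clip, and max_chars are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class CommandResult:
    """Hypothetical stand-in exposing the fields the examples read."""

    exit_code: int
    stdout: str
    stderr: str


def _clip(text: str, max_chars: int) -> str:
    """Strip whitespace, mark empty output, and truncate long output."""
    text = text.strip()
    if not text:
        return "<empty>"
    if len(text) > max_chars:
        omitted = len(text) - max_chars
        return f"{text[:max_chars]}\n[truncated {omitted} characters]"
    return text


def format_result(result, max_chars: int = 4000) -> str:
    """Render any object with exit_code, stdout, and stderr as tool output."""
    return (
        f"exit_code: {result.exit_code}\n"
        f"stdout:\n{_clip(result.stdout, max_chars)}\n"
        f"stderr:\n{_clip(result.stderr, max_chars)}"
    )
```

Inside a tool you would then return format_result(vm.run(command, timeout=timeout)). The 4000-character default is an arbitrary choice; tune it to your model's context budget.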

Reusable sandbox across turns

If your agent needs to maintain state between tool calls (for example, writing a file in one turn and reading it in the next), keep the VM alive across invocations. The helper functions below create the sandbox on first use and reconnect on subsequent calls:
from dataclasses import dataclass
from pydantic_ai import Agent, RunContext
from smolvm import SmolVM

@dataclass
class SandboxDeps:
    vm_id: str | None = None

def _connect_vm(deps: SandboxDeps) -> SmolVM:
    """Return the active sandbox, creating it on first use."""
    if deps.vm_id is None:
        vm = SmolVM()
        vm.start()
        deps.vm_id = vm.vm_id
        return vm
    return SmolVM.from_id(deps.vm_id)

def _cleanup_vm(vm_id: str | None) -> None:
    """Delete the reusable sandbox if one was created."""
    if vm_id is None:
        return
    vm = SmolVM.from_id(vm_id)
    try:
        vm.delete()
    finally:
        vm.close()

def run_in_reusable_smolvm(
    ctx: RunContext[SandboxDeps], command: str, timeout: int = 30
) -> str:
    """Run a shell command inside a reusable SmolVM sandbox.

    Args:
        command: Shell command to execute inside the sandbox guest.
        timeout: Maximum number of seconds to wait for the command.
    """
    vm = _connect_vm(ctx.deps)
    try:
        result = vm.run(command, timeout=timeout)
        return (
            f"exit_code: {result.exit_code}\n"
            f"stdout:\n{result.stdout.strip() or '<empty>'}\n"
            f"stderr:\n{result.stderr.strip() or '<empty>'}"
        )
    finally:
        vm.close()

agent = Agent(
    "openai:gpt-4.1",
    deps_type=SandboxDeps,
    instructions="You have access to a persistent SmolVM sandbox.",
)
agent.tool(docstring_format="google", require_parameter_descriptions=True)(
    run_in_reusable_smolvm
)

deps = SandboxDeps()
try:
    agent.run_sync("Write 'hello' to /tmp/note.txt", deps=deps)
    agent.run_sync("Read /tmp/note.txt and confirm the contents", deps=deps)
finally:
    _cleanup_vm(deps.vm_id)
See the full working example in examples/agent_tools/pydanticai_reusable_tool.py.

OpenAI Agents SDK

Use SmolVM as a function tool in the OpenAI Agents SDK:
import asyncio
from agents import Agent, Runner, function_tool
from smolvm import SmolVM

def run_in_smolvm(command: str, timeout: int = 30) -> str:
    """Run a shell command inside an ephemeral SmolVM sandbox.

    Args:
        command: Shell command to execute inside the sandbox guest.
        timeout: Maximum number of seconds to wait for the command.
    """
    with SmolVM() as vm:
        result = vm.run(command, timeout=timeout)
        return (
            f"exit_code: {result.exit_code}\n"
            f"stdout:\n{result.stdout.strip() or '<empty>'}\n"
            f"stderr:\n{result.stderr.strip() or '<empty>'}"
        )

agent = Agent(
    name="SmolVM Assistant",
    model="gpt-4.1",
    instructions=(
        "You are a coding assistant with access to a secure SmolVM sandbox. "
        "For shell or Python inspection requests, call run_in_smolvm exactly "
        "once and then summarize the result."
    ),
    tools=[function_tool(run_in_smolvm)],
)

async def main():
    result = await Runner.run(
        agent, "Run `uname -a && python3 --version` in the sandbox."
    )
    print(result.final_output)

asyncio.run(main())
See the full working example in examples/agent_tools/openai_agents_tool.py. Install with pip install smolvm openai-agents.

LangChain

Wrap SmolVM as a LangChain tool:
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from smolvm import SmolVM

@tool
def run_in_smolvm(command: str, timeout: int = 30) -> str:
    """Run a shell command inside an ephemeral SmolVM sandbox.

    Args:
        command: Shell command to execute inside the sandbox guest.
        timeout: Maximum number of seconds to wait for the command.
    """
    with SmolVM() as vm:
        result = vm.run(command, timeout=timeout)
        return (
            f"exit_code: {result.exit_code}\n"
            f"stdout:\n{result.stdout.strip() or '<empty>'}\n"
            f"stderr:\n{result.stderr.strip() or '<empty>'}"
        )

llm = ChatOpenAI(model="gpt-4.1")
agent = create_react_agent(llm, [run_in_smolvm])

result = agent.invoke({
    "messages": [{"role": "user", "content": "Run uname -a in the sandbox"}]
})
print(result["messages"][-1].content)
See the full working example in examples/agent_tools/langchain_tool.py. Install with pip install smolvm langchain-openai langgraph.

Browser sessions

SmolVM includes a built-in BrowserSession class that launches a Chromium browser inside a microVM. You can use it for web scraping, testing, and computer-use agents that need to interact with real web pages in an isolated environment.
from smolvm import BrowserSession, BrowserSessionConfig

with BrowserSession(
    BrowserSessionConfig(
        mode="live",
        record_video=True,
        viewport={"width": 1440, "height": 900},
    )
) as session:
    print(f"Session: {session.session_id}")
    print(f"CDP URL: {session.cdp_url}")
    print(f"Live URL: {session.live_url}")

    # Connect with Playwright (requires: pip install playwright)
    browser = session.connect_playwright()
    context = browser.contexts[0] if browser.contexts else browser.new_context()
    page = context.pages[0] if context.pages else context.new_page()
    page.goto("https://example.com", wait_until="networkidle")

    # Take a screenshot
    session.screenshot("screenshot.png")
See the full working example in examples/browser_session.py.

Browser session modes

| Mode | Description |
| --- | --- |
| headless | No visual output. Good for automated scraping and testing. |
| live | Exposes a noVNC live-view URL so you can watch the browser in real time. |

Browser session configuration

BrowserSessionConfig accepts these options:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mode | "headless" \| "live" | "headless" | Display mode. live enables noVNC. |
| viewport | {"width": int, "height": int} | 1280 x 720 | Browser viewport size. |
| record_video | bool | False | Record a video (requires mode="live"). |
| profile_mode | "ephemeral" \| "persistent" | "ephemeral" | Whether to reuse browser state across sessions. |
| profile_id | str | | Required when profile_mode="persistent". |
| timeout_minutes | int | 30 | Auto-shutdown timer (1-240 minutes). |
| allow_downloads | bool | True | Allow file downloads in the browser. |
| env_vars | dict[str, str] | {} | Environment variables injected into the guest. |
| mem_size_mib | int | 2048 | Guest memory in MiB (512-16384). |
| disk_size_mib | int | 4096 | Root filesystem size in MiB (2048-16384). |
from smolvm import BrowserSession, BrowserSessionConfig

with BrowserSession(
    BrowserSessionConfig(
        mode="live",
        viewport={"width": 1920, "height": 1080},
        timeout_minutes=60,
        mem_size_mib=4096,
        env_vars={"MY_API_KEY": "sk-..."},
    )
) as session:
    print(session.cdp_url)
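The constraints listed for BrowserSessionConfig (value ranges, record_video needing live mode, profile_id needing persistent mode) can be checked before a session is requested, for example when the options come from user input or a config file. The sketch below is an illustration built only from the documented table; check_browser_options is a hypothetical helper, and whether BrowserSessionConfig performs the same validation itself is not assumed here.

```python
def check_browser_options(options: dict) -> list[str]:
    """Return the problems found in a dict of BrowserSessionConfig-style options."""
    problems = []

    mode = options.get("mode", "headless")
    if mode not in ("headless", "live"):
        problems.append(f"mode must be 'headless' or 'live', got {mode!r}")

    # Video recording is documented as requiring live mode.
    if options.get("record_video", False) and mode != "live":
        problems.append("record_video requires mode='live'")

    # A persistent profile needs an ID to reconnect to.
    profile_mode = options.get("profile_mode", "ephemeral")
    if profile_mode == "persistent" and not options.get("profile_id"):
        problems.append("profile_id is required when profile_mode='persistent'")

    # name: (documented default, minimum, maximum)
    ranges = {
        "timeout_minutes": (30, 1, 240),
        "mem_size_mib": (2048, 512, 16384),
        "disk_size_mib": (4096, 2048, 16384),
    }
    for name, (default, low, high) in ranges.items():
        value = options.get(name, default)
        if not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}, got {value}")

    return problems
```

An empty list means the options satisfy the documented constraints, and you can pass them on as BrowserSessionConfig(**options).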

Computer-use with OpenAI

Combine BrowserSession with OpenAI’s computer-use API for autonomous web browsing agents. The model sees the browser through screenshots and sends back click, type, and scroll instructions. You provide a task and an optional starting URL. SmolVM launches a browser in an isolated sandbox, and the model drives it step by step until the task is complete.
from openai import OpenAI
from smolvm import BrowserSession, BrowserSessionConfig

client = OpenAI()

with BrowserSession(
    BrowserSessionConfig(mode="live", viewport={"width": 1440, "height": 900})
) as session:
    browser = session.connect_playwright()
    context = browser.contexts[0] if browser.contexts else browser.new_context()
    page = context.pages[0] if context.pages else context.new_page()
    page.goto("https://example.com", wait_until="domcontentloaded")

    # Send the initial task to the model with the computer tool
    response = client.responses.create(
        model="gpt-5.4",
        tools=[{"type": "computer"}],
        input="Find the main heading on the page.",
    )

    # Process computer-use actions in a loop
    for item in response.output:
        if item.type == "computer_call":
            # Execute browser actions based on the model's instructions
            # (click, type, scroll, screenshot, etc.)
            pass
The full example includes domain allowlisting so the model can only visit URLs you approve, automatic retries for failed actions, and a configurable step limit to keep costs under control.
See the full working example in examples/agent_tools/computer_use_browser.py for a complete computer-use loop with action handling, safety checks, and domain allowlisting.
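To give a sense of what domain allowlisting involves, here is a minimal sketch of a URL check you could run before every navigation the model requests. It is not the logic from the full example, which is not reproduced here; is_allowed_url and allowed_domains are hypothetical names.

```python
from urllib.parse import urlparse


def is_allowed_url(url: str, allowed_domains: set[str]) -> bool:
    """Return True if url is http(s) and its host is an allowed domain or a subdomain of one."""
    parsed = urlparse(url)
    # Reject file://, javascript:, and other non-web schemes outright.
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Compare on a dot boundary so "evil-example.com" does not match "example.com".
    return any(
        host == domain or host.endswith("." + domain)
        for domain in allowed_domains
    )
```

Matching on the parsed hostname rather than on a substring of the URL avoids lookalikes such as https://example.com.evil.net or credentials-style URLs like https://example.com@evil.net/.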

PydanticAI with agent-browser

You can let a PydanticAI agent drive a SmolVM browser session through the agent-browser CLI instead of using Playwright directly. The agent runs host-side shell commands (starting the browser, taking snapshots, clicking elements, and capturing screenshots), all through a single run_host_bash tool. This approach is useful when you want the LLM to decide what to do in the browser step by step, without writing Playwright code yourself. Here is how the pieces fit together.
Prerequisites:
pip install smolvm pydantic-ai
brew install agent-browser   # or: npm install -g agent-browser
agent-browser install
export OPENAI_API_KEY=...
smolvm doctor
How it works:
  1. The agent calls smolvm browser start --live --json to launch an isolated browser session.
  2. SmolVM returns a JSON payload with a session_id, a cdp_url (including the localhost port), and a live_url.
  3. The agent reads agent-browser --help, then uses agent-browser --cdp <cdp_port> commands to take snapshots, click elements, and navigate pages.
  4. The agent can save screenshots and collect artifacts along the way.
  5. When finished, the agent calls smolvm browser stop <session_id> to tear down the session.
from __future__ import annotations

import json
import subprocess
from dataclasses import dataclass
from typing import Any
from urllib.parse import urlparse

from pydantic_ai import Agent, RunContext

@dataclass
class BrowserCliDeps:
    """Tracks the active browser session for cleanup."""
    session: dict[str, Any] | None = None

def run_host_bash(
    ctx: RunContext[BrowserCliDeps], command: str, timeout: int = 60
) -> str:
    """Run a shell command on the host machine.

    Args:
        command: Shell command to execute on the host.
        timeout: Maximum seconds to wait.
    """
    result = subprocess.run(
        ["bash", "-lc", command],
        capture_output=True, text=True, timeout=timeout, check=False,
    )

    # Capture session info when the browser starts
    parsed_session = None
    if (
        result.returncode == 0
        and "browser start" in command
        and "--json" in command
    ):
        try:
            data = json.loads(result.stdout).get("data", {})
            cdp_url = data["cdp_url"]
            cdp_port = urlparse(cdp_url).port
            parsed_session = {
                "session_id": data["session_id"],
                "cdp_url": cdp_url,
                "cdp_port": cdp_port,
                "live_url": data.get("live_url"),
            }
            ctx.deps.session = parsed_session
        except (json.JSONDecodeError, KeyError, ValueError):
            pass

    lines = [
        f"exit_code: {result.returncode}",
        "stdout:",
        result.stdout.strip() or "<empty>",
        "stderr:",
        result.stderr.strip() or "<empty>",
    ]
    if parsed_session is not None:
        lines.extend([
            "parsed_browser_session:",
            f"session_id: {parsed_session['session_id']}",
            f"cdp_port: {parsed_session['cdp_port']}",
            f"live_url: {parsed_session.get('live_url') or '<none>'}",
        ])
    return "\n".join(lines)

agent = Agent(
    "openai:gpt-5.4",
    deps_type=BrowserCliDeps,
    instructions=(
        "You are an agent that controls a browser through tools and CLIs.\n"
        "You must reason and plan before you act.\n"
        "You automate one SmolVM browser session from the host.\n"
        "First read `agent-browser --help`.\n"
        "Decide on the exact commands before you run them.\n"
        "Follow this workflow exactly:\n"
        "1. First run `smolvm browser start --live --json`.\n"
        "2. Read `cdp_port` from the `parsed_browser_session` section in the tool output.\n"
        "3. Use `agent-browser --cdp <cdp_port>` on every browser command.\n"
        "4. Use `agent-browser --cdp <cdp_port> snapshot -i --json` before choosing refs.\n"
        "5. Save the final screenshot.\n"
        "6. Stop the browser with `smolvm browser stop <session_id>` when done.\n"
        "7. Return only these four lines: title, url, screenshot_path, session_id.\n"
        "Only use the `run_host_bash` tool, and keep each command simple.\n"
        "Your final output must contain a human readable summary."
    ),
)
agent.tool(docstring_format="google", require_parameter_descriptions=True)(
    run_host_bash
)

deps = BrowserCliDeps()
try:
    result = agent.run_sync(
        "Open https://example.com, take a snapshot, then stop the session.",
        deps=deps,
    )
    print(result.output)
finally:
    # Safety net: stop the session if the agent didn't
    if deps.session:
        subprocess.run(
            ["smolvm", "browser", "stop", deps.session["session_id"]],
            capture_output=True, timeout=60, check=False,
        )
The key difference from the Playwright-based approach is that the LLM decides its own browsing strategy. It reads agent-browser --help to learn the available commands, plans its steps, takes snapshots (with --json output) to understand page structure, and picks elements by reference ID.
See the full working example in examples/agent_tools/pydanticai_agent_browser.py for CDP port parsing, error handling, and a multi-step demo that navigates between pages and saves screenshots.

Generic tool pattern

If you use a framework not listed above, the core pattern is the same. Define a function that creates a SmolVM, runs a command, and returns the output:
from smolvm import SmolVM

def run_in_smolvm(command: str, timeout: int = 30) -> str:
    """Run a shell command inside an ephemeral SmolVM sandbox."""
    with SmolVM() as vm:
        result = vm.run(command, timeout=timeout)
        if result.ok:
            return result.stdout
        return f"Error (exit {result.exit_code}): {result.stderr}"
Then register this function as a tool in whatever framework you use.

Long-running agent environments

If your agent needs state to persist across many interactions and the reusable tool pattern does not fit, create a long-lived VM and reconnect to it by ID for each task:
1. Create a persistent VM

from smolvm import SmolVM, VMConfig
from smolvm.build import ImageBuilder, SSH_BOOT_ARGS
from smolvm.utils import ensure_ssh_key

private_key, public_key = ensure_ssh_key()
builder = ImageBuilder()
kernel, rootfs = builder.build_alpine_ssh_key(
    public_key,
    rootfs_size_mb=4096,
)

config = VMConfig(
    vm_id="agent-workspace",
    vcpu_count=2,
    mem_size_mib=2048,
    kernel_path=kernel,
    rootfs_path=rootfs,
    boot_args=SSH_BOOT_ARGS,
)

vm = SmolVM(config, ssh_key_path=str(private_key))
vm.start()

# Install dependencies once
vm.run("apk add python3 py3-pip git")
print(f"Workspace ready: {vm.vm_id}")
vm.close()  # Release handle, keep VM running
2. Reconnect for each task

vm = SmolVM.from_id("agent-workspace")
result = vm.run("git clone https://github.com/user/repo", timeout=120)
result = vm.run("cd repo && python3 analyze.py", timeout=120)
vm.close()
3. Clean up when done

vm = SmolVM.from_id("agent-workspace")
vm.delete()
vm.close()
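Each reconnect above must end with close(), including when a command raises. One way to guarantee that is the standard library's contextlib.closing, which calls close() on exit and nothing else. The sketch below uses FakeHandle, a hypothetical stand-in that only records calls, so it runs without a VM. It assumes, as the comment in step 1 states, that close() releases the handle and leaves the VM running.

```python
from contextlib import closing


class FakeHandle:
    """Hypothetical stand-in for a VM handle; it only records what was called."""

    def __init__(self) -> None:
        self.commands: list[str] = []
        self.closed = False

    def run(self, command: str) -> None:
        self.commands.append(command)

    def close(self) -> None:
        self.closed = True


handle = FakeHandle()
# closing() calls handle.close() on exit, even if a command raises.
with closing(handle) as vm:
    vm.run("git status")
```

With a real workspace, the same shape would be with closing(SmolVM.from_id("agent-workspace")) as vm: followed by the vm.run(...) calls for that task.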

Best practices

Use ephemeral VMs for untrusted code

from smolvm import SmolVM

# Fresh VM for each execution - automatically deleted after use
def execute_untrusted_code(code: str) -> str:
    with SmolVM() as vm:
        result = vm.run(code)
        return result.output

Always set timeouts

# Prevent hanging from infinite loops or stalled commands
result = vm.run("potentially-slow-command", timeout=30)
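In the tool functions in this guide, timeout is a tool parameter, so the model chooses its value. A small sketch of bounding that value on the host side before passing it to vm.run follows; clamp_timeout and its default bounds are illustrative assumptions, not part of SmolVM.

```python
def clamp_timeout(requested: int, minimum: int = 1, maximum: int = 120) -> int:
    """Bound a model-supplied timeout (in seconds) to a range the host controls."""
    return max(minimum, min(int(requested), maximum))
```

A tool would then call vm.run(command, timeout=clamp_timeout(timeout)), so a request for an hour-long or a zero-second timeout is pulled back into the allowed range.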

Inject secrets via environment variables

import os

from smolvm import VMConfig

# Pass secrets through env_vars, not command-line arguments
config = VMConfig(
    env_vars={"API_KEY": os.getenv("API_KEY")},
    # ... other config ...
)
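Note that os.getenv returns None when a variable is unset, so the snippet above can end up passing None as a value. A hedged sketch of one alternative: copy an explicit list of names from the host environment and fail early if any are missing. collect_env is a hypothetical helper, not a SmolVM function.

```python
import os


def collect_env(names: list[str], environ=None) -> dict[str, str]:
    """Copy only the named variables from the host environment, failing on missing ones."""
    source = os.environ if environ is None else environ
    missing = [name for name in names if name not in source]
    if missing:
        raise KeyError(f"missing required environment variables: {', '.join(missing)}")
    # Only the listed names are forwarded; nothing else from the host leaks into the guest.
    return {name: source[name] for name in names}
```

You could then pass env_vars=collect_env(["API_KEY"]) when building the config, so the guest receives exactly the secrets you listed.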

Set resource limits

# Limit resources to prevent abuse
config = VMConfig(
    vcpu_count=1,
    mem_size_mib=512,
    # ... other config ...
)

Error handling

from smolvm import SmolVM
from smolvm.exceptions import (
    SmolVMError,
    CommandExecutionUnavailableError,
    OperationTimeoutError,
)

def safe_execute(code: str) -> dict:
    """Execute code with comprehensive error handling."""
    try:
        with SmolVM() as vm:
            result = vm.run(code, timeout=30)
            return {
                "success": result.ok,
                "output": result.stdout,
                "error": result.stderr if not result.ok else None,
                "exit_code": result.exit_code,
            }
    except OperationTimeoutError:
        return {"success": False, "error": "Execution timed out"}
    except CommandExecutionUnavailableError as e:
        return {"success": False, "error": f"Cannot execute commands: {e.reason}"}
    except SmolVMError as e:
        return {"success": False, "error": f"VM error: {e}"}

Next steps

Basic usage

Learn fundamental SmolVM operations

Custom images

Build specialized images for your agents

Environment variables

Configure agent environments dynamically

Port forwarding

Expose agent services to your host
Last modified on April 5, 2026