Performance benchmarks

SmolVM is optimized for low-latency agent workflows: on Linux, a fresh QEMU + vsock sandbox reached readiness in about 413 ms in the latest published run, and follow-up commands returned in about a millisecond.

Latest QEMU published Ubuntu medians

These warm-cache medians come from the SmolVM benchmark timeline for the published Ubuntu image on June 23, 2026. “Total ready” measures the time until the sandbox control channel is ready. “First command” measures the first command after readiness. “Warm exec” measures repeated command latency.

Backend	Transport	Total ready	First command	Warm exec	Context
QEMU	SSH	1152.2 ms	12.0 ms	42.7 ms	Published Ubuntu, local warm-cache run
QEMU	vsock	413.1 ms	1.2 ms	1.0 ms	64.1% faster ready than QEMU SSH in this run

The table above reports the current QEMU microvm benchmark lane. Keep backend-to-backend comparisons separate so each runtime is measured in its representative setup.
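
The "faster ready" figure in the Context column can be reproduced from the two Total ready medians. The following is a minimal sketch of that arithmetic; the helper name `percent_faster` is illustrative and not part of SmolVM.

```python
def percent_faster(baseline_ms: float, candidate_ms: float) -> float:
    """Return the time saved by candidate, as a percentage of baseline."""
    return (baseline_ms - candidate_ms) / baseline_ms * 100.0


# Total ready medians copied from the table above.
qemu_ssh_ready_ms = 1152.2
qemu_vsock_ready_ms = 413.1

ready_gain = percent_faster(qemu_ssh_ready_ms, qemu_vsock_ready_ms)
print(f"QEMU vsock ready is {ready_gain:.1f}% faster than QEMU SSH")
```

The same helper applies to any pair of medians you collect on your own hardware, as long as both come from the same backend and the same run conditions.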

Running your own benchmarks

Run the same transport benchmark on your hardware:

uv run python scripts/benchmarks/ubuntu_transport.py \
  --variants qemu-ssh,qemu-vsock \
  --iterations 3 \
  --warm-exec-runs 5 \
  --rootfs-source published \
  --output /tmp/smolvm-ubuntu-transport.json \
  -v

Measure disk helper performance separately:

uv run python scripts/benchmarks/disk_io.py \
  --iterations 3 \
  --json \
  --output /tmp/smolvm-disk-io.json

Performance characteristics

Boot performance

Sandbox creation: SmolVM allocates names, IPs, disk metadata, and network rules in tens of milliseconds.
Time to ready: QEMU + vsock reaches readiness in 413.1 ms on the latest published Ubuntu run; QEMU + SSH reaches readiness in 1152.2 ms.
QEMU microvm default: On Linux x86_64 direct-kernel guests, QEMU uses the smaller microvm machine by default.
Hardware virtualization: SmolVM uses KVM on Linux and Hypervisor.framework on macOS for near-native performance.
Safe boot trims: The default MICROVM_DIRECT profile appends tsc=reliable no_timer_check quiet to the kernel command line. Set SMOLVM_VERBOSE_BOOT=1 to drop quiet when debugging a stuck boot.
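
To make the boot-trim behavior concrete, here is a small sketch of how a kernel command line could be assembled from the flags named above. It only illustrates the described behavior and is not SmolVM's implementation: `build_cmdline` and `BOOT_TRIMS` are hypothetical names, and the base arguments are a placeholder.

```python
import os
from typing import Mapping, Optional

# Flags quoted from the MICROVM_DIRECT description above.
BOOT_TRIMS = ["tsc=reliable", "no_timer_check", "quiet"]


def build_cmdline(base_args: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Append the boot trims, dropping 'quiet' when verbose boot is requested."""
    env = os.environ if env is None else env
    verbose = env.get("SMOLVM_VERBOSE_BOOT") == "1"
    trims = [flag for flag in BOOT_TRIMS if not (verbose and flag == "quiet")]
    return " ".join([base_args, *trims]).strip()


# Placeholder base arguments; a real profile would supply its own.
print(build_cmdline("console=ttyS0", env={}))
print(build_cmdline("console=ttyS0", env={"SMOLVM_VERBOSE_BOOT": "1"}))
```

With `quiet` removed, the guest kernel prints its boot log to the console, which is usually the fastest way to see where a stuck boot stops.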

Runtime performance

Command execution: Warm-command latency is about 1.0 ms on vsock and about 43 ms on SSH.
File transfer: New guest-agent builds use the newer streaming file-transfer protocol. Only compare transfer numbers after the published image advertises those capabilities.
Memory overhead: Minimal host overhead beyond configured VM memory (default 512MB).
CPU efficiency: Hardware virtualization provides near-native CPU performance.

Native helper performance

The smolvm-core wheel gives SmolVM Rust-backed helpers for the host operations that happen around each sandbox:

Networking: TAP setup, route changes, and sysctls can use direct Linux calls when SmolVM has the right permissions.
Disk I/O: zstd decompression uses a native path, while sparse copy keeps the host’s cp fast path when it is already best.
QEMU control: Pause, resume, and snapshot control use a native QMP client.
Firecracker control: Firecracker API socket requests use the native transport.

Latest disk-helper validation:

Operation	Size	Native path	Forced-off path	Result
Sparse copy	16 MiB	10.5 ms (`cp`)	10.2 ms (`cp`)	Unchanged; `cp` remains first
Sparse copy	128 MiB	64.6 ms (`cp`)	64.8 ms (`cp`)	Unchanged; `cp` remains first
zstd decompress	16 MiB	13.5 ms	40.6 ms	66.8% faster
zstd decompress	128 MiB	96.4 ms	376.1 ms	74.4% faster
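
The timings above can also be read as throughput, which makes the 16 MiB and 128 MiB rows easier to compare. This sketch assumes the Size column is the amount of data processed per operation; the function name is illustrative.

```python
def throughput_mib_per_s(size_mib: float, elapsed_ms: float) -> float:
    """Convert a size in MiB and a duration in milliseconds to MiB/s."""
    return size_mib / (elapsed_ms / 1000.0)


# (size in MiB, native ms, forced-off ms) for the zstd rows in the table above.
zstd_rows = [
    (16, 13.5, 40.6),
    (128, 96.4, 376.1),
]

for size_mib, native_ms, forced_off_ms in zstd_rows:
    native = throughput_mib_per_s(size_mib, native_ms)
    forced_off = throughput_mib_per_s(size_mib, forced_off_ms)
    print(f"{size_mib:>4} MiB: native {native:7.1f} MiB/s, forced-off {forced_off:6.1f} MiB/s")
```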

Check your installed helper capabilities:

python -m smolvm_core

Teardown performance

Graceful shutdown: SmolVM asks the backend to stop cleanly before removing local state.
Resource cleanup: Network rules, TAP devices, and disk images are cleaned up automatically.
Fast path for ephemeral VMs: Sandbox VMs that don’t need state preservation use SIGKILL-based teardown.

Optimization tips

1. Reuse VMs for multiple commands

Instead of creating a new VM for each command, reuse the same VM:

from smolvm import SmolVM

with SmolVM() as vm:
    # The context manager starts the VM automatically
    result1 = vm.run("apk add py3-requests")
    result2 = vm.run("python3 script.py")
    result3 = vm.run("cat output.txt")

This amortizes the fresh-sandbox ready time across many operations. On the latest QEMU + vsock Ubuntu run, readiness was 413.1 ms and follow-up commands returned in about a millisecond.
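
The amortization can be worked through with the published QEMU + vsock medians. The sketch below is a rough model only: it ignores teardown time and the runtime of the commands themselves, and the function names are illustrative.

```python
# QEMU + vsock medians from the published Ubuntu run.
READY_MS = 413.1
FIRST_COMMAND_MS = 1.2
WARM_EXEC_MS = 1.0


def total_ms_fresh_vm_per_command(n_commands: int) -> float:
    """Every command pays the full ready time plus a first-command round trip."""
    return n_commands * (READY_MS + FIRST_COMMAND_MS)


def total_ms_reused_vm(n_commands: int) -> float:
    """One ready phase, one first command, then warm executions."""
    if n_commands == 0:
        return 0.0
    return READY_MS + FIRST_COMMAND_MS + (n_commands - 1) * WARM_EXEC_MS


for n in (1, 10, 100):
    fresh = total_ms_fresh_vm_per_command(n)
    reused = total_ms_reused_vm(n)
    print(f"{n:>3} commands: fresh {fresh:9.1f} ms, reused {reused:7.1f} ms")
```

Under this model the two approaches cost the same for a single command, and the gap grows linearly with every additional command.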

2. Use appropriate resource allocation

Configure CPU and memory based on your workload:

from smolvm import SmolVM

# Lightweight workload
with SmolVM(memory=256) as vm:
    print(vm.run("echo small").stdout)

# Heavy workload
with SmolVM(memory=2048) as vm:
    print(vm.run("python3 --version").stdout)

Over-allocating resources can lead to host memory pressure and slower performance.
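
A quick budget check can help avoid host memory pressure before launching many sandboxes. The sketch below is an assumption-laden estimate: it treats the `memory` values as MiB, ignores per-VM host overhead, and uses a hypothetical 2048 MiB reserve for the host itself.

```python
def max_concurrent_vms(
    host_memory_mib: int,
    vm_memory_mib: int = 512,      # default VM memory mentioned in this document
    host_reserve_mib: int = 2048,  # hypothetical headroom kept for the host OS
) -> int:
    """Estimate how many VMs of a given size fit in host memory."""
    usable_mib = max(host_memory_mib - host_reserve_mib, 0)
    return usable_mib // vm_memory_mib


host_mib = 16 * 1024  # example: a 16 GiB host
for vm_mib in (256, 512, 2048):
    print(f"{vm_mib:>5} MiB per VM -> up to {max_concurrent_vms(host_mib, vm_mib)} VMs")
```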

3. Pre-built custom images

For workloads requiring specific dependencies, build a custom rootfs image with pre-installed packages using ImageBuilder:

from smolvm.build import ImageBuilder
from smolvm import SmolVM, VMConfig
from smolvm.utils import ensure_ssh_key

private_key, public_key = ensure_ssh_key()
builder = ImageBuilder()
kernel, rootfs = builder.build_alpine_ssh_key(
    public_key,
    rootfs_size_mb=2048,
)

config = VMConfig(
    kernel_path=kernel,
    rootfs_path=rootfs,
    boot_args="console=ttyS0 reboot=k panic=1 init=/init",
)

with SmolVM(config, ssh_key_path=str(private_key)) as vm:
    vm.run("apk add python3 py3-requests py3-numpy")

This eliminates the need to install packages at runtime.

4. Shared vs isolated disk mode

Choose the appropriate disk mode for your use case:

Isolated Mode (default): Each VM gets its own copy of the rootfs

✅ Complete isolation between VMs
✅ No cross-VM contamination
❌ Higher disk usage
❌ Copy overhead on first boot

Shared Mode: All VMs use the same rootfs image

✅ No disk copy overhead
✅ Lower disk usage
❌ Changes persist across VMs
❌ Potential cross-VM contamination

Use shared mode only for custom VMConfig flows where the root filesystem image is read-only for your workload. Keep the default isolated mode for agent sandboxes and anything that writes to the guest disk.

5. Backend selection

SmolVM supports multiple backends with different performance characteristics:

Firecracker (Linux): Low overhead and a narrow device model, recommended for Linux production.
QEMU (macOS/Linux): Broad compatibility, Windows guest support, and a faster microvm path on Linux x86_64.
libkrun: Experimental runtime testing. It does not support snapshots yet.

from smolvm import SmolVM

# Explicitly choose backend
vm = SmolVM(backend="firecracker")  # Linux only
vm = SmolVM(backend="qemu")         # macOS/Linux

Performance monitoring

Check VM status

Use the CLI to list all running VMs and their status:

smolvm sandbox list

Or check a specific VM from Python:

from smolvm import SmolVM

vm = SmolVM.from_id("my-vm")
info = vm.info
print(f"VM {vm.vm_id}: {info.status}")
print(f"  PID: {info.pid}")
if info.network:
    print(f"  IP: {info.network.guest_ip}")
vm.close()

Clean up stale VMs

Remove VMs that are still marked as running but whose processes have died:

smolvm sandbox delete --all --force

Scalability considerations

IP address pool

By default, SmolVM allocates IPs from 172.16.0.2 to 172.16.0.254, supporting 253 concurrent VMs.

SSH port pool

Host-side SSH forwarding uses ports 2200-2999, supporting 800 concurrent VMs.
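
Both pool sizes follow directly from the ranges above. The sketch below recomputes them and derives a combined ceiling under the assumption that every VM consumes one guest IP and one forwarded SSH port; VMs that do not use SSH forwarding may not be bound by the port pool.

```python
import ipaddress

# Default guest IP range quoted above (inclusive on both ends).
first_ip = ipaddress.IPv4Address("172.16.0.2")
last_ip = ipaddress.IPv4Address("172.16.0.254")
ip_pool_size = int(last_ip) - int(first_ip) + 1

# Host-side SSH forwarding ports 2200-2999 (inclusive).
ssh_pool_size = len(range(2200, 3000))

# Assumption: each VM needs one address from each pool.
max_ssh_forwarded_vms = min(ip_pool_size, ssh_pool_size)

print(f"IP pool: {ip_pool_size}, SSH port pool: {ssh_pool_size}")
print(f"Ceiling for SSH-forwarded VMs: {max_ssh_forwarded_vms}")
```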

System limits

Check your system’s ulimit for open files and processes:

# Check file descriptor limit
ulimit -n

# Check process limit
ulimit -u

# Increase limits (add to /etc/security/limits.conf)
* soft nofile 65536
* hard nofile 65536
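
The same limits can be read from inside the Python process that launches sandboxes, which is the process they actually apply to. This sketch uses the standard-library `resource` module, so it works on Linux and macOS only; `describe_limit` is an illustrative helper.

```python
import resource


def describe_limit(name: str, which: int) -> str:
    """Format the soft and hard values of one resource limit."""
    soft, hard = resource.getrlimit(which)

    def fmt(value: int) -> str:
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


print(describe_limit("open files", resource.RLIMIT_NOFILE))
print(describe_limit("processes", resource.RLIMIT_NPROC))
```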

Profiling tips

Measure individual phases

import time
from smolvm import SmolVM

# time.perf_counter() is monotonic and high-resolution, which suits short intervals.

# Create + start phase
start = time.perf_counter()
vm = SmolVM()
vm.start()
print(f"Create + Start: {time.perf_counter() - start:.3f}s")

# Command execution
start = time.perf_counter()
vm.run("echo hello")
print(f"Command: {time.perf_counter() - start:.3f}s")

# Teardown
start = time.perf_counter()
vm.delete()
vm.close()
print(f"Teardown: {time.perf_counter() - start:.3f}s")
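
When several phases are timed in one script, a small context manager keeps the bookkeeping out of the way. The sketch below uses only the standard library; `phase` and `timings` are illustrative names, and `time.sleep` stands in for a real SmolVM call.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def phase(name, sink=timings):
    """Record the wall-clock duration of the enclosed block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the block raises, so failed phases still show up.
        sink[name] = time.perf_counter() - start


with phase("example"):
    time.sleep(0.01)  # stand-in for vm.start(), vm.run(...), and so on

for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f}s")
```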

Network latency

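Measure the control-channel roundtrip time of a trivial command:

import time
from smolvm import SmolVM

with SmolVM() as vm:
    # perf_counter() is monotonic, so the interval is not skewed by clock adjustments
    start = time.perf_counter()
    result = vm.run("echo pong")
    elapsed = (time.perf_counter() - start) * 1000
    print(f"Control-channel roundtrip: {elapsed:.1f}ms")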
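
A single roundtrip is noisy, so it can help to repeat the measurement and report a median, as the published tables do. This is a standard-library sketch with illustrative helper names; the `lambda` stands in for a call such as `vm.run("echo pong")`.

```python
import statistics
import time


def sample_latency_ms(fn, runs=5, warmup=1):
    """Call fn repeatedly and return per-call latencies in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not recorded
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Reduce a list of latencies to median, minimum, and maximum."""
    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }


# Stand-in workload; inside a `with SmolVM() as vm:` block, pass a callable that runs a command.
stats = summarize(sample_latency_ms(lambda: sum(range(1000)), runs=5))
print(f"median {stats['median_ms']:.3f} ms (min {stats['min_ms']:.3f}, max {stats['max_ms']:.3f})")
```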
