SmolVM is optimized for low-latency agent workflows: a fresh sandbox is ready to run a command in about a second, and follow-up commands return in milliseconds.

Benchmark results

Wall-clock timings on a standard Linux host (Alpine guest, 1 vCPU / 512 MB, KVM), measuring from SmolVM(...) to a command returning:

Phase                                            Time
Create                                           ~11ms
Launch                                           ~53ms
First command (vsock, default on Linux + QEMU)   ~1.28s
First command (SSH path, e.g. Firecracker)       ~1.67s
Warm command (vsock)                             ~1.3ms
Warm command (SSH)                               ~43ms
Stop + delete                                    ~751ms
Create → first command (QEMU + vsock default)    ~1.34s

Benchmarks measured on a 16-vCPU KVM host with the Alpine Linux guest. Numbers reflect the default boot profile after the boot-latency fixes shipped in mid-2026: a ~6× speed-up on QEMU defaults and a ~32× speed-up on warm commands once vsock is the working channel. See Control channel for what vsock is and when it kicks in.
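
As a quick sanity check, the end-to-end figure can be reproduced by summing the individual phases. The sketch below only re-adds the numbers from the table; the SSH total assumes the create and launch costs are the same on that path, which the table does not state.

```python
# Phase timings copied from the benchmark table above, in milliseconds.
PHASES_MS = {
    "create": 11.0,
    "launch": 53.0,
    "first_command_vsock": 1280.0,
    "first_command_ssh": 1670.0,
}


def create_to_first_command_ms(channel: str = "vsock") -> float:
    """Sum create + launch + first command for the given control channel."""
    return (
        PHASES_MS["create"]
        + PHASES_MS["launch"]
        + PHASES_MS[f"first_command_{channel}"]
    )


print(f"vsock: {create_to_first_command_ms('vsock') / 1000:.2f}s")  # 1.34s
# Assumption: create/launch cost on the SSH path matches the table rows.
print(f"ssh:   {create_to_first_command_ms('ssh') / 1000:.2f}s")
```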

Running your own benchmarks

You can benchmark SmolVM on your hardware using the included benchmark script:
python scripts/benchmarks/bench_subprocess.py --vms 10 -v
This creates, starts, runs commands in, and tears down 10 VMs sequentially, reporting detailed timing metrics.

Performance characteristics

Boot performance

  • MicroVM creation: SmolVM allocates IP addresses, TAP devices, and network rules in tens of milliseconds.
  • Time to first command: ~1.3s on the default QEMU + vsock path, ~1.7s on the SSH path (Firecracker today).
  • Hardware virtualization: Uses KVM on Linux and Hypervisor.framework on macOS for near-native performance.
  • Safe boot trims: The default MICROVM_DIRECT profile appends tsc=reliable no_timer_check quiet to the kernel cmdline. Set SMOLVM_VERBOSE_BOOT=1 to drop quiet when debugging a stuck boot.
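
The boot-trim behaviour described above can be pictured as a small string transformation. This is a minimal sketch, not SmolVM's implementation: `build_cmdline` and `BOOT_TRIMS` are hypothetical names, and only the three kernel arguments and the `SMOLVM_VERBOSE_BOOT` variable come from the text.

```python
import os
from typing import Mapping, Optional

# Kernel arguments the text says the default profile appends.
BOOT_TRIMS = ("tsc=reliable", "no_timer_check", "quiet")


def build_cmdline(base: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Append the boot trims to `base`, dropping `quiet` in verbose mode.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    verbose = env.get("SMOLVM_VERBOSE_BOOT") == "1"
    extras = [arg for arg in BOOT_TRIMS if not (verbose and arg == "quiet")]
    return " ".join(base.split() + extras)


print(build_cmdline("console=ttyS0 reboot=k panic=1", env={}))
print(build_cmdline("console=ttyS0 reboot=k panic=1",
                    env={"SMOLVM_VERBOSE_BOOT": "1"}))
```

With the variable set, the second line omits `quiet`, so kernel messages reach the console while you debug.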

Runtime performance

  • Command execution: ~1.3ms warm-command latency on vsock; ~43ms on SSH.
  • Memory overhead: Minimal host overhead beyond configured VM memory (default 512MB).
  • CPU efficiency: Hardware virtualization provides near-native CPU performance.

Teardown performance

  • Graceful shutdown: Firecracker VMs can be stopped in ~751ms.
  • Resource cleanup: Network rules, TAP devices, and disk images are cleaned up automatically.
  • Fast path for ephemeral VMs: SIGKILL-based teardown for sandbox VMs that don’t need state preservation.
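
The difference between a graceful stop and the SIGKILL fast path can be illustrated with a plain subprocess. This is a generic POSIX-style sketch, not SmolVM's teardown code; `stop_process`, `grace_s`, and `ephemeral` are hypothetical names.

```python
import subprocess


def stop_process(proc: subprocess.Popen, grace_s: float = 5.0,
                 ephemeral: bool = False) -> int:
    """Stop a child process and return its exit code.

    ephemeral=True skips the graceful phase and kills immediately,
    which is only appropriate when no state needs to be preserved.
    """
    if ephemeral:
        proc.kill()  # SIGKILL on POSIX
        return proc.wait()

    proc.terminate()  # SIGTERM on POSIX: ask the process to exit
    try:
        return proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        # The process ignored the request; fall back to a hard kill.
        proc.kill()
        return proc.wait()
```

The fast path trades a clean shutdown for speed, so it fits sandboxes whose disk and memory state are discarded anyway.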

Optimization tips

1. Reuse VMs for multiple commands

Instead of creating a new VM for each command, reuse the same VM:
from smolvm import SmolVM

with SmolVM() as vm:
    # The context manager starts the VM automatically
    result1 = vm.run("apk add py3-requests")
    result2 = vm.run("python3 script.py")
    result3 = vm.run("cat output.txt")
This amortizes the ~1.3s first-command cost across many operations — and on vsock every follow-up command returns in around a millisecond.
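
To see how quickly reuse pays off, the benchmark numbers can be turned into an average cost per command. This is back-of-the-envelope arithmetic using the table values for the QEMU + vsock path; real workloads add the run time of the commands themselves.

```python
# Values from the benchmark table (QEMU + vsock path), in milliseconds.
SETUP_MS = 11.0 + 53.0   # create + launch
FIRST_CMD_MS = 1280.0
WARM_CMD_MS = 1.3
TEARDOWN_MS = 751.0      # stop + delete


def fresh_vm_per_command_ms() -> float:
    """Overhead per command when every command gets a new VM."""
    return SETUP_MS + FIRST_CMD_MS + TEARDOWN_MS


def reused_vm_per_command_ms(n_commands: int) -> float:
    """Average overhead per command when one VM runs n_commands."""
    if n_commands < 1:
        raise ValueError("n_commands must be at least 1")
    total = (SETUP_MS + FIRST_CMD_MS
             + (n_commands - 1) * WARM_CMD_MS
             + TEARDOWN_MS)
    return total / n_commands


for n in (1, 10, 100):
    print(f"{n:>3} commands: {reused_vm_per_command_ms(n):8.1f} ms each")
```

Under these numbers, a VM reused for 100 commands averages roughly 22 ms of overhead per command, against about 2.1 s when each command gets its own VM.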

2. Use appropriate resource allocation

Configure CPU and memory based on your workload:
from smolvm import SmolVM, VMConfig

# Lightweight workload
light_config = VMConfig(
    vcpu_count=1,
    memory=256,  # MB
)

# Heavy workload
heavy_config = VMConfig(
    vcpu_count=4,
    memory=2048,  # MB
)
Over-allocating resources can lead to host memory pressure and slower performance.
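
A rough way to avoid over-allocation is to budget host memory before choosing a per-VM size. The helper below is a hypothetical sketch: `reserved_mb` and `per_vm_overhead_mb` are assumed values you would tune for your host, not SmolVM settings.

```python
def max_vms_by_memory(host_mb: int, vm_mb: int = 512,
                      reserved_mb: int = 2048,
                      per_vm_overhead_mb: int = 0) -> int:
    """Estimate how many VMs fit in host memory.

    reserved_mb is memory kept back for the host OS and other processes
    (an assumption, not a measured figure).
    """
    if vm_mb + per_vm_overhead_mb <= 0:
        raise ValueError("per-VM memory must be positive")
    usable = host_mb - reserved_mb
    if usable <= 0:
        return 0
    return usable // (vm_mb + per_vm_overhead_mb)


# A 16 GB host with 2 GB reserved:
print(max_vms_by_memory(16384, vm_mb=512))    # 28
print(max_vms_by_memory(16384, vm_mb=2048))   # 7
```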

3. Pre-built custom images

For workloads requiring specific dependencies, build a custom rootfs image with pre-installed packages using ImageBuilder:
from smolvm.build import ImageBuilder
from smolvm import SmolVM, VMConfig
from smolvm.utils import ensure_ssh_key

private_key, public_key = ensure_ssh_key()
builder = ImageBuilder()
kernel, rootfs = builder.build_alpine_ssh_key(
    public_key,
    rootfs_size_mb=2048,
)

config = VMConfig(
    kernel_path=kernel,
    rootfs_path=rootfs,
    boot_args="console=ttyS0 reboot=k panic=1 init=/init",
)

with SmolVM(config, ssh_key_path=str(private_key)) as vm:
    vm.run("apk add python3 py3-requests py3-numpy")
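Packages already present in the image are available as soon as the VM boots, which removes the install step from each run. Note that the apk add call in the example still installs at runtime, so the saving applies only to what is baked into the built image.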

4. Shared vs isolated disk mode

Choose the appropriate disk mode for your use case:

Isolated Mode (default): Each VM gets its own copy of the rootfs
  • ✅ Complete isolation between VMs
  • ✅ No cross-VM contamination
  • ❌ Higher disk usage
  • ❌ Copy overhead on first boot
Shared Mode: All VMs use the same rootfs image
  • ✅ No disk copy overhead
  • ✅ Lower disk usage
  • ❌ Changes persist across VMs
  • ❌ Potential cross-VM contamination
from smolvm import VMConfig
config = VMConfig(disk_mode="shared")  # For read-only workloads
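
The disk-usage trade-off can be estimated with simple arithmetic. The sketch below assumes isolated mode makes a full copy of the rootfs per VM, which is an upper bound; if the copy is sparse or copy-on-write on your filesystem, real usage will be lower. `rootfs_disk_usage_mb` is a hypothetical helper.

```python
def rootfs_disk_usage_mb(n_vms: int, rootfs_mb: int,
                         disk_mode: str = "isolated") -> int:
    """Upper-bound rootfs disk usage for n_vms under a given disk mode."""
    if n_vms < 0 or rootfs_mb < 0:
        raise ValueError("n_vms and rootfs_mb must be non-negative")
    if disk_mode == "isolated":
        # Assumption: one full copy of the image per VM.
        return n_vms * rootfs_mb
    if disk_mode == "shared":
        # One image regardless of how many VMs use it.
        return rootfs_mb if n_vms else 0
    raise ValueError(f"unknown disk_mode: {disk_mode!r}")


print(rootfs_disk_usage_mb(10, 2048, "isolated"))  # 20480
print(rootfs_disk_usage_mb(10, 2048, "shared"))    # 2048
```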

5. Backend selection

SmolVM supports multiple backends with different performance characteristics:
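  • Firecracker (Linux): Fastest boot times, lowest overhead, recommended for production. It uses the SSH control path today, so time to first command is ~1.7s in the benchmarks above.
  • QEMU (macOS/Linux): Broader compatibility, slightly higher overhead. On Linux it defaults to vsock, with ~1.3s to first command.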
from smolvm import SmolVM

# Explicitly choose one backend
vm = SmolVM(backend="firecracker")  # Linux only
# vm = SmolVM(backend="qemu")       # macOS/Linux
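
If your code runs on both Linux and macOS, you may want to check which backends apply before passing one explicitly. The mapping below simply restates the platform notes above; `supported_backends` is a hypothetical helper, not part of SmolVM.

```python
import platform
from typing import Optional, Tuple

# Platform support as described in the list above.
BACKENDS_BY_SYSTEM = {
    "Linux": ("firecracker", "qemu"),
    "Darwin": ("qemu",),  # macOS
}


def supported_backends(system: Optional[str] = None) -> Tuple[str, ...]:
    """Return the backend names documented for the given OS."""
    system = platform.system() if system is None else system
    return BACKENDS_BY_SYSTEM.get(system, ())


print(supported_backends())          # depends on the current host
print(supported_backends("Darwin"))  # ('qemu',)
```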

Performance monitoring

Check VM status

Use the CLI to list all running VMs and their status:
smolvm list
Or check a specific VM from Python:
from smolvm import SmolVM

vm = SmolVM.from_id("my-vm")
info = vm.info
print(f"VM {vm.vm_id}: {info.status}")
print(f"  PID: {info.pid}")
if info.network:
    print(f"  IP: {info.network.guest_ip}")
vm.close()

Clean up stale VMs

Remove VMs marked as running but whose processes have died:
smolvm cleanup --all
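
To check by hand whether a recorded PID still refers to a live process, a common POSIX idiom is to send signal 0. This is a generic sketch of that idiom and is not necessarily how smolvm cleanup detects stale VMs; `pid_alive` is a hypothetical helper.

```python
import os


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only).

    Signal 0 performs the existence and permission checks without
    delivering a signal. A zombie process still counts as existing.
    """
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False   # no such process
    except PermissionError:
        return True    # exists, but owned by another user
    return True


print(pid_alive(os.getpid()))  # True
```

You could feed it the PID printed by the status example above. Keep in mind that PIDs can be reused, so a live PID does not prove the process is still the VM.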

Scalability considerations

IP address pool

By default, SmolVM allocates IPs from 172.16.0.2 to 172.16.0.254, supporting 253 concurrent VMs.

SSH port pool

Host-side SSH forwarding uses ports 2200-2999, supporting 800 concurrent VMs.
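
Both pool sizes follow directly from the ranges quoted above. The sketch below recomputes them with the standard library; treating the smaller pool as the effective cap when SSH forwarding is in use is an inference from those two numbers, not a documented limit.

```python
import ipaddress


def ip_pool_size(first: str = "172.16.0.2", last: str = "172.16.0.254") -> int:
    """Number of addresses in the inclusive range first..last."""
    return int(ipaddress.IPv4Address(last)) - int(ipaddress.IPv4Address(first)) + 1


def port_pool_size(first: int = 2200, last: int = 2999) -> int:
    """Number of ports in the inclusive range first..last."""
    return last - first + 1


def max_concurrent_vms(uses_ssh_forwarding: bool = True) -> int:
    """Smallest pool that every VM must draw from (an inference)."""
    if uses_ssh_forwarding:
        return min(ip_pool_size(), port_pool_size())
    return ip_pool_size()


print(ip_pool_size())        # 253
print(port_pool_size())      # 800
print(max_concurrent_vms())  # 253
```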

System limits

Check your system’s ulimit for open files and processes:
# Check file descriptor limit
ulimit -n

# Check process limit
ulimit -u

# Increase limits (add to /etc/security/limits.conf)
* soft nofile 65536
* hard nofile 65536
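
The same limits can be read from Python, which is handy for a preflight check before launching many VMs. This sketch uses the standard `resource` module (Unix only); `describe_limit` is a hypothetical helper.

```python
import resource


def describe_limit(name: str, which: int) -> str:
    """Format the soft and hard values of one resource limit."""
    soft, hard = resource.getrlimit(which)

    def fmt(value: int) -> str:
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


# Equivalent to `ulimit -n` and `ulimit -u` for the current process.
print(describe_limit("open files", resource.RLIMIT_NOFILE))
print(describe_limit("processes", resource.RLIMIT_NPROC))
```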

Profiling tips

Measure individual phases

import time
from smolvm import SmolVM

# perf_counter() is monotonic, so it is safer than time.time() for intervals

# Create + start phase
start = time.perf_counter()
vm = SmolVM()
vm.start()
print(f"Create + Start: {time.perf_counter() - start:.3f}s")

# Command execution (the first command includes the one-time startup cost)
start = time.perf_counter()
vm.run("echo hello")
print(f"Command: {time.perf_counter() - start:.3f}s")

# Teardown
start = time.perf_counter()
vm.delete()
vm.close()
print(f"Teardown: {time.perf_counter() - start:.3f}s")
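
If you time many phases, a small context manager keeps the bookkeeping out of the way. This is a standard-library sketch; `timed` is a hypothetical helper, not a SmolVM API.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def timed(label: str, results: Dict[str, float]) -> Iterator[None]:
    """Record the wall-clock duration of the enclosed block, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the block raises, so failed phases are visible.
        results[label] = time.perf_counter() - start


results: Dict[str, float] = {}
with timed("sleep", results):
    time.sleep(0.01)  # stand-in for a phase such as vm.start()

for label, seconds in results.items():
    print(f"{label}: {seconds:.3f}s")
```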

Network latency

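Measure command roundtrip time. Run one warm-up command first, because the first command in a fresh VM is much slower than warm ones (see the benchmark table):
import time
from smolvm import SmolVM

with SmolVM() as vm:
    vm.run("true")  # warm-up, excluded from the measurement
    start = time.perf_counter()
    result = vm.run("echo pong")
    elapsed = (time.perf_counter() - start) * 1000
    # The channel may be vsock or SSH depending on the backend
    print(f"Command roundtrip: {elapsed:.1f}ms")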
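
A single measurement is noisy, so it usually helps to take several samples and report the median. The helper below is a standard-library sketch that times any callable; `sample_latency_ms` is a hypothetical name.

```python
import statistics
import time
from typing import Callable, Dict


def sample_latency_ms(fn: Callable[[], object], samples: int = 20,
                      warmup: int = 1) -> Dict[str, float]:
    """Call fn repeatedly and summarize its latency in milliseconds."""
    if samples < 1:
        raise ValueError("samples must be at least 1")
    for _ in range(warmup):
        fn()  # discard warm-up calls
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000)
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


# Stand-in workload; inside a VM session you might pass
# lambda: vm.run("echo pong") instead.
print(sample_latency_ms(lambda: sum(range(1000)), samples=5))
```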
Last modified on June 2, 2026