Sandbox Agent
Why SandboxAgent?
Most agents are stateless: each run() starts fresh. But many real workloads aren’t:
- Code agents that iteratively edit a file, run tests, fix the file, run tests again.
- Research agents that take notes to disk across many turns.
- Migration agents that clone a repo, transform files, commit, push.
- Multi-day investigations that need to resume right where they left off.
SandboxAgent provides a persistent workspace — an isolated FS + shell + git checkout — that survives across runs and can be snapshotted + restored.
Architecture
Backends
| Backend | Implementation | Best for | Deps |
|---|---|---|---|
| "unix-local" (default) | child_process.spawn in a system tempdir | Local dev, trusted CI | none |
| "docker" | Bind-mount tempdir into a container | Untrusted code, host isolation | npm i dockerode |
| "remote" | Delegates to a CloudSandbox (E2B / Daytona) | Production, multi-user, regulated | npm i @e2b/sdk or @daytonaio/sdk |
Quick start
API
Constructor
Methods
start(): Promise<void>
Creates the workspace (tempdir or remote session), writes seeded files, runs gitClones. Idempotent — calling twice is a no-op.
For backend: "remote", this also calls remote.start() and writes the seeded files into the remote sandbox via remote.writeFile().
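The idempotent start() described above can be sketched as follows, assuming a local tempdir backend. The class name, the `files` option, and the `root` field are hypothetical stand-ins rather than the library's implementation, and git clones and the remote path are left out.

```typescript
import { mkdtemp, mkdir, writeFile } from "node:fs/promises";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for illustration; not the real SandboxAgent.
class LocalWorkspaceSketch {
  root: string | null = null;
  private files: Record<string, string>;
  private starting: Promise<void> | null = null;

  // `files` (relative path -> contents) is an assumed option name.
  constructor(files: Record<string, string> = {}) {
    this.files = files;
  }

  // Memoize the setup promise so repeated or concurrent calls share one setup.
  start(): Promise<void> {
    if (this.starting === null) this.starting = this.setup();
    return this.starting;
  }

  private async setup(): Promise<void> {
    const root = await mkdtemp(path.join(os.tmpdir(), "sandbox-"));
    for (const [rel, contents] of Object.entries(this.files)) {
      const target = path.join(root, rel);
      // Create parent directories before writing each seeded file.
      await mkdir(path.dirname(target), { recursive: true });
      await writeFile(target, contents, "utf8");
    }
    this.root = root;
  }
}
```

Caching the promise rather than a boolean flag means two callers racing on start() still produce a single tempdir.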
run(code, options?): Promise<SandboxRunResult>
Execute code in the workspace. The language option picks the interpreter:
- "node" (default): node -e "${code}"
- "python": python3 -c "${code}"
- "shell": passes code directly to /bin/sh -c
For backend: "remote", delegates to remote.run(code, options).
If execution exceeds timeoutSeconds, the child is killed with SIGKILL and the result has timedOut: true, exitCode: 124.
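As a rough illustration of the interpreter mapping and timeout behavior, here is a synchronous sketch built on Node's child_process.spawnSync. The function name `runLocal` and the result type are assumptions made for this example, and the real method is asynchronous; only the language values, SIGKILL, timedOut, and exit code 124 come from the description above.

```typescript
import { spawnSync } from "node:child_process";

type Language = "node" | "python" | "shell";

// Assumed result shape, based on the fields this page mentions.
interface RunResultSketch {
  output: string;
  exitCode: number;
  timedOut: boolean;
}

// Interpreter binary plus its "run this string" flag for each language.
// process.execPath is the current Node binary; the docs show plain `node`.
const INTERPRETERS: Record<Language, [string, string]> = {
  node: [process.execPath, "-e"],
  python: ["python3", "-c"],
  shell: ["/bin/sh", "-c"],
};

function runLocal(
  code: string,
  opts: { language?: Language; timeoutSeconds?: number; cwd?: string } = {},
): RunResultSketch {
  const [cmd, flag] = INTERPRETERS[opts.language ?? "node"];
  const res = spawnSync(cmd, [flag, code], {
    cwd: opts.cwd,
    encoding: "utf8",
    timeout: opts.timeoutSeconds === undefined ? undefined : opts.timeoutSeconds * 1000,
    killSignal: "SIGKILL", // the child is killed with SIGKILL when the timeout fires
  });
  // spawnSync reports a timeout through an error whose code is ETIMEDOUT.
  const timedOut = (res.error as NodeJS.ErrnoException | undefined)?.code === "ETIMEDOUT";
  const output = (res.stdout ?? "") + (res.stderr ?? "");
  return { output, exitCode: timedOut ? 124 : (res.status ?? 1), timedOut };
}
```

Passing the code as an argv element avoids shell quoting problems that a literal `node -e "${code}"` string would have.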
shell(command, options?): Promise<SandboxRunResult>
Same as run(command, { language: "shell" }), but more explicit.
writeFile(path, contents, encoding?): Promise<void>
Writes a file in the workspace. Creates parent directories automatically. For backend: "remote", delegates to remote.writeFile().
readFile(path, encoding?): Promise<string | null>
Reads a file. Returns null if the file doesn’t exist. For binary files, pass encoding: "base64".
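The file semantics above (parent directories created on write, null on a missing file, base64 for binary data) can be sketched with Node's fs module. The helper names and the explicit `root` parameter are hypothetical, and the calls are synchronous for brevity, whereas the documented methods return promises.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: write a file under the workspace root, creating parents.
function writeWorkspaceFile(
  root: string,
  relPath: string,
  contents: string,
  encoding: BufferEncoding = "utf8",
): void {
  const target = path.join(root, relPath);
  fs.mkdirSync(path.dirname(target), { recursive: true });
  fs.writeFileSync(target, contents, { encoding });
}

// Hypothetical helper: read a file, mapping "does not exist" to null.
function readWorkspaceFile(
  root: string,
  relPath: string,
  encoding: BufferEncoding = "utf8",
): string | null {
  try {
    return fs.readFileSync(path.join(root, relPath), { encoding });
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "ENOENT") return null;
    throw err; // other errors (permissions, is-a-directory) still surface
  }
}

// Usage: a scratch directory stands in for the workspace root.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "sandbox-files-"));
writeWorkspaceFile(demoRoot, "notes/day1.txt", "hello");
const asBase64 = readWorkspaceFile(demoRoot, "notes/day1.txt", "base64");
```

Only ENOENT is translated to null here, so other failures are not silently hidden.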
snapshot(): Promise<WorkspaceSnapshot>
Captures the full workspace state — every file (base64-encoded), the env vars — and returns it as a plain object you can serialize and store.
For backend: "remote", snapshotting is provider-specific and currently returns an empty file list (use the cloud provider’s native snapshot API instead).
resume(snapshot): Promise<void>
Restores a workspace from a snapshot. Effectively a constructor + start() that materializes the files from the snapshot.
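To make the snapshot and resume round trip concrete, here is a sketch for a local directory. The `SnapshotSketch` field names (`files`, `env`) are assumptions, since the real WorkspaceSnapshot shape is not shown on this page, and both helper functions are hypothetical.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed shape: relative path -> base64 contents, plus env vars.
interface SnapshotSketch {
  files: Record<string, string>;
  env: Record<string, string>;
}

// Walk the workspace and base64-encode every regular file.
function takeSnapshot(root: string, env: Record<string, string>): SnapshotSketch {
  const files: Record<string, string> = {};
  const walk = (dir: string): void => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const abs = path.join(dir, entry.name);
      if (entry.isDirectory()) walk(abs);
      else if (entry.isFile()) {
        files[path.relative(root, abs)] = fs.readFileSync(abs).toString("base64");
      }
    }
  };
  walk(root);
  return { files, env: { ...env } };
}

// Materialize a snapshot into a fresh tempdir and return its path.
function restoreSnapshot(snapshot: SnapshotSketch): string {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), "sandbox-resume-"));
  for (const [rel, b64] of Object.entries(snapshot.files)) {
    const abs = path.join(root, rel);
    fs.mkdirSync(path.dirname(abs), { recursive: true });
    fs.writeFileSync(abs, Buffer.from(b64, "base64"));
  }
  return root;
}
```

Because the snapshot is a plain object of strings, it survives JSON.stringify and JSON.parse, which is what allows it to be stored between processes.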
close(): Promise<void>
Removes the local tempdir (unix-local / docker) or calls remote.close(). Always call this in a finally block.
ready: boolean
true after start() succeeds; false after close(). Read-only.
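The advice to call close() in a finally block can be wrapped in a small helper. In this sketch, `WorkspaceLike` is a local interface covering only the documented start(), close(), and ready members, `withWorkspace` is a hypothetical helper, and `FakeWorkspace` is an in-memory stand-in so the example runs without the real package.

```typescript
// Minimal slice of the documented surface; not the library's type definitions.
interface WorkspaceLike {
  readonly ready: boolean;
  start(): Promise<void>;
  close(): Promise<void>;
}

// Hypothetical helper: always closes the workspace, even if `fn` throws.
async function withWorkspace<W extends WorkspaceLike, T>(
  ws: W,
  fn: (ws: W) => Promise<T>,
): Promise<T> {
  await ws.start();
  try {
    return await fn(ws);
  } finally {
    await ws.close();
  }
}

// In-memory stand-in that mirrors the documented ready/no-op semantics.
class FakeWorkspace implements WorkspaceLike {
  private isReady = false;
  closeCount = 0;

  get ready(): boolean {
    return this.isReady;
  }

  async start(): Promise<void> {
    this.isReady = true; // a second call changes nothing
  }

  async close(): Promise<void> {
    if (!this.isReady) return; // second close is a no-op
    this.closeCount += 1;
    this.isReady = false;
  }
}
```

With the real class, the same helper would apply unchanged as long as it exposes these three members.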
Compose with CloudSandbox
The killer combo is SandboxAgent + CloudSandbox: a persistent workspace in a hardened cloud VM.
Compose with Agent
SandboxAgent is not itself an LLM-driven agent — it’s a workspace. Plug it into a regular Agent by exposing its methods as tools:
(A createSandboxTools(sandbox) helper may land in a future release; for now, wire them yourself.)
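One way to wire the methods up by hand is sketched below. The `ToolSketch` shape (name, description, execute) and the tool names are hypothetical, because the tool format Agent expects is not shown on this page; `SandboxLike` declares only the documented methods the tools call.

```typescript
// Slice of the documented SandboxAgent surface used by the tools below.
interface SandboxLike {
  shell(command: string): Promise<{ output: string; exitCode: number; timedOut: boolean }>;
  writeFile(path: string, contents: string): Promise<void>;
  readFile(path: string): Promise<string | null>;
}

// Hypothetical tool shape; adapt it to whatever your Agent expects.
interface ToolSketch {
  name: string;
  description: string;
  execute(args: Record<string, string>): Promise<string>;
}

function buildSandboxToolsSketch(sandbox: SandboxLike): ToolSketch[] {
  return [
    {
      name: "sandbox_shell",
      description: "Run a shell command in the persistent workspace.",
      async execute(args) {
        const r = await sandbox.shell(args["command"] ?? "");
        // Surface timeouts explicitly so the model can react to them.
        return r.timedOut ? `timed out\n${r.output}` : `exit ${r.exitCode}\n${r.output}`;
      },
    },
    {
      name: "sandbox_write_file",
      description: "Write a text file in the workspace.",
      async execute(args) {
        await sandbox.writeFile(args["path"] ?? "", args["contents"] ?? "");
        return "ok";
      },
    },
    {
      name: "sandbox_read_file",
      description: "Read a text file from the workspace.",
      async execute(args) {
        // readFile returns null for a missing file; turn that into text.
        return (await sandbox.readFile(args["path"] ?? "")) ?? "(file not found)";
      },
    },
  ];
}
```

Returning plain strings keeps the tool results easy to feed back to a model. A real integration would likely add argument validation.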
Persistence across processes
A common pattern: a long-running investigation where each user turn is a separate process.
Failure modes
| Situation | Behavior |
|---|---|
| start() before workspace deps available | Throws — e.g. "dockerode is required" if backend: "docker" and SDK missing |
| run() / shell() exceeds timeoutSeconds | Returns { timedOut: true, exitCode: 124, output } |
| readFile on missing path | Returns null |
| writeFile outside workspace tempdir | Path is joined with the workspace root via path.join — escapes are blocked by the filesystem itself (you’re inside a tempdir owned by your process) |
| close() called twice | No-op the second time |
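On the path-escape row: path.join normalizes ".." segments, so path.join(root, "../x") resolves to a location outside root, and whether that write fails then depends on filesystem permissions. Callers who want an explicit check before handing a path to writeFile could use a guard like the sketch below. `resolveInside` is a hypothetical helper, not part of the documented API.

```typescript
import * as path from "node:path";

// Hypothetical guard: resolve `userPath` under `root` or throw if it escapes.
function resolveInside(root: string, userPath: string): string {
  const absRoot = path.resolve(root);
  const target = path.resolve(absRoot, userPath);
  const rel = path.relative(absRoot, target);
  // A relative path that is ".." or starts with "../" (or is absolute, e.g. a
  // different drive on Windows) points outside the root.
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`path escapes workspace: ${userPath}`);
  }
  return target;
}
```

The check compares against ".." followed by the separator instead of a bare ".." prefix, so a legitimate file name such as "..notes.txt" is still accepted.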
See also
- Cloud Sandbox Toolkits — for stateless code execution
- Computer Use Agent — for GUI control instead of code
- @agentium/queue — for long-running background work