Sandbox Agent

Why SandboxAgent?

Most agents are stateless: each run() starts fresh. But many real workloads aren’t:
  • Code agents that iteratively edit a file, run tests, fix the file, run tests again.
  • Research agents that take notes to disk across many turns.
  • Migration agents that clone a repo, transform files, commit, push.
  • Multi-day investigations that need to resume right where they left off.
SandboxAgent provides a persistent workspace (an isolated filesystem, shell, and git checkout) that survives across runs and can be snapshotted and restored.

Architecture

                    ┌─────────────────────────────────────┐
                    │           SandboxAgent              │
                    │                                     │
                    │  ┌────────────────────────────┐     │
                    │  │  WorkspaceManifest         │     │
                    │  │   • files (seeded)         │     │
                    │  │   • gitClones              │     │
                    │  │   • env                    │     │
                    │  └────────────────────────────┘     │
                    │                                     │
                    │  ┌────────────────────────────┐     │
                    │  │  Backend                   │     │
                    │  │   "unix-local"  - tempdir  │     │
                    │  │   "docker"      - container│     │
                    │  │   "remote"      - CloudSbx │     │
                    │  └────────────────────────────┘     │
                    └──────────────────┬──────────────────┘
                                       │
                       snapshot()  ────┴────  resume(snapshot)
                       (capture entire workspace)

Backends

| Backend | Implementation | Best for | Deps |
| --- | --- | --- | --- |
| "unix-local" (default) | child_process.spawn in a system tempdir | Local dev, trusted CI | none |
| "docker" | Bind-mount tempdir into a container | Untrusted code, host isolation | npm i dockerode |
| "remote" | Delegates to a CloudSandbox (E2B / Daytona) | Production, multi-user, regulated | npm i @e2b/sdk or @daytonaio/sdk |

Quick start

import { SandboxAgent } from "@agentium/core";

const agent = new SandboxAgent({
  backend: "unix-local",
  workspace: {
    env: { OPENAI_API_KEY: process.env.OPENAI_API_KEY! },
    files: [
      { path: "data.csv",    contents: "name,score\nalice,9\nbob,7\n" },
      { path: "src/main.ts", contents: 'console.log("hi");' },
    ],
    gitClones: [
      { repo: "https://github.com/agentiumOS/example-skill.git", path: "vendor/skill", ref: "v1.0.0" },
    ],
  },
});

await agent.start();

const r = await agent.run("console.log(require('fs').readdirSync('.'))", { language: "node" });
console.log(r.output);

await agent.close();

API

Constructor

interface SandboxAgentConfig {
  backend: "unix-local" | "docker" | "remote";
  remote?: CloudSandbox;          // required when backend === "remote"
  workspace?: WorkspaceManifest;
  dockerImage?: string;           // default "node:20-alpine"; only used for backend "docker"
}

interface WorkspaceManifest {
  files?: WorkspaceFile[];        // seeded into the workspace at start()
  gitClones?: { repo: string; path: string; ref?: string }[];
  env?: Record<string, string>;   // exposed to every run() / shell() call
}

interface WorkspaceFile {
  path: string;                   // relative to workspace root
  contents: string;               // utf-8 or base64 (per encoding field)
  encoding?: "utf8" | "base64";   // default "utf8"
}
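
To illustrate the encoding field, binary content can be converted to a base64 WorkspaceFile before it is seeded. This is a minimal sketch: the WorkspaceFile interface is redeclared from the definition above so the snippet stands alone, and binaryFile is a hypothetical helper, not part of @agentium/core.

```typescript
// WorkspaceFile as documented above, redeclared so the sketch is self-contained.
interface WorkspaceFile {
  path: string;
  contents: string;
  encoding?: "utf8" | "base64";
}

// Hypothetical helper: wrap raw bytes as a base64-encoded workspace file.
function binaryFile(path: string, bytes: Uint8Array): WorkspaceFile {
  return {
    path,
    contents: Buffer.from(bytes).toString("base64"),
    encoding: "base64",
  };
}

// The first four bytes of the PNG signature, used as stand-in binary data.
const seed = binaryFile("assets/logo.png", new Uint8Array([0x89, 0x50, 0x4e, 0x47]));
console.log(seed.contents); // "iVBORw=="
```

The resulting object can be placed in workspace.files alongside plain utf8 entries.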

Methods

start(): Promise<void>

Creates the workspace (tempdir or remote session), writes seeded files, runs gitClones. Idempotent — calling twice is a no-op. For backend: "remote", this also calls remote.start() and writes the seeded files into the remote sandbox via remote.writeFile().

run(code, options?): Promise<SandboxRunResult>

Execute code in the workspace. The language option picks the interpreter:
  • "node" (default): node -e "${code}"
  • "python": python3 -c "${code}"
  • "shell": passes code directly to /bin/sh -c
For backend: "remote", delegates to remote.run(code, options).
const r = await agent.run("import math; print(math.pi)", {
  language: "python",
  timeoutSeconds: 30,
  env: { LOG_LEVEL: "debug" },
});
console.log(r.output);   // "3.141592653589793\n"
console.log(r.exitCode); // 0
console.log(r.timedOut); // false
If the command exceeds timeoutSeconds, the child is killed with SIGKILL and the result has timedOut: true, exitCode: 124.
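
The interpreter mapping and timeout behavior described above can be sketched with Node's child_process module. This illustrates the documented semantics and is not the library's implementation: interpreterArgv and runLocal are hypothetical names, spawnSync is used only for brevity (it blocks the event loop), and combining stdout and stderr into output is an assumption.

```typescript
import { spawnSync } from "node:child_process";

type Language = "node" | "python" | "shell";

interface RunResult {
  output: string;
  exitCode: number;
  timedOut: boolean;
}

// Interpreter invocation for each language, as listed above.
function interpreterArgv(language: Language, code: string): [string, string[]] {
  switch (language) {
    case "node":
      return ["node", ["-e", code]];
    case "python":
      return ["python3", ["-c", code]];
    case "shell":
      return ["/bin/sh", ["-c", code]];
  }
}

// Hypothetical local runner: SIGKILL on timeout, then report exitCode 124.
function runLocal(code: string, language: Language = "node", timeoutSeconds = 30): RunResult {
  const [cmd, args] = interpreterArgv(language, code);
  const r = spawnSync(cmd, args, {
    encoding: "utf8",
    timeout: timeoutSeconds * 1000,
    killSignal: "SIGKILL",
  });
  // spawnSync reports an expired timeout as an error with code ETIMEDOUT.
  const timedOut = (r.error as NodeJS.ErrnoException | undefined)?.code === "ETIMEDOUT";
  return {
    output: `${r.stdout ?? ""}${r.stderr ?? ""}`, // assumption: both streams merged
    exitCode: timedOut ? 124 : r.status ?? 1,
    timedOut,
  };
}

const ok = runLocal("echo hello", "shell");
console.log(ok.exitCode, ok.timedOut);
```

Callers would typically branch on timedOut first, since an exitCode of 124 alone could also come from the command itself.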

shell(command, options?): Promise<SandboxRunResult>

Same as run(command, { language: "shell" }) but more explicit:
await agent.shell("git status && ls -la");

writeFile(path, contents, encoding?): Promise<void>

Writes a file in the workspace. Creates parent directories automatically. For backend: "remote", delegates to remote.writeFile().

readFile(path, encoding?): Promise<string | null>

Reads a file. Returns null if the file doesn’t exist. For binary files, pass encoding: "base64".
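
Because readFile returns null instead of throwing, callers need an explicit missing-file branch. The sketch below reads a binary file as bytes. FileReader is a structural stand-in for the signature above (the exact type of the encoding parameter is an assumption), readBytes is a hypothetical helper, and the in-memory fake exists only to exercise it.

```typescript
// Structural stand-in for the readFile signature documented above.
interface FileReader {
  readFile(path: string, encoding?: "utf8" | "base64"): Promise<string | null>;
}

// Hypothetical helper: return the file's bytes, or null when it does not exist.
async function readBytes(ws: FileReader, path: string): Promise<Uint8Array | null> {
  const b64 = await ws.readFile(path, "base64");
  return b64 === null ? null : Buffer.from(b64, "base64");
}

// In-memory fake holding a single file, used only for the demo.
const fake: FileReader = {
  async readFile(path, encoding = "utf8") {
    if (path !== "notes.txt") return null;
    return Buffer.from("hello").toString(encoding);
  },
};

readBytes(fake, "missing.bin").then((bytes) => console.log(bytes)); // null
```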

snapshot(): Promise<WorkspaceSnapshot>

Captures the full workspace state (every file, base64-encoded, plus the env vars) and returns it as a plain object you can serialize and store.
import * as fs from "node:fs/promises";

const snap = await agent.snapshot();
await fs.writeFile("snapshot.json", JSON.stringify(snap));

interface WorkspaceSnapshot {
  takenAt: number;
  files: WorkspaceFile[];
  env: Record<string, string>;
}
For backend: "remote", snapshotting is provider-specific and currently returns an empty file list (use the cloud provider’s native snapshot API instead).
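
For local backends, capturing every file as base64 amounts to a recursive directory walk. The following is a sketch of that idea under stated assumptions, not the library's actual code: snapshotDir is a hypothetical function, the interfaces are redeclared from this page, and symlinks and other special files are simply skipped.

```typescript
import { mkdtempSync, mkdirSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, relative } from "node:path";

interface WorkspaceFile {
  path: string;
  contents: string;
  encoding?: "utf8" | "base64";
}

interface WorkspaceSnapshot {
  takenAt: number;
  files: WorkspaceFile[];
  env: Record<string, string>;
}

// Hypothetical: recursively capture every regular file under root as base64.
function snapshotDir(root: string, env: Record<string, string>): WorkspaceSnapshot {
  const files: WorkspaceFile[] = [];
  const walk = (dir: string): void => {
    for (const entry of readdirSync(dir, { withFileTypes: true })) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.isFile()) {
        files.push({
          path: relative(root, full), // stored relative to the workspace root
          contents: readFileSync(full).toString("base64"),
          encoding: "base64",
        });
      }
    }
  };
  walk(root);
  return { takenAt: Date.now(), files, env: { ...env } };
}

// Demo against a throwaway directory.
const root = mkdtempSync(join(tmpdir(), "snap-demo-"));
mkdirSync(join(root, "src"));
writeFileSync(join(root, "src", "main.ts"), 'console.log("hi");');
const demoSnap = snapshotDir(root, { LOG_LEVEL: "debug" });
console.log(demoSnap.files.map((f) => f.path));
```

Since the env vars travel with the snapshot, a stored snapshot should be treated with the same care as the secrets it may contain.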

resume(snapshot): Promise<void>

Restores a workspace from a snapshot. Call it on a newly constructed agent in place of start(); it creates the workspace and materializes the files from the snapshot.
const next = new SandboxAgent({ backend: "unix-local" });
await next.resume(snap);
// Files are restored. Previous tempdir is unrelated.
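
Materializing a snapshot into a fresh directory can be pictured as the inverse of the capture step. This is a rough sketch under assumptions: restoreInto is a hypothetical function, the interfaces are redeclared from this page, and how the real resume() treats env is not shown here.

```typescript
import { mkdtempSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

interface WorkspaceFile {
  path: string;
  contents: string;
  encoding?: "utf8" | "base64";
}

interface WorkspaceSnapshot {
  takenAt: number;
  files: WorkspaceFile[];
  env: Record<string, string>;
}

// Hypothetical: write every snapshot file under root, creating parent directories.
function restoreInto(root: string, snap: WorkspaceSnapshot): void {
  for (const file of snap.files) {
    const target = join(root, file.path);
    mkdirSync(dirname(target), { recursive: true });
    // Decode according to the file's declared encoding (utf8 by default).
    writeFileSync(target, Buffer.from(file.contents, file.encoding ?? "utf8"));
  }
}

const saved: WorkspaceSnapshot = {
  takenAt: Date.now(),
  files: [
    {
      path: "notes/todo.txt",
      contents: Buffer.from("fix tests").toString("base64"),
      encoding: "base64",
    },
  ],
  env: {},
};

const freshRoot = mkdtempSync(join(tmpdir(), "resume-demo-"));
restoreInto(freshRoot, saved);
console.log(readFileSync(join(freshRoot, "notes", "todo.txt"), "utf8")); // "fix tests"
```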

close(): Promise<void>

Removes the local tempdir (unix-local / docker) or calls remote.close(). Always call this in a finally block.
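
One way to follow the finally advice consistently is a small wrapper that owns the lifecycle. The sketch below is an assumption-laden illustration: withSandbox is a hypothetical helper, Lifecycle is a structural stand-in for the start() and close() methods documented here, and the fake only records call order.

```typescript
// Structural stand-in for the lifecycle methods documented on this page.
interface Lifecycle {
  start(): Promise<void>;
  close(): Promise<void>;
}

// Hypothetical helper: start, run the callback, and always close afterwards.
async function withSandbox<S extends Lifecycle, T>(
  sandbox: S,
  fn: (sandbox: S) => Promise<T>,
): Promise<T> {
  await sandbox.start();
  try {
    return await fn(sandbox);
  } finally {
    // Runs whether fn resolved or threw.
    await sandbox.close();
  }
}

// Demo with a fake that records the order of calls.
async function demo(): Promise<string[]> {
  const calls: string[] = [];
  const fake: Lifecycle = {
    async start() { calls.push("start"); },
    async close() { calls.push("close"); },
  };
  await withSandbox(fake, async () => {
    throw new Error("boom");
  }).catch(() => calls.push("caught"));
  return calls;
}

demo().then((calls) => console.log(calls)); // [ 'start', 'close', 'caught' ]
```

The workspace is closed before the error reaches the caller, so a failing task does not leak a tempdir or a remote session.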

ready: boolean

true after start() succeeds; false after close(). Read-only.

Compose with CloudSandbox

The killer combo is SandboxAgent + CloudSandbox — a persistent workspace in a hardened cloud VM:
import { E2BSandbox, SandboxAgent } from "@agentium/core";

const remote = new E2BSandbox({ template: "data-science" });

const agent = new SandboxAgent({
  backend: "remote",
  remote,
  workspace: {
    files: [{ path: "data.csv", contents: csvData }],
  },
});

await agent.start();
await agent.run("import pandas; print(pandas.read_csv('data.csv').describe())", { language: "python" });
await agent.close();

Compose with Agent

SandboxAgent is not itself an LLM-driven agent — it’s a workspace. Plug it into a regular Agent by exposing its methods as tools:
import { Agent, defineTool, openai, SandboxAgent } from "@agentium/core";
import { z } from "zod";

const sandbox = new SandboxAgent({ backend: "unix-local" });
await sandbox.start();

const tools = [
  defineTool({
    name: "shell",
    description: "Run a shell command in the workspace.",
    parameters: z.object({ command: z.string() }),
    execute: async ({ command }) => {
      const r = await sandbox.shell(command);
      return JSON.stringify(r);
    },
  }),
  defineTool({
    name: "writeFile",
    description: "Write a file at the given workspace path.",
    parameters: z.object({ path: z.string(), contents: z.string() }),
    execute: async ({ path, contents }) => {
      await sandbox.writeFile(path, contents);
      return "ok";
    },
  }),
];

const agent = new Agent({ name: "code-bot", model: openai("gpt-4o"), tools });
await agent.run("Create a Node script that prints hello.");
(A higher-level helper createSandboxTools(sandbox) may land in a future release; for now wire them yourself.)

Persistence across processes

A common pattern: a long-running investigation where each user turn is a separate process.
// Turn 1
const agent = new SandboxAgent({ backend: "unix-local", workspace: {...} });
await agent.start();
// ... do work ...
const snap = await agent.snapshot();
await redis.set(`session:${id}:snapshot`, JSON.stringify(snap));
await agent.close();

// Turn 2 (different process)
const saved = JSON.parse(await redis.get(`session:${id}:snapshot`));
const next = new SandboxAgent({ backend: "unix-local" });
await next.resume(saved);
// ... continue work ...
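
Since the snapshot crosses a process boundary as JSON, a cheap shape check before resume() can catch a missing or corrupted value early. A minimal sketch, assuming the WorkspaceSnapshot shape shown earlier; isWorkspaceSnapshot and parseSnapshot are hypothetical helpers, not library functions.

```typescript
interface WorkspaceFile {
  path: string;
  contents: string;
  encoding?: "utf8" | "base64";
}

interface WorkspaceSnapshot {
  takenAt: number;
  files: WorkspaceFile[];
  env: Record<string, string>;
}

// Hypothetical type guard: checks only the fields documented on this page.
function isWorkspaceSnapshot(value: unknown): value is WorkspaceSnapshot {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const files = v.files;
  if (typeof v.takenAt !== "number" || !Array.isArray(files)) return false;
  if (typeof v.env !== "object" || v.env === null) return false;
  return files.every((f: unknown) => {
    if (typeof f !== "object" || f === null) return false;
    const file = f as Record<string, unknown>;
    return typeof file.path === "string" && typeof file.contents === "string";
  });
}

// Hypothetical: turn the raw stored string (or null) into a validated snapshot.
function parseSnapshot(raw: string | null): WorkspaceSnapshot {
  if (raw === null) throw new Error("no snapshot stored for this session");
  const parsed: unknown = JSON.parse(raw);
  if (!isWorkspaceSnapshot(parsed)) throw new Error("stored value is not a WorkspaceSnapshot");
  return parsed;
}

const stored = JSON.stringify({ takenAt: 1, files: [{ path: "a.txt", contents: "hi" }], env: {} });
console.log(parseSnapshot(stored).files.length); // 1
```

In the Turn 2 code above, the value returned by redis.get would be passed through parseSnapshot before being handed to resume().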

Failure modes

| Situation | Behavior |
| --- | --- |
| start() before workspace deps available | Throws, e.g. "dockerode is required" if backend: "docker" and SDK missing |
| run() / shell() exceeds timeoutSeconds | Returns { timedOut: true, exitCode: 124, output } |
| readFile on missing path | Returns null |
| writeFile outside workspace tempdir | Path is joined with the workspace root via path.join; escapes are blocked by the filesystem itself (you’re inside a tempdir owned by your process) |
| close() called twice | No-op the second time |
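
On its own, path.join normalizes ".." segments rather than rejecting them, so when paths come from an LLM or other untrusted input, an explicit containment check before calling writeFile is an inexpensive extra safeguard. The sketch below is one possible guard; resolveInside is a hypothetical helper and not part of the library.

```typescript
import { isAbsolute, join, relative, resolve, sep } from "node:path";

// Hypothetical guard: resolve a workspace-relative path, rejecting anything outside root.
function resolveInside(root: string, requested: string): string {
  const base = resolve(root);
  const target = resolve(base, requested);
  const rel = relative(base, target);
  // A relative path that climbs out of base starts with a ".." segment;
  // an absolute result means a different root (for example another drive on Windows).
  if (rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel)) {
    throw new Error(`path escapes workspace: ${requested}`);
  }
  return target;
}

// path.join alone happily walks out of the root:
console.log(join("/tmp/ws", "../../etc/passwd")); // "/etc/passwd" on POSIX

// The guard accepts paths that stay inside the root.
console.log(resolveInside("/tmp/ws", "src/../notes.txt") === resolve("/tmp/ws", "notes.txt")); // true
```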
