Architecture Overview

agents-sandbox is a Docker-backed sandbox control plane with a local gRPC API, a layered Go SDK, an async Python SDK, and the AgentsSandbox CLI. The repository contains the daemon, the runtime backend, the protobuf contract, the Go and Python client layers, and a first-class local management CLI for sandbox lifecycle and exec operations.

System Architecture

The system is organized around one local daemon process, one runtime backend, and multiple caller-facing entry points built on the same Unix-socket gRPC contract.

```mermaid
flowchart LR
    PySDK[Python SDK\nAgentsSandboxClient] --> RPC[gRPC over Unix socket]
    GoClient[Go SDK\nsdk/go/client] --> GoRaw[Go raw client\nsdk/go/rawclient]
    GoRaw --> RPC
    CLI[AgentsSandbox CLI\nversion/ping\nsandbox create|list|get|delete|exec] --> RPC
    RPC --> Daemon[AgentsSandbox daemon]
    Daemon --> Service[control.Service]
    Service --> Persistence[Persistent ids.db\nID registry + event store buckets]
    Service --> Runtime[Docker runtime backend]
    Runtime --> Docker[Docker daemon\nvia Engine API SDK]
    Service --> Memory[In-memory sandbox and exec state\nwith full restart recovery]
    Runtime --> StateRoot[Optional runtime state root\nfor copies and shadow copies]
    Runtime --> Artifacts[Optional exec output artifacts]
```

Main components

  • cmd/agboxd starts the AgentsSandbox daemon, resolves config, initializes structured JSON logging via Go stdlib log/slog (written to stderr for systemd/journald capture), acquires the single-host lock, creates the service plus its runtime closer chain, and serves gRPC over a Unix domain socket.
  • cmd/agbox implements the local operator CLI. It resolves the daemon socket, talks to the gRPC API through sdk/go/rawclient, and exposes version, ping, and sandbox subcommands (create, list, get, delete, exec), including label-based list/delete flows and JSON output.
  • internal/control.Service owns request validation, accepted-state transitions, in-memory sandbox and exec records, event ordering, sequence generation, async operation orchestration, full restart recovery with Docker inspect-based reconciliation, and retention cleanup.
  • internal/control/id_registry.go owns the shared bbolt-backed persistence bootstrap that opens ids.db, reserves caller-provided and daemon-generated sandbox_id/exec_id values across daemon restarts, and shares the database handle with the persistent event store.
  • internal/control/event_store.go owns the event-store abstraction plus the persistent bbolt implementation used to replay sandbox history after daemon restart and to retain deleted sandbox streams until cleanup.
  • internal/control/docker_runtime.go is the concrete runtime backend owning a long-lived Docker Engine API client, filesystem input materialization, Docker network and container creation, exec commands, and resource removal.
  • internal/profile defines daemon-managed built-in resources. Tools (claude, codex, git, uv, npm, apt) resolve to named mounts; multiple tools may share a mount and the daemon deduplicates by mount ID.
  • api/proto/service.proto is the transport contract shared by the daemon, Go SDK, and Python SDK.
  • sdk/go/rawclient contains the synchronous transport-facing Go client with socket resolution, Unix socket dialing, raw RPCs, typed gRPC error translation, and a raw event-stream primitive.
  • sdk/go/client contains the public high-level Go SDK converting protobuf payloads into public Go types, direct-parameter lifecycle and exec APIs, wait behavior, and channel-based event consumption.
  • sdk/python contains a thin raw gRPC wrapper plus the public async AgentsSandboxClient with wait=True/False, event-based waiting, sequence handling, and public handle models.

Primary request and event flow

  1. A client sends a gRPC request over the Unix socket.
  2. The service performs synchronous fail-fast validation for create inputs, service declarations, builtin resource IDs, historical ID reuse, and exec command shape.
  3. During CreateSandbox and CreateExec, the daemon reserves the final sandbox_id or exec_id in the persistent historical ID registry before accepting the request. When the caller omits an ID, the daemon generates and reserves a UUID v4 first.
  4. CreateSandbox, ResumeSandbox, StopSandbox, DeleteSandbox, and CreateExec return as accepted operations while the daemon continues convergence asynchronously.
  5. The runtime backend performs Docker-side work through a shared Docker Engine API client and reports results back to the service. Required services gate readiness; optional services start in parallel with the primary and report their initial success or failure asynchronously without blocking sandbox readiness.
  6. The service persists ordered events and sandbox/exec configs before updating in-memory state, exposes their numeric ordering through event sequences, and performs full restart recovery by reconciling persisted state with Docker container inspect results after daemon restart.
  7. The Go and Python high-level SDKs, along with the AgentsSandbox CLI sandbox exec command, optionally wait by combining an authoritative baseline read with SubscribeSandboxEvents, while sdk/go/rawclient keeps the transport contract visible without adding high-level wait semantics.
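The wait behavior in step 7 can be sketched as a small simulation: a caller combines an authoritative snapshot read with an event stream, using the snapshot's cursor to skip events it has already observed. All names here are illustrative, not the real SDK API.

```python
# Minimal sketch of the "baseline read + event subscription" wait pattern.
# Snapshot/Event are stand-ins for the real handle and event types.
from dataclasses import dataclass

@dataclass
class Snapshot:
    state: str
    last_event_sequence: int

@dataclass
class Event:
    sequence: int
    state: str

def wait_for_state(snapshot, events, target):
    """Return the sequence at which `target` was reached, or None."""
    # 1. Authoritative baseline: the snapshot may already satisfy the wait.
    if snapshot.state == target:
        return snapshot.last_event_sequence
    # 2. Otherwise consume only events after the snapshot's cursor, so the
    #    snapshot and the stream hand off without a gap or a race.
    for ev in events:
        if ev.sequence <= snapshot.last_event_sequence:
            continue  # already reflected in the baseline snapshot
        if ev.state == target:
            return ev.sequence
    return None

snap = Snapshot(state="CREATING", last_event_sequence=3)
stream = [Event(2, "CREATING"), Event(4, "CREATING"), Event(5, "READY")]
print(wait_for_state(snap, stream, "READY"))  # 5
```

The cursor comparison is what makes the pattern race-free: events at or before `last_event_sequence` are duplicates of state already folded into the snapshot, so they are skipped rather than double-applied.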

Core Capabilities and Usage Scenarios

Sandbox lifecycle management

Each sandbox gets one primary container, one dedicated Docker network, zero or more service containers (required and optional), and ordered lifecycle and exec events. The CLI supports quick daemon reachability checks (agbox ping), sandbox creation/inspection/deletion (agbox sandbox create|get|delete), label-based fleet operations (agbox sandbox list, agbox sandbox delete --label), and ad hoc command execution (agbox sandbox exec).

Command execution and direct output consumption

Exec creation is asynchronous at the protocol layer. Exec stdout and stderr are redirected inside the container to bind-mounted host files, so the daemon is out of the I/O hot path and daemon restarts do not interrupt exec output. CreateExecResponse returns host-side log file paths (stdout_log_path, stderr_log_path) so callers can read output independently. The public SDKs expose create_exec(wait=False) for accepted async execution, create_exec(wait=True) for event-driven waiting, and run(...) as the direct "wait for completion and read log files" path.
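The redirection idea can be illustrated with a plain subprocess: the child process writes directly to log files opened at spawn time, so the spawning process never sits in the I/O path and the files outlive it. Paths and filenames below are illustrative, not the daemon's actual layout.

```python
# Sketch of the exec output pattern: stdout/stderr go straight to files,
# keeping the controlling process out of the I/O hot path.
import pathlib
import subprocess
import tempfile

outdir = pathlib.Path(tempfile.mkdtemp())
stdout_log = outdir / "exec-1.stdout.log"
stderr_log = outdir / "exec-1.stderr.log"

with open(stdout_log, "w") as out, open(stderr_log, "w") as err:
    # The child writes directly into the files; nothing streams through us.
    subprocess.run(["sh", "-c", "echo hello; echo oops >&2"],
                   stdout=out, stderr=err, check=True)

print(stdout_log.read_text().strip())  # hello
print(stderr_log.read_text().strip())  # oops
```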

Filesystem ingress and built-in resources

Sandbox creation supports three public filesystem ingress modes: mounts (explicit bind mounts), copies (daemon-owned copied content), and builtin_tools (daemon-defined resource shortcuts). Services are declared as required_services or optional_services and become sibling containers on the sandbox network. Required services must be healthy before the primary is reported ready; optional services start in parallel and report their initial result through sandbox events without blocking the primary ready transition.

Event subscription and replay

The daemon exposes a per-sandbox ordered event stream with full replay from from_sequence=0, daemon-issued event sequence anchors for incremental replay, and monotonic sequence numbers per sandbox. Each SandboxEvent carries a oneof details field (SandboxPhaseDetails, ExecEventDetails, or ServiceEventDetails). The top-level sandbox_state reflects the sandbox state when the event was emitted. CreateSandbox returns a SandboxHandle with the daemon-issued last_event_sequence cursor that seeds incremental event subscription without a snapshot/subscription race.
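A toy event store makes the replay contract concrete: sequences are monotonic per sandbox, `from_sequence=0` replays everything, and a cursor resumes strictly after already-seen events. The class below is a simplified stand-in, not the daemon's bbolt-backed implementation.

```python
# Toy per-sandbox event store illustrating monotonic sequences and
# replay from a caller-supplied cursor.
class EventStore:
    def __init__(self):
        self._events = []      # ordered; sequences are 1-based and monotonic
        self._next_seq = 1

    def append(self, details):
        ev = {"sequence": self._next_seq, "details": details}
        self._next_seq += 1
        self._events.append(ev)
        return ev["sequence"]

    def subscribe(self, from_sequence=0):
        # from_sequence=0 replays full history; a handle's
        # last_event_sequence resumes strictly after seen events.
        return [e for e in self._events if e["sequence"] > from_sequence]

store = EventStore()
cursor = store.append("SANDBOX_CREATING")   # sequence 1
store.append("SERVICE_STARTED")             # sequence 2
store.append("SANDBOX_READY")               # sequence 3

print(len(store.subscribe(0)))                           # 3: full replay
print([e["details"] for e in store.subscribe(cursor)])   # events after seq 1
```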

SDK layering and integration choices

The repository exposes three caller integration styles: sdk/go/rawclient for transport-level Go, sdk/go/client for most Go applications, and sdk/python for async Python applications. sdk/go/client.New() and Python AgentsSandboxClient() both resolve the socket path internally and expose direct-parameter lifecycle and exec methods. Protobuf request wrappers exist at the transport layer but are not the preferred public API.
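The raw/high-level split boils down to a conversion boundary: the raw layer returns wire-shaped payloads, and the high-level layer turns them into public typed handles so callers never touch protobuf. The field names below are illustrative, not the actual schema.

```python
# Sketch of the layering boundary between a raw transport client and a
# public high-level SDK type.
from dataclasses import dataclass

@dataclass(frozen=True)
class SandboxHandle:          # public, typed, SDK-owned
    sandbox_id: str
    state: str
    last_event_sequence: int

def to_handle(raw: dict) -> SandboxHandle:
    # The conversion boundary: wire payloads stop here.
    return SandboxHandle(
        sandbox_id=raw["sandbox_id"],
        state=raw["sandbox_state"],
        last_event_sequence=raw["last_event_sequence"],
    )

h = to_handle({"sandbox_id": "sb-1", "sandbox_state": "READY",
               "last_event_sequence": 7})
print(h.last_event_sequence)  # 7
```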

Technical Constraints and External Dependencies

Runtime and deployment constraints

  • The system is Docker-first. Runtime lifecycle, networking, container creation, and exec execution depend on a reachable Docker daemon.
  • The daemon is a single-writer local control plane acquiring an exclusive host lock at a hardcoded platform path, refusing to start if another daemon already owns that lock.
  • gRPC transport is exposed over a Unix domain socket only, at a hardcoded platform-specific path (not configurable).
  • The current service keeps sandbox and exec projections in memory, but event history is persisted in ids.db and replay survives daemon restart. See Daemon State Management for state classification and persistence rules.
  • A restarted daemon performs full state recovery by loading persisted sandbox/exec configs, replaying events, and reconciling with Docker container inspect results. Restored sandboxes support all operations without restriction.
  • Caller-visible ID uniqueness is stronger than in-memory lifecycle state: the daemon persists historical sandbox_id and exec_id reservations in a platform-derived ids.db file so old IDs remain unavailable after daemon restart.
  • Runtime stop, delete, and failed-create cleanup do not reuse caller RPC contexts. The service and runtime switch to daemon-owned background contexts so cleanup can finish even if the initiating request has ended.
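The reserve-before-accept rule for IDs can be sketched with a toy registry that persists reservations durably (a JSON file here stands in for bbolt) before a create is accepted, so a restarted process still rejects historical IDs. The class and error strings are illustrative.

```python
# Toy persistent ID registry mirroring the reserve-before-accept rule.
import json
import pathlib
import tempfile
import uuid

class IDRegistry:
    def __init__(self, path):
        self.path = pathlib.Path(path)
        self.ids = set(json.loads(self.path.read_text())) if self.path.exists() else set()

    def reserve(self, candidate=None):
        new_id = candidate or str(uuid.uuid4())  # generated when omitted
        if new_id in self.ids:
            raise ValueError(f"id {new_id!r} already used")
        self.ids.add(new_id)
        self.path.write_text(json.dumps(sorted(self.ids)))  # persist first
        return new_id

db = pathlib.Path(tempfile.mkdtemp()) / "ids.json"
IDRegistry(db).reserve("sb-1")

# Simulated daemon restart: a fresh registry reloads the reservations,
# so the historical ID stays unavailable.
reg2 = IDRegistry(db)
try:
    reg2.reserve("sb-1")
except ValueError as e:
    print(e)  # id 'sb-1' already used
```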

Filesystem and security constraints

  • Unsafe or invalid create inputs are rejected at the RPC boundary instead of being accepted and failing later in the background.
  • mounts and copies require absolute container targets and real host file or directory sources.
  • copies and builtin-tool shadow copies require runtime.state_root because the daemon materializes copied content into daemon-owned filesystem state.
  • Runtime exec assumes a non-root sandbox user model. Writable paths must remain writable to that runtime user.
  • Built-in resources are daemon-defined. Callers can select capability IDs but cannot replace them with arbitrary hidden host paths through the public SDK surface.
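The fail-fast mount checks above amount to synchronous validation at the RPC boundary: absolute container targets and real host sources, rejected before any background work starts. The helper and messages below are illustrative, not the daemon's actual validator.

```python
# Sketch of fail-fast ingress validation at the request boundary.
import os

def validate_mount(host_source: str, container_target: str) -> None:
    if not os.path.isabs(container_target):
        raise ValueError(f"container target must be absolute: {container_target!r}")
    if not os.path.exists(host_source):
        raise ValueError(f"host source does not exist: {host_source!r}")

validate_mount("/tmp", "/workspace")          # accepted
try:
    validate_mount("/tmp", "relative/path")   # rejected synchronously
except ValueError as e:
    print(e)
```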

External dependencies

  • Go daemon and protocol implementation (structured logging via stdlib log/slog with JSON output)
  • Docker Engine API Go SDK and a reachable Docker daemon
  • gRPC and protobuf for the wire contract
  • Go gRPC client stack for sdk/go/rawclient and sdk/go/client
  • Python grpcio client stack and uv-managed SDK environment
  • Optional host resources such as SSH_AUTH_SOCK, ~/.claude, ~/.codex, ~/.agents, ~/.cache/uv, ~/.local/share/uv, and local cache directories

Important Design Decisions and Reasons

  • Accepted operations stay distinct from completed state. Slow operations return after acceptance, not after completion. See Protocol Design Principles for the full accepted-vs-completed contract and wait parameter convention.
  • Exec snapshots join the sandbox event stream atomically. GetExec().exec.last_event_sequence anchors exec snapshots to the sandbox event stream, eliminating handoff races. See Protocol Design Principles for the atomic snapshot-to-stream handoff rules.
  • Historical IDs are reserved persistently. Caller-provided sandbox_id and exec_id values are reserved in a persistent registry before accepting create operations, preventing accidental ID reuse after daemon restart. See Sandbox Container Lifecycle for the SubscribeSandboxEvents ordering contract and ID persistence rules.
  • Docker access stays on one structured client path. The runtime backend uses a single Docker Engine API client instead of Docker CLI subprocesses. See Container Dependency Strategy for the structured Docker API rule.
  • The Go SDK is explicitly split into raw and high-level layers. sdk/go/rawclient owns transport concerns; sdk/go/client owns public Go types and wait behavior.
  • The public SDKs are direct-parameter, not request-wrapper driven. Both SDKs resolve the socket path internally and expose direct-parameter methods.
  • Filesystem ingress is split by semantics. mounts, copies, and builtin_tools are separate concepts with different security and lifecycle behavior. See Container Dependency Strategy for detailed ingress rules, the non-root container model, and cleanup ownership.
  • Exec output is redirected to disk inside the container. Exec stdout/stderr are redirected to bind-mounted host files at {ArtifactOutputRoot}/{sandbox_id}/{exec_id}.stdout.log / .stderr.log, keeping exec output durable across daemon restarts. See Sandbox Container Lifecycle for the exec output redirection contract.
  • Cleanup and ownership stay runtime-local. The daemon derives ownership from in-memory state plus namespaced Docker labels. See Container Dependency Strategy for cleanup rules.
  • Event retention is bounded by TTL. Deleted sandbox event streams remain queryable until runtime.event_retention_ttl expires. See Sandbox Container Lifecycle for event retention rules.

Proto Generation

Go and Python bindings are generated from api/proto/service.proto using pinned tool versions:

| Tool | Version |
| --- | --- |
| protoc | v6.31.1 (release tag v31.1) |
| protoc-gen-go | v1.36.11 |
| protoc-gen-go-grpc | v1.6.1 |
| grpcio-tools | from sdk/python dev dependencies |

Regenerate bindings:

```bash
bash scripts/generate_proto.sh
```

The script downloads and caches protoc in .local/protoc/ (project-local, git-ignored) and installs Go plugins in .local/go-bin/. CI runs scripts/lints/check_proto_consistency.sh automatically through run_test.sh lint to ensure checked-in bindings stay in sync with the proto source.