Architecture Overview
agents-sandbox is a Docker-backed sandbox control plane with a local gRPC API, a layered Go SDK, an async Python SDK, and the AgentsSandbox CLI. The repository contains the daemon, the runtime backend, the protobuf contract, the Go and Python client layers, and a first-class local management CLI for sandbox lifecycle and exec operations.
System Architecture
The system is organized around one local daemon process, one runtime backend, and multiple caller-facing entry points built on the same Unix-socket gRPC contract.
```mermaid
flowchart LR
PySDK[Python SDK\nAgentsSandboxClient] --> RPC[gRPC over Unix socket]
GoClient[Go SDK\nsdk/go/client] --> GoRaw[Go raw client\nsdk/go/rawclient]
GoRaw --> RPC
CLI[AgentsSandbox CLI\nversion/ping\nsandbox create|list|get|delete|exec] --> RPC
RPC --> Daemon[AgentsSandbox daemon]
Daemon --> Service[control.Service]
Service --> Persistence[Persistent ids.db\nID registry + event store buckets]
Service --> Runtime[Docker runtime backend]
Runtime --> Docker[Docker daemon\nvia Engine API SDK]
Service --> Memory[In-memory sandbox and exec state\nwith full restart recovery]
Runtime --> StateRoot[Optional runtime state root\nfor copies and shadow copies]
Runtime --> Artifacts[Optional exec output artifacts]
```

Main components
- cmd/agboxd starts the AgentsSandbox daemon, resolves config, initializes structured JSON logging via the Go stdlib log/slog (written to stderr for systemd/journald capture), acquires the single-host lock, creates the service plus its runtime closer chain, and serves gRPC over a Unix domain socket.
- cmd/agbox implements the local operator CLI. It resolves the daemon socket, talks to the gRPC API through sdk/go/rawclient, and exposes version, ping, and sandbox subcommands (create, list, get, delete, exec), including label-based list/delete flows and JSON output.
- internal/control.Service owns request validation, accepted-state transitions, in-memory sandbox and exec records, event ordering, sequence generation, async operation orchestration, full restart recovery with Docker inspect-based reconciliation, and retention cleanup.
- internal/control/id_registry.go owns the shared bbolt-backed persistence bootstrap that opens ids.db, reserves caller-provided and daemon-generated sandbox_id/exec_id values across daemon restarts, and shares the database handle with the persistent event store.
- internal/control/event_store.go owns the event-store abstraction plus the persistent bbolt implementation used to replay sandbox history after daemon restart and to retain deleted sandbox streams until cleanup.
- internal/control/docker_runtime.go is the concrete runtime backend owning a long-lived Docker Engine API client, filesystem input materialization, Docker network and container creation, exec commands, and resource removal.
- internal/profile defines daemon-managed built-in resources. Tools (claude, codex, git, uv, npm, apt) resolve to named mounts; multiple tools may share a mount, and the daemon deduplicates by mount ID.
- api/proto/service.proto is the transport contract shared by the daemon, Go SDK, and Python SDK.
- sdk/go/rawclient contains the synchronous transport-facing Go client with socket resolution, Unix socket dialing, raw RPCs, typed gRPC error translation, and the raw event-stream primitive.
- sdk/go/client contains the public high-level Go SDK converting protobuf payloads into public Go types, direct-parameter lifecycle and exec APIs, wait behavior, and channel-based event consumption.
- sdk/python contains a thin raw gRPC wrapper plus the public async AgentsSandboxClient with wait=True/False, event-based waiting, sequence handling, and public handle models.
Primary request and event flow
- A client sends a gRPC request over the Unix socket.
- The service performs synchronous fail-fast validation for create inputs, service declarations, builtin resource IDs, historical ID reuse, and exec command shape.
- During CreateSandbox and CreateExec, the daemon reserves the final sandbox_id or exec_id in the persistent historical ID registry before accepting the request. When the caller omits an ID, the daemon generates and reserves a UUID v4 first.
- CreateSandbox, ResumeSandbox, StopSandbox, DeleteSandbox, and CreateExec return as accepted operations while the daemon continues convergence asynchronously.
- The runtime backend performs Docker-side work through a shared Docker Engine API client and reports results back to the service. Required services gate readiness; optional services start in parallel with the primary and report their initial success or failure asynchronously without blocking sandbox readiness.
- The service persists ordered events and sandbox/exec configs before updating in-memory state, exposes their numeric ordering through event sequences, and performs full restart recovery by reconciling persisted state with Docker container inspect results after daemon restart.
- The Go and Python high-level SDKs, along with the AgentsSandbox CLI sandbox exec command, optionally wait by combining an authoritative baseline read with SubscribeSandboxEvents, while sdk/go/rawclient keeps the transport contract visible without adding high-level wait semantics.
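The snapshot-then-subscribe wait pattern above can be sketched in plain Python. This is an illustrative in-memory simulation, not the real SDK API: the field names (`state`, `sequence`, `last_event_sequence`) mirror the concepts in the flow, and the stream is a plain list standing in for SubscribeSandboxEvents.

```python
# Sketch of the snapshot-then-subscribe wait pattern (illustrative names,
# not the real SDK surface). The authoritative baseline carries a
# last_event_sequence cursor; the subscriber then acts only on events
# strictly after that cursor, so nothing is missed or double-handled
# between the snapshot read and the subscription.

def wait_until_ready(snapshot, event_stream):
    """Return the sequence at which the sandbox was observed READY."""
    if snapshot["state"] == "READY":
        return snapshot["last_event_sequence"]
    cursor = snapshot["last_event_sequence"]
    for event in event_stream:
        if event["sequence"] <= cursor:
            continue  # already reflected in the baseline snapshot
        cursor = event["sequence"]
        if event["state"] == "READY":
            return cursor
    raise RuntimeError("stream ended before sandbox became ready")

# Simulated daemon data: snapshot taken at sequence 2, stream replays from 0.
snapshot = {"state": "CREATING", "last_event_sequence": 2}
stream = [
    {"sequence": 1, "state": "ACCEPTED"},
    {"sequence": 2, "state": "CREATING"},  # duplicate of the baseline, skipped
    {"sequence": 3, "state": "READY"},
]
print(wait_until_ready(snapshot, stream))  # 3
```

The same cursor logic is why replayed duplicates after a reconnect are harmless: anything at or below the known sequence is simply skipped.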
Core Capabilities and Usage Scenarios
Sandbox lifecycle management
Each sandbox gets one primary container, one dedicated Docker network, zero or more service containers (required and optional), and ordered lifecycle and exec events. The CLI supports quick daemon reachability checks (agbox ping), sandbox creation/inspection/deletion (agbox sandbox create|get|delete), label-based fleet operations (agbox sandbox list, agbox sandbox delete --label), and ad hoc command execution (agbox sandbox exec).
Command execution and direct output consumption
Exec creation is asynchronous at the protocol layer. Exec stdout and stderr are redirected inside the container to bind-mounted host files, so the daemon is out of the I/O hot path and daemon restarts do not interrupt exec output. CreateExecResponse returns host-side log file paths (stdout_log_path, stderr_log_path) so callers can read output independently. The public SDKs expose create_exec(wait=False) for accepted async execution, create_exec(wait=True) for event-driven waiting, and run(...) as the direct "wait for completion and read log files" path.
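The redirection contract can be sketched with ordinary subprocess redirection. The paths and the `sh -c` wrapping here are illustrative; the daemon does the equivalent inside the container, with the log files on a bind-mounted host path.

```python
import pathlib
import subprocess
import tempfile

# Sketch: run a command with stdout/stderr redirected straight to log files,
# analogous to how exec output is redirected to bind-mounted host paths.
# Because the files (not a pipe held by the daemon) carry the output, the
# daemon stays out of the I/O hot path and the output survives a restart.
out_root = pathlib.Path(tempfile.mkdtemp())
stdout_log = out_root / "exec-1.stdout.log"
stderr_log = out_root / "exec-1.stderr.log"

with stdout_log.open("wb") as out, stderr_log.open("wb") as err:
    subprocess.run(["sh", "-c", "echo hello; echo oops >&2"],
                   stdout=out, stderr=err, check=True)

print(stdout_log.read_text().strip())  # hello
print(stderr_log.read_text().strip())  # oops
```

A caller holding the returned log paths can tail the files directly, exactly as run(...) does after waiting for completion.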
Filesystem ingress and built-in resources
Sandbox creation supports three public filesystem ingress modes: mounts (explicit bind mounts), copies (daemon-owned copied content), and builtin_tools (daemon-defined resource shortcuts). Services are declared as required_services or optional_services and become sibling containers on the sandbox network. Required services must be healthy before the primary is reported ready; optional services start in parallel and report their initial result through sandbox events without blocking the primary ready transition.
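The fail-fast validation side of these ingress modes can be sketched as follows. The function and field names are hypothetical; the point is only the shape of the checks: absolute container targets, existing host sources, and a state-root requirement for copies.

```python
import os

# Sketch of fail-fast ingress validation (illustrative names, not the real
# daemon code): mounts and copies need absolute container targets and real
# host sources, and copies additionally need a configured state root.
def validate_ingress(mounts, copies, state_root=None):
    for source, target in mounts + copies:
        if not target.startswith("/"):
            raise ValueError(f"container target must be absolute: {target}")
        if not os.path.exists(source):
            raise ValueError(f"host source does not exist: {source}")
    if copies and state_root is None:
        raise ValueError("copies require runtime.state_root")

validate_ingress(mounts=[("/etc/hosts", "/sandbox/hosts")], copies=[])
try:
    validate_ingress(mounts=[], copies=[("/etc/hosts", "relative/path")],
                     state_root="/var/lib/agbox")
except ValueError as e:
    print(e)  # container target must be absolute: relative/path
```

Rejecting these inputs synchronously at the RPC boundary keeps failures visible to the caller instead of surfacing them later as background convergence errors.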
Event subscription and replay
The daemon exposes a per-sandbox ordered event stream with full replay from from_sequence=0, daemon-issued event sequence anchors for incremental replay, and monotonic sequence numbers per sandbox. Each SandboxEvent carries a oneof details field (SandboxPhaseDetails, ExecEventDetails, or ServiceEventDetails). The top-level sandbox_state reflects the sandbox state when the event was emitted. CreateSandbox returns a SandboxHandle with the daemon-issued last_event_sequence cursor that seeds incremental event subscription without a snapshot/subscription race.
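A minimal in-memory sketch of this ordering contract (the real store is bbolt-backed and persistent; class and method names here are illustrative):

```python
import itertools

# Minimal sketch of a per-sandbox ordered event store: sequences are
# monotonic per sandbox, replay from from_sequence=0 returns the full
# history, and a nonzero cursor yields only the tail after that sequence.
class EventStore:
    def __init__(self):
        self._events = {}  # sandbox_id -> list of (sequence, payload)
        self._seq = {}     # sandbox_id -> monotonic counter

    def append(self, sandbox_id, payload):
        counter = self._seq.setdefault(sandbox_id, itertools.count(1))
        seq = next(counter)
        self._events.setdefault(sandbox_id, []).append((seq, payload))
        return seq

    def replay(self, sandbox_id, from_sequence=0):
        return [(s, p) for s, p in self._events.get(sandbox_id, [])
                if s > from_sequence]

store = EventStore()
store.append("sb-1", "CREATING")
store.append("sb-1", "READY")
print(store.replay("sb-1", from_sequence=0))  # [(1, 'CREATING'), (2, 'READY')]
print(store.replay("sb-1", from_sequence=1))  # [(2, 'READY')]
```

Seeding a subscription with the last_event_sequence from a CreateSandbox handle is just a replay call with that cursor, which is what closes the snapshot/subscription race.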
SDK layering and integration choices
The repository exposes three caller integration styles: sdk/go/rawclient for transport-level Go, sdk/go/client for most Go applications, and sdk/python for async Python applications. sdk/go/client.New() and Python AgentsSandboxClient() both resolve the socket path internally and expose direct-parameter lifecycle and exec methods. Protobuf request wrappers exist at the transport layer but are not the preferred public API.
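The two-layer split can be sketched like this. Both classes and their methods are hypothetical stand-ins, not the actual SDK types: a raw layer that speaks request wrappers, and a public facade that accepts direct parameters and hides wrapper construction.

```python
# Sketch of the raw/high-level split (illustrative names, not the real SDK):
# the raw layer handles transport-shaped requests, the public layer exposes
# direct parameters and public types.
class RawClient:
    """Transport-facing layer: request dicts in, response dicts out."""
    def create_sandbox(self, request: dict) -> dict:
        # Real code would marshal to protobuf and send it over the Unix
        # socket; here we just echo back an accepted operation.
        return {"sandbox_id": request["sandbox_id"], "state": "ACCEPTED"}

class Client:
    """Public layer: direct parameters, wrapper construction hidden."""
    def __init__(self, raw: RawClient):
        self._raw = raw

    def create_sandbox(self, sandbox_id, labels=None):
        request = {"sandbox_id": sandbox_id, "labels": labels or {}}
        return self._raw.create_sandbox(request)

client = Client(RawClient())
print(client.create_sandbox("sb-demo"))
# {'sandbox_id': 'sb-demo', 'state': 'ACCEPTED'}
```

Keeping the wrapper types reachable at the raw layer preserves the full transport contract for callers that need it, without making it the default ergonomics.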
Technical Constraints and External Dependencies
Runtime and deployment constraints
- The system is Docker-first. Runtime lifecycle, networking, container creation, and exec execution depend on a reachable Docker daemon.
- The daemon is a single-writer local control plane acquiring an exclusive host lock at a hardcoded platform path, refusing to start if another daemon already owns that lock.
- gRPC transport is exposed over a Unix domain socket only, at a hardcoded platform-specific path (not configurable).
- The current service keeps sandbox and exec projections in memory, but event history is persisted in ids.db and replay survives daemon restart. See Daemon State Management for state classification and persistence rules.
- A restarted daemon performs full state recovery by loading persisted sandbox/exec configs, replaying events, and reconciling with Docker container inspect results. Restored sandboxes support all operations without restriction.
- Caller-visible ID uniqueness is stronger than in-memory lifecycle state: the daemon persists historical sandbox_id and exec_id reservations in a platform-derived ids.db file so old IDs remain unavailable after daemon restart.
- Runtime stop, delete, and failed-create cleanup do not reuse caller RPC contexts. The service and runtime switch to daemon-owned background contexts so cleanup can finish even if the initiating request has ended.
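The detached-cleanup rule can be sketched with asyncio. This is a simulation of the idea, not daemon code: cleanup runs as a task owned by the event loop rather than by the request handler, so the caller's request ending does not interrupt it.

```python
import asyncio

# Sketch of "cleanup does not reuse the caller's RPC context": the handler
# launches cleanup as a loop-owned background task and returns an accepted
# response immediately. The cleanup work is not scoped to the request.
async def cleanup(results):
    await asyncio.sleep(0.05)  # stands in for Docker-side removal work
    results.append("cleaned up")

async def handle_delete(results):
    # Detach: the task belongs to the running loop, not the request scope.
    asyncio.get_running_loop().create_task(cleanup(results))
    return "ACCEPTED"

async def main():
    results = []
    print(await handle_delete(results))  # ACCEPTED: request returns at once
    print(results)                       # [] — cleanup still in flight
    await asyncio.sleep(0.1)             # request scope is long gone by now
    print(results)                       # ['cleaned up']

asyncio.run(main())
```

The daemon's Go implementation achieves the same effect with daemon-owned contexts in place of loop-owned tasks.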
Filesystem and security constraints
- Unsafe or invalid create inputs are rejected at the RPC boundary instead of being accepted and failing later in the background.
- mounts and copies require absolute container targets and real host file or directory sources.
- copies and builtin-tool shadow copies require runtime.state_root because the daemon materializes copied content into daemon-owned filesystem state.
- Runtime exec assumes a non-root sandbox user model. Writable paths must remain writable to that runtime user.
- Built-in resources are daemon-defined. Callers can select capability IDs but cannot replace them with arbitrary hidden host paths through the public SDK surface.
External dependencies
- Go daemon and protocol implementation (structured logging via stdlib log/slog with JSON output)
- Docker Engine API Go SDK and a reachable Docker daemon
- gRPC and protobuf for the wire contract
- Go gRPC client stack for sdk/go/rawclient and sdk/go/client
- Python grpcio client stack and uv-managed SDK environment
- Optional host resources such as SSH_AUTH_SOCK, ~/.claude, ~/.codex, ~/.agents, ~/.cache/uv, ~/.local/share/uv, and local cache directories
Important Design Decisions and Reasons
- Accepted operations stay distinct from completed state. Slow operations return after acceptance, not after completion. See Protocol Design Principles for the full accepted-vs-completed contract and wait parameter convention.
- Exec snapshots join the sandbox event stream atomically. GetExec().exec.last_event_sequence anchors exec snapshots to the sandbox event stream, eliminating handoff races. See Protocol Design Principles for the atomic snapshot-to-stream handoff rules.
- Historical IDs are reserved persistently. Caller-provided sandbox_id and exec_id values are reserved in a persistent registry before accepting create operations, preventing accidental ID reuse after daemon restart. See Sandbox Container Lifecycle for the SubscribeSandboxEvents ordering contract and ID persistence rules.
- Docker access stays on one structured client path. The runtime backend uses a single Docker Engine API client instead of Docker CLI subprocesses. See Container Dependency Strategy for the structured Docker API rule.
- The Go SDK is explicitly split into raw and high-level layers. sdk/go/rawclient owns transport concerns; sdk/go/client owns public Go types and wait behavior.
- The public SDKs are direct-parameter, not request-wrapper driven. Both SDKs resolve the socket path internally and expose direct-parameter methods.
- Filesystem ingress is split by semantics. mounts, copies, and builtin_tools are separate concepts with different security and lifecycle behavior. See Container Dependency Strategy for detailed ingress rules, the non-root container model, and cleanup ownership.
- Exec output is redirected to disk inside the container. Exec stdout/stderr are redirected to bind-mounted host files at {ArtifactOutputRoot}/{sandbox_id}/{exec_id}.stdout.log and .stderr.log, keeping exec output durable across daemon restarts. See Sandbox Container Lifecycle for the exec output redirection contract.
- Cleanup and ownership stay runtime-local. The daemon derives ownership from in-memory state plus namespaced Docker labels. See Container Dependency Strategy for cleanup rules.
- Event retention is bounded by TTL. Deleted sandbox event streams remain queryable until runtime.event_retention_ttl expires. See Sandbox Container Lifecycle for event retention rules.
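The persistent ID reservation decision can be sketched with a file-backed registry. The JSON file layout here is purely illustrative (the daemon uses a bbolt-backed ids.db); what matters is that a reserved ID stays unavailable even after the registry is reopened, mimicking a daemon restart.

```python
import json
import os
import tempfile
import uuid

# Sketch of persistent historical ID reservation: reserve-before-accept,
# persisted so reuse fails even across a "restart" (re-opening the file).
class IDRegistry:
    def __init__(self, path):
        self.path = path
        self.ids = set()
        if os.path.exists(path):
            with open(path) as f:
                self.ids = set(json.load(f))

    def reserve(self, candidate=None):
        candidate = candidate or str(uuid.uuid4())  # daemon-generated fallback
        if candidate in self.ids:
            raise ValueError(f"id already used: {candidate}")
        self.ids.add(candidate)
        with open(self.path, "w") as f:
            json.dump(sorted(self.ids), f)          # persist before accepting
        return candidate

path = os.path.join(tempfile.mkdtemp(), "ids.json")
IDRegistry(path).reserve("sb-1")
try:
    IDRegistry(path).reserve("sb-1")  # a "restarted" registry still refuses
except ValueError as e:
    print(e)  # id already used: sb-1
```

Persisting the reservation before accepting the create is what makes the uniqueness guarantee stronger than any in-memory lifecycle state.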
Proto Generation
Go and Python bindings are generated from api/proto/service.proto using pinned tool versions:
| Tool | Version |
|---|---|
| protoc | v6.31.1 (release tag v31.1) |
| protoc-gen-go | v1.36.11 |
| protoc-gen-go-grpc | v1.6.1 |
| grpcio-tools | from sdk/python dev dependencies |
Regenerate bindings:
```bash
scripts/generate_proto.sh
```

The script downloads and caches protoc in .local/protoc/ (project-local, git-ignored) and installs Go plugins in .local/go-bin/. CI runs scripts/lints/check_proto_consistency.sh automatically through run_test.sh lint to ensure checked-in bindings stay in sync with the proto source.