# Architecture Overview
agents-sandbox is a Docker-backed sandbox control plane with a local gRPC API, a layered Go SDK, an async Python SDK, and the AgentsSandbox CLI. The repository contains the daemon, the runtime backend, the protobuf contract, the Go and Python client layers, and a first-class local management CLI for sandbox lifecycle and exec operations.
## System Architecture
The system is organized around one local daemon process, one runtime backend, and multiple caller-facing entry points built on the same Unix-socket gRPC contract.
```mermaid
flowchart LR
    PySDK[Python SDK\nAgentsSandboxClient] --> RPC[gRPC over Unix socket]
    GoClient[Go SDK\nsdk/go/client] --> GoRaw[Go raw client\nsdk/go/rawclient]
    GoRaw --> RPC
    CLI[AgentsSandbox CLI\nversion/ping\nsandbox create|list|get|delete|exec] --> RPC
    RPC --> Daemon[AgentsSandbox daemon]
    Daemon --> Service[control.Service]
    Service --> Persistence[Persistent ids.db\nID registry + event store buckets]
    Service --> Runtime[Docker runtime backend]
    Runtime --> Docker[Docker daemon\nvia Engine API SDK]
    Service --> Memory[In-memory sandbox and exec state\nwith full restart recovery]
    Runtime --> StateRoot[Optional runtime state root\nfor copies and shadow copies]
    Runtime --> Artifacts[Optional exec output artifacts]
```

### Main components
- `cmd/agboxd` starts the AgentsSandbox daemon, resolves config, initializes structured JSON logging via the Go stdlib `log/slog` (written to stderr for systemd/journald capture), acquires the single-host lock, creates the service plus its runtime closer chain, and serves gRPC over a Unix domain socket.
- `cmd/agbox` implements the local operator CLI. It resolves the daemon socket, talks to the gRPC API through `sdk/go/rawclient`, and exposes `version`, `ping`, and `sandbox` subcommands. The `sandbox` command currently supports `create`, `list`, `get`, `delete`, and `exec`, including label-based list/delete flows and JSON output for create/list/get.
- `internal/control.Service` owns request validation, accepted-state transitions, in-memory sandbox and exec records, event ordering, sequence generation, async operation orchestration, full restart recovery with Docker inspect-based reconciliation, and retention cleanup.
- `internal/control/id_registry.go` owns the shared bbolt-backed persistence bootstrap that opens `ids.db`, reserves caller-provided and daemon-generated `sandbox_id`/`exec_id` values across daemon restarts, and shares the database handle with the persistent event store.
- `internal/control/event_store.go` owns the event-store abstraction plus the persistent bbolt implementation used to replay sandbox history after daemon restart and to retain deleted sandbox streams until cleanup.
- `internal/control/docker_runtime.go` is the concrete runtime backend. It owns a long-lived Docker Engine API client, materializes filesystem inputs, creates Docker networks and containers, runs exec commands, and removes runtime-owned resources.
- `internal/profile` defines daemon-managed built-in resources. Tools (`claude`, `codex`, `git`, `uv`, `npm`, `apt`) are the user-facing names; each tool resolves to one or more named mounts (`.claude`, `.codex`, `.agents`, `gh-auth`, `ssh-agent`, `uv-cache`, `uv-data`, `npm`, `apt`). Multiple tools may share a mount; the daemon deduplicates by mount ID.
- `api/proto/service.proto` is the transport contract shared by the daemon, the Go SDK, and the Python SDK.
- `sdk/go/rawclient` contains the synchronous transport-facing Go client that resolves the default socket path, dials the Unix socket, wraps raw RPCs, translates typed gRPC errors, and exposes the raw event-stream primitive used by both the CLI and the high-level SDK.
- `sdk/go/client` contains the public high-level Go SDK. It converts protobuf payloads into public Go types, exposes direct-parameter lifecycle and exec APIs, adds `wait` behavior, and bridges sandbox events into Go channels.
- `sdk/python` contains a thin raw gRPC wrapper plus the public async `AgentsSandboxClient`, which adds `wait=True/False`, event-based waiting, sequence handling, and public handle models.
## Primary request and event flow
- A client sends a gRPC request over the Unix socket.
- The service performs synchronous fail-fast validation for create inputs, service declarations, builtin resource IDs, historical ID reuse, and exec command shape.
- During `CreateSandbox` and `CreateExec`, the daemon reserves the final `sandbox_id` or `exec_id` in the persistent historical ID registry before accepting the request. When the caller omits an ID, the daemon generates and reserves a UUID v4 first.
- `CreateSandbox`, `ResumeSandbox`, `StopSandbox`, `DeleteSandbox`, and `CreateExec` return as accepted operations while the daemon continues convergence asynchronously.
- The runtime backend performs Docker-side work through a shared Docker Engine API client and reports results back to the service. Required services gate readiness; optional services start in parallel with the primary and report their initial success or failure asynchronously without blocking sandbox readiness.
- The service persists ordered events and sandbox/exec configs before updating in-memory state, exposes their numeric ordering through event sequences, and performs full restart recovery by reconciling persisted state with Docker container inspect results after daemon restart.
- The Go and Python high-level SDKs, along with the AgentsSandbox CLI `sandbox exec` command, optionally wait on top of that contract by combining an authoritative baseline read with `SubscribeSandboxEvents`, while `sdk/go/rawclient` keeps the transport contract visible without adding high-level wait semantics.
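The wait pattern the SDKs and the CLI build on top of this contract can be sketched in a few lines. This is a self-contained Go simulation, not the actual `sdk/go/client` code: the `event` struct, the channel, and `waitForState` are illustrative stand-ins for `SandboxEvent`, `SubscribeSandboxEvents`, and the SDK wait helper.

```go
package main

import "fmt"

// event is a simplified stand-in for a SandboxEvent: a monotonic
// per-sandbox sequence number plus the sandbox state it was emitted at.
type event struct {
	Seq   uint64
	State string
}

// waitForState consumes an ordered event stream and returns once it sees
// the target state at a sequence strictly after the baseline read. Events
// at or before the baseline are skipped: the authoritative baseline
// snapshot already reflects them, so they are not new transitions.
func waitForState(baselineSeq uint64, events <-chan event, target string) (uint64, bool) {
	for ev := range events {
		if ev.Seq <= baselineSeq {
			continue // already covered by the authoritative baseline read
		}
		if ev.State == target {
			return ev.Seq, true
		}
	}
	return 0, false // stream closed before the target state appeared
}

func main() {
	events := make(chan event, 4)
	events <- event{Seq: 1, State: "CREATING"}
	events <- event{Seq: 2, State: "CREATING"}
	events <- event{Seq: 3, State: "READY"}
	close(events)

	seq, ok := waitForState(1, events, "READY")
	fmt.Println(seq, ok)
}
```

Combining a baseline read with a sequence-anchored subscription is what lets the SDKs wait without polling and without missing a transition between snapshot and subscribe.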
## Core Capabilities And Usage Scenarios
### Sandbox lifecycle management
The daemon creates, resumes, stops, deletes, and lists sandboxes. The AgentsSandbox CLI exposes that same lifecycle surface for local operators. Each sandbox gets:
- one primary container
- one dedicated Docker network
- zero or more service containers (required and optional)
- ordered lifecycle and exec events
This is the core path for products that need an isolated coding or execution environment with explicit lifecycle ownership.
The CLI is useful for:
- quick daemon reachability checks through `agbox ping`
- sandbox creation, inspection, and deletion through `agbox sandbox create|get|delete`
- label-based fleet operations through `agbox sandbox list` and `agbox sandbox delete --label`
- ad hoc command execution inside an existing sandbox through `agbox sandbox exec`
### Command execution and direct output consumption
Exec creation is asynchronous at the protocol layer. Exec stdout and stderr are redirected inside the container to bind-mounted host files, so the daemon is completely out of the I/O hot path and daemon restarts do not interrupt exec output. `CreateExecResponse` returns the host-side log file paths (`stdout_log_path`, `stderr_log_path`) so callers can read output independently. The public Go and Python SDKs expose the same lifecycle contract through language-appropriate APIs:
- `create_exec(..., wait=False)` for accepted async execution
- `create_exec(..., wait=True)` for event-driven waiting
- `run(...)` as the direct "wait for completion and read log files" path
The AgentsSandbox CLI `sandbox exec` command uses the same accepted-state plus event-stream contract to wait for terminal exec state and print command output directly to the terminal.
In Go, those calls live on `sdk/go/client`, while `sdk/go/rawclient` keeps the underlying accepted-state RPCs and raw event stream available to tools that want transport-level control. This supports both orchestration workflows and simple request-response command execution without relying on workspace result files.
### Filesystem ingress and built-in resources
Sandbox creation supports three public filesystem ingress modes:
- `mounts` for explicit bind mounts
- `copies` for daemon-owned copied content
- `builtin_tools` for daemon-defined resource shortcuts
These cover common scenarios such as:
- mounting a local project tree at `/workspace`
- copying seed files or fixture data into a sandbox
- exposing operator tooling such as `claude`, `codex`, `git`, or `uv`
Services are declared explicitly as `required_services` or `optional_services` and become sibling containers on the sandbox network. Required services must be healthy before the primary is reported ready; optional services start in parallel and report their initial ready or failed result through sandbox events without blocking the primary ready transition. This supports cases such as adding a database or service sidecar next to the primary runtime image.
### Event subscription and replay
The daemon exposes a per-sandbox ordered event stream with:
- full replay from `from_sequence=0`
- daemon-issued event sequence anchors for incremental replay
- monotonic `sequence` numbers per sandbox
- optional current-state snapshots for active exec visibility
Each `SandboxEvent` carries a `oneof details` field that is one of `SandboxPhaseDetails` (sandbox lifecycle transitions, errors, and stop reasons), `ExecEventDetails` (exec state and exit code), or `ServiceEventDetails` (service ready or failed). The top-level `sandbox_state` field reflects the sandbox state at the time the event was emitted, regardless of which `details` variant is set.
`CreateSandbox` returns `CreateSandboxResponse` containing a full `SandboxHandle` with the reserved `sandbox_id`, initial state, and the daemon-issued `last_event_sequence` cursor that seeds incremental event subscription without a snapshot/subscription race.
This supports long-running orchestration, reconnect after temporary client loss, and SDK-side waiting without pretending accepted operations are already complete.
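A minimal sketch of the replay semantics, assuming sequences start at 1 and treating `from_sequence` as the first sequence to deliver (an illustrative reading, not the daemon's exact contract):

```go
package main

import "fmt"

// SandboxEvent models the ordered per-sandbox stream: a monotonic sequence
// plus one details variant (sandbox phase, exec, or service), simplified
// here to a single string.
type SandboxEvent struct {
	Seq     uint64
	Details string
}

// replay returns the stored events starting at fromSequence.
// fromSequence=0 yields the full history; a daemon-issued sequence anchor
// yields only the events the caller has not seen yet.
func replay(stored []SandboxEvent, fromSequence uint64) []SandboxEvent {
	var out []SandboxEvent
	for _, ev := range stored {
		if ev.Seq >= fromSequence {
			out = append(out, ev)
		}
	}
	return out
}

func main() {
	stored := []SandboxEvent{
		{1, "sandbox: CREATING"},
		{2, "service: db READY"},
		{3, "sandbox: READY"},
	}
	fmt.Println(len(replay(stored, 0))) // full replay
	fmt.Println(len(replay(stored, 3))) // incremental replay from an anchor
}
```

Persisting the stream (rather than keeping it in memory) is what makes both forms of replay survive daemon restart and temporary client disconnects.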
### SDK layering and integration choices
The repository now exposes three caller integration styles:
- `sdk/go/rawclient` for transport-level Go integrations that want protobuf requests, direct RPC access, typed error translation, and manual event-stream control.
- `sdk/go/client` for most Go applications that want public Go types, direct-parameter methods, `wait` helpers, and channel-based event consumption.
- `sdk/python` for async Python applications that want the same lifecycle semantics with Python-native async iteration.
This separation keeps the wire contract stable while letting each language expose caller-friendly northbound APIs.
## Technical Constraints And External Dependencies
### Runtime and deployment constraints
- The system is Docker-first. Runtime lifecycle, networking, container creation, and exec execution depend on a reachable Docker daemon.
- The daemon is a single-writer local control plane. It acquires an exclusive host lock at a hardcoded platform path and refuses to start if another daemon already owns that lock.
- gRPC transport is exposed over a Unix domain socket only, at a hardcoded platform-specific path (not configurable).
- The current service still keeps sandbox and exec projections in memory, but event history is persisted in `ids.db` and replay survives daemon restart. See *Daemon State Management* for the full state classification, persistence rules, and restart recovery contract.
- A restarted daemon performs full state recovery by loading persisted sandbox/exec configs, replaying events, and reconciling with Docker container inspect results. Restored sandboxes support all operations (exec, stop, resume, delete) without restriction.
- Caller-visible ID uniqueness is stronger than in-memory lifecycle state: the daemon persists historical `sandbox_id` and `exec_id` reservations in a platform-derived `ids.db` file so old IDs remain unavailable after daemon restart.
- Runtime stop, delete, and failed-create cleanup do not reuse caller RPC contexts. The service and runtime switch to daemon-owned background contexts so cleanup can finish even if the initiating request has already ended.
### Filesystem and security constraints
- Unsafe or invalid create inputs are rejected at the RPC boundary instead of being accepted and failing later in the background.
- `mounts` and `copies` require absolute container targets and real host file or directory sources.
- `copies` and builtin-tool shadow copies require `runtime.state_root` because the daemon materializes copied content into daemon-owned filesystem state.
- Runtime exec assumes a non-root sandbox user model. Writable paths must remain writable to that runtime user.
- Built-in resources are daemon-defined. Callers can select capability IDs but cannot replace them with arbitrary hidden host paths through the public SDK surface.
### External dependencies
- Go daemon and protocol implementation (structured logging via stdlib `log/slog` with JSON output)
- Docker Engine API Go SDK and a reachable Docker daemon
- gRPC and protobuf for the wire contract
- Go gRPC client stack for `sdk/go/rawclient` and `sdk/go/client`
- Python `grpcio` client stack and `uv`-managed SDK environment
- Optional host resources such as `SSH_AUTH_SOCK`, `~/.claude`, `~/.codex`, `~/.agents`, `~/.cache/uv`, `~/.local/share/uv`, and local cache directories
## Important Design Decisions And Reasons
### Accepted operations stay distinct from completed state
Slow operations return after acceptance, not after completion. The service then exposes authoritative state through `GetSandbox` and `GetExec`, plus ordered events through `SubscribeSandboxEvents`. This keeps the protocol honest about asynchronous runtime work while still allowing the SDK to offer convenient waiting.
### Exec snapshots join the sandbox event stream atomically
Exec waits rely on sandbox events, so the protocol cannot ask SDKs to stitch an exec snapshot to a sandbox stream with a sequence anchor borrowed from some later or different read. `GetExec().exec.last_event_sequence` anchors the exec snapshot to the same ordered sandbox event stream, which lets SDK wait paths subscribe without a handoff race and without fallback polling.
### Historical IDs are reserved persistently, not only while a sandbox is live
Caller-provided `sandbox_id` and `exec_id` values are part of the external contract, so the daemon does not treat them as temporary in-memory handles. It reserves them in a persistent registry before accepting create operations, which prevents accidental ID reuse after daemon restart and keeps conflict detection independent from Docker object discovery or any external product database.
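The reserve-before-accept rule can be sketched with an in-memory stand-in. The real registry persists reservations in the bbolt-backed `ids.db` so they survive restart; `idRegistry` here is a hypothetical simplification that only shows the once-ever semantics:

```go
package main

import (
	"fmt"
	"sync"
)

// idRegistry is a toy stand-in for the persistent historical ID registry.
type idRegistry struct {
	mu       sync.Mutex
	reserved map[string]bool
}

// reserve claims an ID exactly once. A second reservation of the same ID
// fails even if the sandbox that first used it has since been deleted,
// which is what keeps caller-visible IDs unique across the daemon's
// whole history rather than only while a sandbox is live.
func (r *idRegistry) reserve(id string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.reserved[id] {
		return fmt.Errorf("id %q already used", id)
	}
	r.reserved[id] = true
	return nil
}

func main() {
	reg := &idRegistry{reserved: map[string]bool{}}
	fmt.Println(reg.reserve("sb-1")) // first reservation succeeds
	fmt.Println(reg.reserve("sb-1")) // historical reuse is rejected
}
```

Because the reservation happens before the create is accepted, an ID conflict surfaces as a synchronous fail-fast error instead of a background failure.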
### Docker access stays on one structured client path
The runtime backend uses a single Docker Engine API client per service instance instead of spawning Docker CLI subprocesses for inspect, lifecycle, image, or exec work. This keeps Docker interactions on structured API surfaces, removes text-parsing dependencies, and makes shutdown explicit because the runtime client is returned as part of the daemon's closer chain.
### The Go SDK is explicitly split into raw and high-level layers
`sdk/go/rawclient` owns socket resolution, dialing, raw RPC calls, error translation, and raw event streams. `sdk/go/client` owns public Go types, wait defaults, sequence-based wait paths, and channel-based subscription. Keeping those concerns separate lets transport-aware tools stay close to protobuf while ordinary Go callers get a smaller, more idiomatic API.
### The public SDKs are direct-parameter, not request-wrapper driven
`sdk/go/client.New()` and Python `AgentsSandboxClient()` both resolve the socket path internally and expose direct-parameter lifecycle and exec methods. Protobuf request wrappers still exist at the transport layer, but they are no longer the preferred public northbound API for ordinary callers. This keeps the high-level SDK surfaces smaller and matches how callers actually use the service.
### Filesystem ingress is split by semantics
`mounts`, `copies`, and `builtin_tools` are separate concepts because they have different security and lifecycle behavior. A bind mount keeps a live host path, a copy materializes daemon-owned content, and a built-in resource is a daemon-defined shortcut with its own validation and resolution rules. Keeping them separate avoids stringly typed overloading and keeps fail-fast validation clear.
### Built-in resources remain daemon-owned capabilities
Tool names such as `claude`, `codex`, `git`, `uv`, and `npm` are resolved by the daemon into their underlying mounts, not by caller-supplied hidden path conventions. This keeps host-sensitive path logic centralized and lets the daemon decide when bind mounting is safe and when shadow-copy fallback is required. Multiple tools may share a mount; the daemon deduplicates by mount ID before materializing.
### Exec output is redirected to disk inside the container
Exec stdout and stderr are redirected inside the container to bind-mounted host files (`{ArtifactOutputRoot}/{sandbox_id}/{exec_id}.stdout.log` / `.stderr.log`). The daemon does not attach to exec output streams and carries zero I/O buffer per exec. `CreateExecResponse` returns the host-side log paths so SDK callers can read output independently. This design keeps exec output durable across daemon restarts.
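The log layout and in-container redirection can be sketched as follows. `execLogPaths` mirrors the documented path layout, while `shellCommand` is one hypothetical way to express the redirection; the daemon's actual invocation may differ, and the redirect targets would be the container-side mount points of those host files:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// execLogPaths builds the documented layout:
// {ArtifactOutputRoot}/{sandbox_id}/{exec_id}.stdout.log and .stderr.log.
func execLogPaths(root, sandboxID, execID string) (stdout, stderr string) {
	dir := filepath.Join(root, sandboxID)
	return filepath.Join(dir, execID+".stdout.log"),
		filepath.Join(dir, execID+".stderr.log")
}

// shellCommand wraps a command so its output is redirected to files from
// inside the container. Because the redirection happens in the container,
// the daemon never holds an attached stream or per-exec output buffer.
func shellCommand(cmd, stdoutPath, stderrPath string) []string {
	return []string{"sh", "-c", fmt.Sprintf("%s >%s 2>%s", cmd, stdoutPath, stderrPath)}
}

func main() {
	out, errp := execLogPaths("/var/lib/agbox/artifacts", "sb-1", "ex-1")
	fmt.Println(out)
	fmt.Println(errp)
	fmt.Println(shellCommand("echo hi", out, errp))
}
```

Putting the redirection inside the container, rather than attaching the daemon to output streams, is what removes the daemon from the I/O hot path entirely.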
### Cleanup and ownership stay runtime-local
The daemon derives ownership from in-memory sandbox state plus namespaced Docker labels, and it removes primary containers, service containers (both required and optional), dedicated networks, and daemon-owned filesystem state during delete and failed create cleanup. Cleanup runs on daemon-owned contexts instead of request-scoped cancellation, which keeps teardown reliable after accepted async operations and failed materialization paths.
## Proto Generation
Go and Python bindings are generated from `api/proto/service.proto` using pinned tool versions:
| Tool | Version |
|---|---|
| protoc | v6.31.1 (release tag v31.1) |
| protoc-gen-go | v1.36.11 |
| protoc-gen-go-grpc | v1.6.1 |
| grpcio-tools | from sdk/python dev dependencies |
Regenerate bindings:
```bash
bash scripts/generate_proto.sh
```

The script downloads and caches protoc in `.local/protoc/` (project-local, git-ignored) and installs Go plugins in `.local/go-bin/`. CI runs `scripts/lints/check_proto_consistency.sh` automatically through `run_test.sh lint` to ensure checked-in bindings stay in sync with the proto source.
## Related Documents
- `README.md`
- `docs/sdk_go_usage.md`
- `docs/sdk_async_usage.md`
- `docs/configuration_reference.md`
- `docs/sandbox_container_lifecycle.md`
- `docs/container_dependency_strategy.md`
- `docs/mount_and_copy_strategy.md`
- `docs/declarative_yaml_config.md`