Docker in Docker
Run a Docker daemon and containers inside an experiment's sandbox.
Some experiments need Docker inside the sandbox. The AXP experimental sandbox supports this, but Docker's defaults conflict with the sandbox network, and the Debian packaging splits the daemon and the docker client into separate packages.
Running Docker in the AXP sandbox
- Install both docker.io and docker-cli in setup.
- Start dockerd with --iptables=false --bridge=none.
- Run containers with --network none.
environments:
  - name: docker-host
    setup:
      - name: install-docker
        script: |
          set -eux
          export DEBIAN_FRONTEND=noninteractive
          apt-get update
          # docker.io is the daemon; docker-cli is the `docker` client (see pitfall below)
          apt-get install -y --no-install-recommends docker.io docker-cli
          docker --version

Setup runs as root (the harness wraps it in sudo -E bash), so sudo inside a setup script is optional. Then start the daemon and run a container, either from a later setup step or, if you want the agent to do it, from the prompt:

# detach the daemon so it survives into the agent / test phases
setsid dockerd --iptables=false --bridge=none >/tmp/dockerd.log 2>&1 </dev/null &
for i in $(seq 1 30); do docker info >/dev/null 2>&1 && break; sleep 1; done
docker run --rm --network none hello-world

Pitfalls
--no-install-recommends docker.io does not install the docker CLI. On the Debian base image, the docker client binary lives in the separate docker-cli package, which docker.io lists only as a Recommends. With --no-install-recommends you get the daemon, but docker fails with "command not found", and the next docker … line fails with it. Install docker.io docker-cli explicitly (or drop --no-install-recommends).
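One way to make this failure obvious is to check for both binaries by name at the end of the install step. The following is a minimal sketch, not part of AXP; `require_cmds` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper (not an AXP or Docker command): report every named
# command that is missing from PATH and return non-zero if any is missing.
require_cmds() {
  local missing=0
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Called as `require_cmds dockerd docker` right after `apt-get install`, this would stop a setup script running under `set -e` and name the binary that is missing, instead of surfacing later as a bare "command not found".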
dockerd rewrites iptables and creates a bridge, which can sever the sandbox's own outbound network. Start it with --iptables=false --bridge=none so it leaves the host firewall and routing alone. Containers then get no network of their own, so run them with --network none (fine for most build / CLI / hello-world tasks). Do not start dockerd as a side effect of installing the package: keep setup to the install only, or guard it. If you genuinely need container networking, you are in deeper waters; add it one change at a time and verify after each change that the sandbox can still reach the model API.
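If a guard against the package starting the daemon is needed, one option on Debian is a temporary policy-rc.d script: invoke-rc.d consults /usr/sbin/policy-rc.d during package installation and treats exit code 101 as "action forbidden by policy". The sketch below assumes the base image starts services through that mechanism, which has not been verified for the AXP sandbox; `deny_service_starts` is a hypothetical helper name.

```shell
# Hypothetical helper: write a policy-rc.d script that denies all service
# starts. The target path is a parameter so the sketch can be tried on a
# scratch file; the real location is /usr/sbin/policy-rc.d.
deny_service_starts() {
  local policy_file="${1:-/usr/sbin/policy-rc.d}"
  # Exit code 101 tells invoke-rc.d that the action is forbidden by policy.
  printf '#!/bin/sh\nexit 101\n' > "$policy_file"
  chmod +x "$policy_file"
}
```

Calling `deny_service_starts` before `apt-get install` and removing the file afterwards with `rm -f /usr/sbin/policy-rc.d` would keep the guard scoped to the install step.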
Validate setup cheaply with --mock before spending tokens. axp run --mock <experiment.yaml> runs setup + setup_checks + tests with a no-op agent — no model spend. setup runs under set -e, so any non-zero exit (for example a docker --version check when the CLI package is missing) aborts the variant before the agent runs; if a --mock run fails at the setup phase, fix your setup before a real agent run. To capture what setup did, write it to a file under /workspace and cat it from a test.
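To capture what setup did, a small wrapper can append each command and its output to a log file. This is a minimal sketch under assumptions: `run_logged` is a hypothetical helper, and the log path used in the usage note, /workspace/setup.log, is an illustrative name rather than an AXP convention.

```shell
# Hypothetical helper: run a command, appending the command line and its
# combined stdout/stderr to a log file. The command's exit status is
# preserved, so a failure still aborts a script running under `set -e`.
run_logged() {
  local log_file="$1"
  shift
  {
    echo "+ $*"
    "$@"
  } >>"$log_file" 2>&1
}
```

In a setup script this might look like `run_logged /workspace/setup.log docker --version`, with a test that runs `cat /workspace/setup.log` so the output shows up in the test results of a --mock run.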
Don't expect a daemon to outlive the step that started it. Detach it with setsid … &. Because tests run in a later exec, have the agent write container output to a file under /workspace and assert on the file rather than relying on a live daemon at test time.
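When starting the detached daemon, it also helps if the readiness wait fails loudly: a plain `for … break` loop falls through silently when dockerd never comes up. The following is a minimal sketch of a bounded wait; `wait_for` is a hypothetical helper name, not an AXP or Docker command.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the attempt budget is used up. Returns 0 on success, 1 on timeout.
wait_for() {
  local attempts="$1"
  shift
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage sketch, after launching the daemon with setsid as shown earlier:
#   wait_for 30 docker info || { tail -n 50 /tmp/dockerd.log; exit 1; }
```

On timeout, the usage line prints the tail of the daemon log before exiting, so the reason dockerd failed to start is visible in the step output.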
Verifying it worked
Have the agent write the container's output to a file under /workspace (the complete example below uses /workspace/container-output.txt and /workspace/docker-info.txt), then assert on those files from a test rather than relying on a live daemon at test time.
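A bare `grep -q` reports only pass or fail. A test that prints why it failed is easier to debug; the following is a minimal sketch, with `assert_file_contains` as a hypothetical helper name.

```shell
# Hypothetical helper: assert that a file exists, is non-empty, and contains
# a pattern. On failure it prints a reason (and the start of the file) to
# stderr and returns non-zero.
assert_file_contains() {
  local file="$1"
  local pattern="$2"
  if [ ! -s "$file" ]; then
    echo "FAIL: $file is missing or empty" >&2
    return 1
  fi
  if ! grep -q -- "$pattern" "$file"; then
    echo "FAIL: '$pattern' not found in $file; first lines:" >&2
    head -n 20 "$file" >&2
    return 1
  fi
  return 0
}
```

A test script could then call `assert_file_contains /workspace/container-output.txt "Hello from Docker!"`, which distinguishes "the agent never wrote the file" from "the container ran but printed something else".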
Complete example
A full, runnable experiment that ties the pieces together:
schema_version: 2
id: docker-in-docker
name: "Docker-in-Docker: agent starts a container"
agents:
  - name: claude
    model: anthropic/claude-sonnet-4.6
prompts:
  - id: start-container
    prompt: |
      Docker is installed but no daemon is running. You have passwordless sudo.
      Start the daemon in the background with
      `sudo dockerd --iptables=false --bridge=none`, wait for `sudo docker info`
      to report a Server, then run `sudo docker run --rm --network none hello-world`.
      Write the container's full stdout to /workspace/container-output.txt and the
      output of `sudo docker info` to /workspace/docker-info.txt.
environments:
  - name: docker-host
    setup:
      - name: install-docker
        script: |
          set -eux
          export DEBIAN_FRONTEND=noninteractive
          apt-get update
          apt-get install -y --no-install-recommends docker.io docker-cli
          docker --version
tests:
  application:
    - name: hello-world-ran
      script: grep -q "Hello from Docker!" /workspace/container-output.txt
    - name: docker-daemon-came-up
      script: grep -qi "Server Version" /workspace/docker-info.txt
limits:
  max_turns: 30
  max_time_seconds: 1200
  max_cost_usd: 2.00