Runs | AXP Documentation

A run is one isolated execution of an experiment variant and the raw data it produces.

Every run is independent and immutable. AXP creates a new sandbox for each run and applies the variant's environment and product setup scripts. No memory or state is shared between runs, so one run cannot depend on files, environment variables, or side effects from another run.

Once the sandbox is ready, AXP gives the agent the variant's prompt and records what happens.

You can review the resulting run data to understand what the agent did, what changed in the workspace, which tests passed, how long it took, and what it cost. AXP uses run data to generate Results when you want to compare across variants, agents, prompts, products, or environments.

Triggering a run

Runs are created through the AXP CLI from an experiment YAML file. By default, AXP resolves the file and creates one run for each variant × repeat combination in the experiment. For example, an experiment with 3 variants and 2 repeats produces 6 runs.

You can override this default with the --variant flag to run only a targeted subset of variants. This is useful when you want to rerun the variant behind a failed run without rerunning the whole experiment.
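The run count described above can be sketched in a few lines of Python. This is only an illustration of the variant × repeat expansion, not AXP's implementation: the function name `plan_runs`, the `only` parameter, and the variant names are hypothetical, and the sketch assumes that selecting a subset of variants leaves the repeat count unchanged.

```python
from itertools import product


def plan_runs(variants, repeats, only=None):
    """Return one (variant, repeat) pair per planned run.

    `only` stands in for selecting a subset of variants; it does not
    model the real CLI's flag syntax.
    """
    selected = [v for v in variants if only is None or v in only]
    # Repeats are numbered from 1 purely for readability.
    return list(product(selected, range(1, repeats + 1)))


# Hypothetical variant names, used only for illustration.
variants = ["baseline", "new-prompt", "new-tools"]

all_runs = plan_runs(variants, repeats=2)
rerun = plan_runs(variants, repeats=2, only={"new-prompt"})

print(len(all_runs))  # 6
print(rerun)          # [('new-prompt', 1), ('new-prompt', 2)]
```

Each pair corresponds to one independent run, which is why rerunning a single variant does not affect the runs already recorded for the others.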

For the exact flags that control which variants and repeats run, see the run command reference. (For local execution, see axp local run.)

Remote vs. local execution

You can execute the same experiment remotely through the AXP platform or locally via containers on your own machine. The command you use decides where the sandboxes live and where the run data is stored.

Remote runs (via `axp run`)

Remote runs execute variants in AXP-managed sandboxes. AXP can run those sandboxes in parallel, which is useful for experiments with many variants or many repeats. Remote runs require you to be signed in (via axp auth login). For supported agents, you do not need to provide your own model provider API key.

axp run polls the run to completion and prints a run ID. To check on a run later, including one you submitted with --detach or detached from with Ctrl-C, pass that ID to axp runs status <id> (add --watch to re-attach and stream until it finishes), or list recent runs with axp runs list.

See axp run for the full command reference.

Local runs (via `axp local run`)

Local runs execute variants in Docker containers on your own machine. You can run those containers in parallel, but concurrency is limited by your machine's CPU, memory, and Docker capacity. Local runs do not require you to be signed in. If you are signed in, you can pass --managed-model-access; otherwise, provide your own model provider API key.

See axp local run for the full command reference.

Inspecting run data

Each run produces the data you use to answer two questions: what happened in this sandbox, and how did this run compare to the others? Run data includes:

final variant configuration
setup, agent, command, and test logs
agent messages and tool calls
workspace file changes
status, timing, token usage, and cost
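To make the two questions concrete, here is a minimal sketch of how the categories above might be modeled and compared across variants once the data is on hand. Everything in it is an assumption for illustration: the `RunRecord` class, its field names, and the sample values are hypothetical and do not reflect AXP's actual run data schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class RunRecord:
    # Hypothetical field names that mirror the categories listed above.
    variant: str
    status: str
    duration_s: float
    total_tokens: int
    cost_usd: float
    tests_passed: int
    tests_total: int
    changed_files: list[str] = field(default_factory=list)


def summarize_by_variant(runs):
    """Group runs by variant and compute a few comparison metrics."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run.variant].append(run)

    summary = {}
    for variant, items in grouped.items():
        total_tests = sum(r.tests_total for r in items)
        passed = sum(r.tests_passed for r in items)
        summary[variant] = {
            "runs": len(items),
            # Avoid dividing by zero when a variant ran no tests.
            "pass_rate": passed / total_tests if total_tests else None,
            "mean_cost_usd": mean(r.cost_usd for r in items),
        }
    return summary


# Made-up sample values, for illustration only.
runs = [
    RunRecord("baseline", "completed", 120.0, 1000, 0.25, 8, 10),
    RunRecord("baseline", "completed", 100.0, 1200, 0.75, 7, 10),
    RunRecord("new-prompt", "completed", 90.0, 900, 0.5, 10, 10),
]
summary = summarize_by_variant(runs)
print(summary["baseline"])
```

A single record answers "what happened in this sandbox", while the grouped summary answers "how did this run compare to the others".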

Run data from remote runs is stored on the AXP platform. You and other members of your organization can inspect it in the AXP dashboard, query it with axp query, or download it for local analysis with axp download.

For local runs, AXP attempts a best-effort upload when you're signed in with axp auth login. If that upload is skipped or fails, the run data stays on your machine under ./.axp/runs/. You can upload it later with axp upload.
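The storage rules above reduce to a small decision function. The sketch below only restates those rules in Python; the names `Location` and `run_data_location` are hypothetical, and it is not part of AXP. It also leaves open whether a local copy remains after a successful upload, since that is not stated here.

```python
from enum import Enum


class Location(Enum):
    PLATFORM = "AXP platform"
    LOCAL_DIR = "./.axp/runs/"


def run_data_location(mode, signed_in=False, upload_succeeded=False):
    """Return where a run's data can be found after the run finishes."""
    if mode == "remote":
        # Remote runs always store their data on the platform.
        return Location.PLATFORM
    if mode == "local":
        # The best-effort upload is only attempted when signed in.
        if signed_in and upload_succeeded:
            return Location.PLATFORM
        # Upload skipped or failed: the data stays on the local machine.
        return Location.LOCAL_DIR
    raise ValueError(f"unknown mode: {mode!r}")


print(run_data_location("remote").value)
print(run_data_location("local", signed_in=True, upload_succeeded=False).value)
```

Data that lands in the local directory is the case where a later axp upload applies.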

See the Results documentation for more information on how to use run data to generate insights and comparisons.
