Experiment YAML
Lookup reference for the AXP experiment YAML schema: fields, object shapes, shorthand forms, constraints, and validation rules.
Experiment YAML is the file format passed to axp run and axp local run.
The current parser accepts schema_version: 2 and rejects unknown fields at every level. For the canonical machine-readable schema, use experiment.v2.schema.yaml.
Schema URL
Use the version-pinned schema URL for editor validation:
# yaml-language-server: $schema=https://docs.514.ai/schema/experiment.v2.schema.yaml

Published schema URLs:
| URL | Use |
|---|---|
| https://docs.514.ai/schema/experiment.schema.yaml | Latest supported schema |
| https://docs.514.ai/schema/experiment.v2.schema.yaml | Version-pinned v2 schema |
| https://docs.514.ai/schema/experiment.v1.schema.yaml | Legacy v1 schema. v1 experiments no longer run; this remains available so older files that pin it still validate in-editor while migrating to v2. |
axp experiment schema prints the latest supported schema to stdout.
Field index
| Object | Fields |
|---|---|
| Top-level experiment | schema_version, id, name, description, agents, prompts, environments, products, extensions, environment_variables, secrets, files, tests, limits |
| Agent | name, model |
| Model | name, effort, context_window_size, thinking, fast |
| Prompt | id, prompt, description, tags |
| Environment | name, setup, description, tags, commit |
| Product | name, type, setup, version, commit, description, tags |
| Setup | name, script, description, tags, files, environment_variables, secrets, mcp_servers, setup_checks |
| Extension | id, description, tags, agents, prompts, environments, products, extensions |
| Environment variable | name, value |
| File entry | name, source, sha256, dest |
| Test | name, script |
| Setup check | name, script |
| MCP server | name, type, command, args, url, env, headers |
| MCP stdio env entry | name, from |
| MCP HTTP and SSE header | name, value |
| Limits | max_turns, max_time_seconds, max_cost_usd |
Minimal example
# yaml-language-server: $schema=https://docs.514.ai/schema/experiment.v2.schema.yaml
schema_version: 2
id: cli-install
name: "CLI install"
agents:
  - name: claude
    model: anthropic/claude-sonnet-4.6
prompts:
  - id: install
    prompt: |
      Install the CLI and write its version to /workspace/version.txt.
products:
  - name: cli
    type: CLI
    setup: npm pack
tests:
  application:
    - name: version-file-exists
      script: test -f /workspace/version.txt
limits:
  max_turns: 25
  max_time_seconds: 300
  max_cost_usd: 0.50

Top-level experiment
An experiment defines what you want to learn about agents using a product surface. It names the agents, prompts, products, environments, and tests AXP uses to compute and run variants.
| Field | Required | Type / shape | Notes |
|---|---|---|---|
| schema_version | Yes | Integer | Must be 2. |
| id | Yes | Kebab-case string | Stable experiment id. Should match the YAML file name without .yaml. |
| name | Yes | String | Human-readable experiment name. |
| description | No | String | Optional context used when analyzing outcomes. |
| agents | No | Agent axis | Optional at the top level only if an extension supplies agents. Every resolved variant must have an agent. |
| prompts | No | Prompt axis | Optional at the top level only if an extension supplies prompts. Every resolved variant must have a non-empty prompt. |
| environments | No | Environment axis | Optional. |
| products | No | Product axis | Optional. |
| extensions | No | Extension list | Optional. If present, only extension-derived variants are created. |
| environment_variables | No | Environment variable list | {name, value} entries injected into every variant. Values can be literals or axp://secrets/<slug> references resolved from your org secret store. |
| secrets | No | String list | Deprecated alias for host environment-variable names. Prefer environment_variables. |
| files | No | File entry list | Host files staged into every variant before setup runs. |
| tests | Yes | Tests object | Must contain at least one application or introspection test. |
| limits | Yes | Limits object | Run caps. |
Top-level environment_variables, secrets, and files apply to every variant. MCP servers and setup checks are setup-owned fields.
Variant axes
Variant axes are the experiment inputs AXP combines into runnable variants. This lets you define the dimensions you care about once, then compare results by agent, prompt, environment, and product.
variants = agents × prompts × environments × products

environments and products are optional. If an axis is absent, it is omitted from the variant coordinate.
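As an illustration of the cross product, the fragment below declares two agents, two prompts, and one product, with no environments axis. By the formula above this would resolve to 2 × 2 × 1 = 4 variants. The prompt ids, prompt text, and product name are hypothetical, and required fields such as schema_version, id, name, tests, and limits are left out for brevity.

```yaml
# Sketch only: a partial experiment showing the axes, not a complete file.
agents:
  - claude
  - codex
prompts:
  - id: install
    prompt: "Install the CLI."
  - id: install-and-verify
    prompt: "Install the CLI and confirm it runs."
products:
  - name: cli
    type: CLI
    setup: "npm install -g my-cli"
# Expected variant count: 2 agents x 2 prompts x 1 product = 4.
# No environments axis is declared, so it is left out of each variant coordinate.
```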
Agents
agents defines which coding agents run the experiment. Each agent can use its default model or pin an explicit model and model controls.
Accepted shapes:
# String form: one agent, provider-default model.
agents: claude

# List of strings: multiple agents, provider-default models.
agents:
  - claude
  - codex

# Object form: one agent with an explicit model.
agents:
  name: claude
  model: anthropic/claude-sonnet-4.6

# List of objects: multiple explicit agent/model pairs.
agents:
  - name: claude
    model: anthropic/claude-sonnet-4.6

model is optional but recommended. If omitted, AXP uses the provider-default model for that agent.
Agent object
| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | claude, codex, or cursor | Coding agent to run. |
| model | No | Model id string or model object | Bare agent names use provider defaults. |
Model object
A model object configures the model used by an agent. Optional controls are passed through when the selected agent supports them.
| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | String | Provider/model id, such as anthropic/claude-opus-4.8 or openai/gpt-5. |
| effort | No | low, medium, high, x-high, max | Reasoning effort control. |
| context_window_size | No | String | Context window hint, such as 1M or 200k. |
| thinking | No | Boolean | Enables thinking mode when supported. |
| fast | No | Boolean | Enables fast mode when supported. |
agents:
  - name: claude
    model:
      name: anthropic/claude-opus-4.8
      effort: high
      context_window_size: 1M
      thinking: true
  - name: codex
    model: openai/gpt-5

Prompts
prompts defines the tasks agents try to complete. Each resolved variant receives one final prompt.
Accepted shapes:
# String form: one prompt.
prompts: "Build the report."

# List of strings: multiple prompts.
prompts:
  - "Build the report."
  - "Build the report and explain your steps."

# Object form: list of named prompts.
prompts:
  - id: detailed
    prompt: "Build the report and explain your steps."

Prompt object
A prompt object gives a task stable metadata, so Results can filter and group by the prompt that produced each run.
| Field | Required | Type / values | Notes |
|---|---|---|---|
| id | Yes | Kebab-case string | Stable prompt id. Recorded for filtering in Results. |
| prompt | Yes | String | Task text given to the agent. |
| description | No | String | Human-readable notes. |
| tags | No | String list | Free-form labels. |
Bare prompt strings get positional ids: p0, p1, and so on.
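To make the positional ids concrete, the sketch below annotates a bare-string list with the ids it would receive, followed by an object-form alternative that pins ids explicitly. The two fragments are alternatives separated by a document marker, not one file, and the ids in the second fragment are hypothetical.

```yaml
# Bare strings: ids are assigned by position.
prompts:
  - "Build the report."                         # recorded as p0
  - "Build the report and explain your steps."  # recorded as p1
---
# Object form: ids chosen by the author (hypothetical names), so they do not
# change if the list is reordered.
prompts:
  - id: brief
    prompt: "Build the report."
  - id: detailed
    prompt: "Build the report and explain your steps."
```

Explicit ids may be preferable when results are compared across edits to the file, since inserting a bare string would shift every later positional id.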
Environments
environments defines the sandbox conditions around the agent, such as installed tools, fixtures, or external integrations. Use environments when you want to compare how agents perform under different surrounding conditions.
Accepted shapes:
# String form: one setup script with a generated environment name.
environments: "pip install -r requirements.txt"

# Object form: one named environment.
environments:
  name: workspace
  setup: "pip install -r requirements.txt"

# List form: multiple named environments.
environments:
  - name: workspace
    setup: "pip install -r requirements.txt"

Environment object
An environment object names one sandbox setup condition. Its setup prepares the sandbox before the agent receives the prompt.
| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | Kebab-case string | Environment coordinate. Recorded for filtering in Results. |
| setup | Yes | Setup | Prepares the sandbox before the agent runs. |
| description | No | String | Human-readable notes. |
| tags | No | String list | Free-form labels. |
| commit | No | String | Source commit of the environment under test. |
Bare environment strings are shorthand for an environment whose setup is that string.
Products
products defines the agent-facing surface under test. A product can be a CLI, API, MCP server, SDK, docs surface, or other tool the agent uses to complete the prompt.
Accepted shapes:
# String form: one setup script with a generated product name.
products: "npm install -g my-cli"

# Object form: one named product.
products:
  name: my-cli
  type: CLI
  version: "1.2.0"
  setup: "npm install -g my-cli"

# List form: multiple named products.
products:
  - name: my-cli
    type: CLI
    version: "1.2.0"
    setup: "npm install -g my-cli"

Product object
A product object names what you want to compare. Product metadata is recorded so Results can filter and group by product and product version.
| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | Kebab-case string | Product coordinate. Recorded for filtering in Results. |
| type | No | Product type | Defaults to Other. |
| setup | Yes | Setup | Prepares the product before the agent runs. |
| version | No | String | Product version. Quote numeric versions, such as "25.3". |
| commit | No | String | Source commit of the product under test. |
| description | No | String | Human-readable notes. |
| tags | No | String list | Free-form labels. |
Product type
Allowed values: CLI, MCP, API, Skill, SDK, Schema, Docs, Marketing, Agents.md, Other.
Setup
setup prepares the sandbox for an environment or product. Use setup scripts to install tools, create fixtures, start services, or expose resources the agent needs.
Each variant has its own isolated /workspace. A setup script cannot rely on files or side effects created by another variant.
Accepted shapes:
# String form: one setup script.
setup: "npm install"

# List of strings: multiple setup scripts, run in order.
setup:
  - "npm install"
  - "npm test -- --help"

# Object form: one named setup with optional scoped resources.
setup:
  name: install-cli
  script: "npm install"

# List of objects: multiple named setups, run in order.
setup:
  - name: install-cli
    script: "npm install"
  - name: smoke-cli
    script: "npm test -- --help"

# Mixed list: strings and setup objects can be combined.
setup:
  - "npm install"
  - name: smoke-cli
    script: "npm test -- --help"

Setup object
A setup object is the named form of setup. Use it when setup needs its own files, environment variables, MCP servers, or setup checks.
| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | Kebab-case string | Required for object form. |
| script | Yes | String | Bash run before setup checks and before the agent. Secret values are not injected here. |
| description | No | String | Human-readable notes. |
| tags | No | String list | Free-form labels. |
| files | No | File entry list | Accepted on setup objects, but current runs only stage top-level files. Use top-level files when the run needs host files delivered. |
| environment_variables | No | Environment variable list | Runtime env vars scoped to variants that use this setup. |
| secrets | No | String list | Deprecated. |
| mcp_servers | No | MCP server list | MCP servers exposed to the agent. |
| setup_checks | No | Setup check list | Checks run after setup and before the agent. |
When both a product and environment contribute setup, product setup runs before environment setup.
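A sketch of how a setup object with scoped resources might sit next to an environment setup is shown below. The product, environment, setup, and check names are hypothetical, and the fragment omits the rest of the experiment.

```yaml
# Sketch only: product setup in object form, environment setup in string form.
products:
  - name: my-cli
    type: CLI
    version: "1.2.0"
    setup:
      name: install-cli
      script: "npm install -g my-cli"
      environment_variables:        # scoped to variants that use this setup
        - name: LOG_LEVEL
          value: debug
      setup_checks:                 # run after setup and before the agent
        - name: cli-on-path
          script: my-cli --version
environments:
  - name: workspace
    setup: "pip install -r requirements.txt"
# Per the ordering rule above, install-cli (product setup) runs before the
# environment's pip install step.
```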
Setup checks
setup_checks verify that setup produced a usable sandbox before the agent starts. A failed setup check stops that variant before any agent work happens.
setup_checks:
  - name: cli-on-path
    script: my-cli --version

Setup check object
| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | Kebab-case string | Appears in setup-checks/<name>.json. |
| script | Yes | String | Bash check. A non-zero exit aborts the variant before the agent runs. |
MCP servers
mcp_servers exposes MCP servers to the agent for variants that use this setup. Use this field when the product surface or environment includes an MCP tool.
mcp_servers:
  - name: fixture-sentinel
    type: stdio
    command: /workspace/fixture-mcp.py
    args: []
  - name: axp
    type: http
    url: http://localhost:3001/mcp

MCP server object
An MCP server object defines one server and its transport. The required connection fields depend on type.
| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | String | Must be unique within the setup's mcp_servers. |
| type | Yes | stdio, http, or sse | Transport. |
| command | For stdio | String | Executable path or command inside the sandbox. |
| args | No | String list | Stdio command arguments. |
| url | For http / sse | String | MCP endpoint URL. |
| env | No | MCP stdio env entries | Only valid for stdio. |
| headers | No | MCP header entries | Only valid for http / sse. |
Transport-specific rules:
- stdio uses command, optional args, and optional env.
- http and sse use url and optional headers.
- Mixing stdio-only and endpoint-only fields is rejected.
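The transport rules can be illustrated with one valid entry per transport family and one rejected combination. Server names, the script path, and the argument are hypothetical; the rejected entry is commented out so the fragment stays valid.

```yaml
mcp_servers:
  # Valid: stdio with command and optional args.
  - name: local-tools
    type: stdio
    command: /workspace/tools-mcp.py
    args: ["--verbose"]
  # Valid: http with url.
  - name: remote-tools
    type: http
    url: http://localhost:3001/mcp
  # Rejected if uncommented: url is an endpoint-only field and cannot
  # appear on a stdio server.
  # - name: mixed
  #   type: stdio
  #   command: /workspace/tools-mcp.py
  #   url: http://localhost:3001/mcp
```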
MCP stdio env entry
env forwards declared secret values to a stdio MCP process. Each entry is either a bare secret name or an object:
env:
  - GITHUB_TOKEN
  - name: GH_AUTH
    from: GITHUB_TOKEN

| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | String | Env var name as seen by the MCP server process. |
| from | Yes | Secret name | Declared secret name whose value is forwarded. |
MCP HTTP and SSE header object
headers attaches HTTP headers to an http or sse MCP server. Use placeholders when a header needs a declared secret value.
headers:
  - name: Authorization
    value: "Bearer ${SUPABASE_SERVICE_ROLE_KEY}"

| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | String | HTTP header name. Header names must be unique per server, case-insensitively. |
| value | Yes | String | May contain ${SECRET_NAME} placeholders. Bare $NAME is literal. |
MCP env entries and header placeholders reference environment variable names visible to the variant. Values can come from literal environment_variables, axp://secrets/<slug> references, or the deprecated secrets list.
Rules enforced at axp experiment validate time:
- Every stdio env[*].from and every ${NAME} placeholder in a header value must reference a name visible to the variant.
- Bare $NAME is treated as a literal; only ${NAME} is a placeholder.
- HTTP header names are unique per server, case-insensitively.
- command/args/env are only valid for type: stdio; url/headers are only valid for type: http/sse.
Resolved secret values are written into the agent session frame and can appear in run artifacts. Treat artifacts as sensitive whenever an experiment forwards secrets to MCP servers.
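Putting the pieces together, a hedged sketch of secret forwarding might declare the names at the top level and reference them from the servers. The product, setup, and server names, the script path, and the supabase-service-role slug are hypothetical.

```yaml
# Declared names: these are what env[*].from and ${NAME} can reference.
environment_variables:
  - name: GITHUB_TOKEN
    value: axp://secrets/prod-gh
  - name: SUPABASE_SERVICE_ROLE_KEY
    value: axp://secrets/supabase-service-role   # hypothetical slug
products:
  - name: tools
    type: MCP
    setup:
      name: start-tools
      script: "true"
      mcp_servers:
        - name: github
          type: stdio
          command: /workspace/github-mcp.py      # hypothetical path
          env:
            - name: GH_AUTH                      # name seen by the MCP process
              from: GITHUB_TOKEN                 # declared name being forwarded
        - name: supabase
          type: http
          url: http://localhost:3001/mcp
          headers:
            - name: Authorization
              value: "Bearer ${SUPABASE_SERVICE_ROLE_KEY}"   # braces required
```

Because both values end up in the agent session, artifacts from a run like this one should be handled as sensitive.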
Extensions
extensions refine the variant set when the base axes do not express the combinations you need. Use extensions to narrow an axis, swap products or environments for a subtree, or append prompt guidance to a slice of variants.
If an experiment declares any extensions, AXP creates only extension-derived variants.
Extension object
An extension object is one node in the refinement tree. Nested extensions are recursively cross-multiplied with their parents.
| Field | Required | Type / values | Notes |
|---|---|---|---|
| id | Yes | Kebab-case string | Must be unique among sibling extensions. |
| description | No | String | Human-readable notes. |
| tags | No | String list | Added to resolved variant tags. |
| agents | No | Agent list | Replaces inherited agents for this extension subtree. |
| prompts | No | Prompt list | Appended to inherited prompt text. Does not replace the prompt axis. |
| environments | No | Environment list | Replaces inherited environments for this extension subtree. |
| products | No | Product list | Replaces inherited products for this extension subtree. |
| extensions | No | Extension list | Nested extensions. |
prompts:
  - id: analyze
    prompt: "Read /workspace/task.md and write /workspace/report.json."
extensions:
  - id: with-cli
    products:
      - name: cli
        type: CLI
        setup: "curl -fsSL https://clickhouse.com/ | sh"
    prompts: ["Use the ClickHouse CLI for the analysis."]
  - id: without-cli
    products:
      - name: no-cli
        setup: "true"
    prompts: ["Do not use the ClickHouse CLI; use another local method."]

Environment variables
environment_variables injects environment variables into variants at runtime. Use it for non-secret runtime configuration and supported secret references.
environment_variables:
  - name: LOG_LEVEL
    value: debug
  - name: GITHUB_TOKEN
    value: axp://secrets/prod-gh
  - name: DATABASE_URL
    value: axp://secrets/staging-db

Environment variable object
| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | Env-var name | Must match ^[A-Z_][A-Z0-9_]*$. Harness-reserved names and prefixes are rejected. |
| value | Yes | String | Literal value or axp://secrets/<slug> reference. Other values beginning with axp:// are rejected. |
Store secret values once with axp secrets set <slug>, then reference them with axp://secrets/<slug>. Both axp run and axp local run resolve these references before injecting env vars into the sandbox.
A referenced slug that does not exist in your org fails at preflight.
Reserved names:
- ANTHROPIC_API_KEY
- ANTHROPIC_BASE_URL
- OPENAI_API_KEY
- OPENAI_BASE_URL
- CURSOR_API_KEY
- MODEL
- MAX_TURNS
- IS_SANDBOX
- TRACEPARENT
- any name beginning with AXP_, CLAUDE_CODE_, CODEX_, CURSOR_, or OTEL_
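The naming pattern and the reserved list can be read against a few concrete names. The variable names below are hypothetical; the rejected entries are commented out so the fragment stays valid.

```yaml
environment_variables:
  - name: LOG_LEVEL          # accepted: matches ^[A-Z_][A-Z0-9_]*$
    value: debug
  - name: _INTERNAL_FLAG     # accepted: a leading underscore is allowed
    value: "1"
  # Rejected if uncommented:
  # - name: log_level        # lowercase does not match the pattern
  #   value: debug
  # - name: 2FA_TOKEN        # a name cannot start with a digit
  #   value: "123456"
  # - name: MODEL            # reserved name
  #   value: custom
  # - name: AXP_DEBUG        # reserved prefix AXP_
  #   value: "1"
  # - name: OTEL_ENDPOINT    # reserved prefix OTEL_
  #   value: http://localhost:4317
```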
Files
Top-level files stages host files or directories into every variant's /workspace before setup runs.
setup.files is accepted in experiment YAML, but current runs only stage top-level files. Put host file staging at the top level when you need the files delivered during a run.
files:
  - name: my-cli
    source: ../build/mycli
    dest: tools/mycli
  - source: https://example.com/fixtures/data.bin
    sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
    dest: fixtures/data.bin

File entry object
| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Sometimes | Kebab-case string | Required when source is omitted. Used as the --file NAME=SOURCE handle. |
| source | Sometimes | Host path or http(s) URL | Required when name is omitted. Relative paths resolve against the YAML file's directory. |
| sha256 | No | 64-char hex string | Valid for file sources and URL downloads. Invalid for directory sources. |
| dest | Yes | Workspace-relative path | Absolute paths, .., .axp-bridge, and :: are rejected. |
Notes:
- Directory sources copy their contents under dest/.
- File sources land as a single file at dest.
- URL sources must be publicly fetchable and land as a single file at dest; unpack archives in setup if needed.
- Sandboxes are Linux containers; macOS binaries staged from a laptop will not run there.
- Directory walks honor .axpignore, not .gitignore.
- --file NAME=SOURCE binds or overrides the source of a named entry.
- --file SOURCE::DEST stages an ad-hoc entry into every variant.
- Missing or unbound sources abort real runs at preflight. --resolve-variants renders MISSING / UNBOUND annotations without failing.
- Staging failures roll the variant up as status=error / exit_reason=staging_failed; the rest of the run keeps going.
- Platform and local runs both stage top-level files.
- Treat experiment YAML like a script you run: sources are read with your permissions and may point anywhere on the host.
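Two of the cases in the notes above, a named entry whose source is bound at run time and a directory source, might be written as in the sketch below. The entry name, the directory path, and the dest values are hypothetical.

```yaml
files:
  # Named entry with no source: it stays UNBOUND until a run supplies
  # --file my-cli=SOURCE, and an unbound source aborts a real run at preflight.
  - name: my-cli
    dest: tools/mycli
  # Directory source: relative to the YAML file's directory; its contents
  # are copied under fixtures/ in each variant's /workspace.
  # sha256 is omitted because it is invalid for directory sources.
  - source: ./fixtures
    dest: fixtures
```

Leaving the source out of a named entry keeps machine-specific build paths out of the YAML, at the cost of requiring the binding on every run.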
Tests
tests defines how AXP scores each run. Application tests check the output state; introspection tests check how the agent got there.
tests:
  application:
    - name: report-exists
      script: test -f /workspace/report.json
  introspection:
    - name: under-thirty-tool-calls
      script: '[ "$(jq ".tool_calls | length" "$AXP_TRACE_PATH")" -lt 30 ]'

| Field | Required | Type / values | Notes |
|---|---|---|---|
| application | No | Test list | Checks resulting application state, files, commands, or endpoints. |
| introspection | No | Test list | Checks agent behavior through trace artifacts such as AXP_TRACE_PATH. |
Test object
| Field | Required | Type / values | Notes |
|---|---|---|---|
| name | Yes | Kebab-case string | Must be globally unique across application and introspection tests. |
| script | Yes | String | Bash script. Streamed over stdin and not shown to the agent. |
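Since script is plain Bash, a longer check can be written as a YAML block scalar. The sketch below assumes that jq is available in the sandbox and that, as with setup checks, a non-zero exit marks the test as failed; neither is stated in this reference. The test names are hypothetical.

```yaml
tests:
  application:
    - name: report-is-valid-json
      script: |
        set -euo pipefail
        # Fail if the report is missing.
        test -f /workspace/report.json
        # jq exits non-zero when the file is not parseable JSON.
        jq empty /workspace/report.json
  introspection:
    # Names must be unique across both lists, so this one cannot reuse
    # report-is-valid-json.
    - name: trace-file-present
      script: test -f "$AXP_TRACE_PATH"
```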
Limits
limits sets the run caps for each variant execution.
Limits object
| Field | Required | Type / values | Notes |
|---|---|---|---|
| max_turns | Yes | Integer greater than 0 | Agent turn cap. |
| max_time_seconds | Yes | Integer greater than 0 | Wall-clock timeout in seconds. |
| max_cost_usd | Yes | Number greater than 0 | Enforced when the agent reports cumulative cost during the run; if no cost is reported, AXP cannot stop on cost. |
Secrets (deprecated)
secrets is a deprecated alias for host environment variable names injected into variants at runtime.
Prefer environment_variables, which accepts both literal values and axp://secrets/<slug> references. Some MCP secret-forwarding fields still reference declared secret names.
secrets:
  - GITHUB_TOKEN

Secret names must match ^[A-Z_][A-Z0-9_]*$.
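A migration away from secrets might look like the before-and-after sketch below. It assumes the value has already been stored in the org secret store under the slug prod-gh (for example with axp secrets set prod-gh); substitute your own slug. The two fragments are alternatives separated by a document marker, not one file.

```yaml
# Before: deprecated form that names a host environment variable.
secrets:
  - GITHUB_TOKEN
---
# After: same variable name, with the value resolved from the org secret store.
environment_variables:
  - name: GITHUB_TOKEN
    value: axp://secrets/prod-gh
```

Keeping the variable name unchanged means existing MCP env entries and ${GITHUB_TOKEN} header placeholders continue to reference a visible name.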
Validation rules
An experiment is invalid if:
- the YAML contains a field not defined by the schema
- schema_version is not 2
- any required field is missing
- an id that must be kebab-case is not kebab-case
- an agent model id contains ::
- a declared axis is empty
- duplicate ids or names appear where uniqueness is required
- the resolved variant set is empty
- a resolved variant has an empty prompt
- two resolved variants collide on variant_id
- no tests are defined
- test names are duplicated
- an env var or secret name is invalid or reserved
- a file entry has an invalid name, source, sha256, or dest
- an MCP server mixes transport-specific fields
- an MCP server references a secret not visible to the variant
- any limit is not greater than zero
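Several of these rules can be seen side by side in a deliberately invalid file. The sketch below is annotated with the rule each line would break; the owner field and the test names are hypothetical, and the exact error messages are not shown because this reference does not specify them.

```yaml
schema_version: 1                          # invalid: must be 2
id: CLI_Install                            # invalid: not kebab-case
name: "CLI install"
owner: docs-team                           # invalid: field not defined by the schema
agents:
  - name: claude
    model: anthropic::claude-sonnet-4.6    # invalid: model id contains ::
prompts: []                                # invalid: a declared axis is empty
tests:
  application:
    - name: smoke
      script: "true"
  introspection:
    - name: smoke                          # invalid: test names are duplicated
      script: "true"
limits:
  max_turns: 0                             # invalid: must be greater than zero
  max_time_seconds: 300
  max_cost_usd: 0.50
```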
YAML syntax boundaries
The experiment data model is JSON-compatible even though the authoring file is YAML.
- YAML comments are allowed.
- YAML anchors and aliases are allowed when the resolved value is JSON-compatible.
- Custom YAML tags are unsupported.
- Non-string mapping keys are unsupported.
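As a sketch of these boundaries, an anchor can be placed on an ordinary value and reused through an alias, since the resolved result is a plain string. The environment names and the fixture script are hypothetical. The anchor is attached to a real field value rather than a separate holder key, because unknown fields are rejected at every level.

```yaml
environments:
  - name: base
    setup:
      - &install-deps "pip install -r requirements.txt"   # anchor on a string
  - name: with-fixtures
    setup:
      - *install-deps                                     # alias to the same string
      - "python make_fixtures.py"                         # hypothetical script
# Unsupported, shown as comments only:
#   setup: !include install.sh     (custom YAML tag)
#   1: value                       (non-string mapping key)
```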