
# Configuration Files

> Reference for the TOML configuration files used by the Osmosis CLI

The Osmosis CLI uses TOML files for evaluation runs and training runs. Configs must live inside the workspace directory:

| Config type | Required location         | Command                |
| ----------- | ------------------------- | ---------------------- |
| Eval        | `configs/eval/*.toml`     | `osmosis eval submit`  |
| Training    | `configs/training/*.toml` | `osmosis train submit` |
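
The location rule can be checked before submitting. The following is a minimal sketch, not part of the Osmosis CLI: `command_for` is a hypothetical helper, and it assumes the `*.toml` pattern matches files directly inside each directory rather than in nested subdirectories.

```python
from pathlib import PurePosixPath
from typing import Optional

# Required directories and their submit commands, copied from the table above.
CONFIG_DIRS = {
    PurePosixPath("configs/eval"): "osmosis eval submit",
    PurePosixPath("configs/training"): "osmosis train submit",
}


def command_for(config_path: str) -> Optional[str]:
    """Return the submit command for a workspace-relative config path.

    Returns None when the path is not a .toml file directly inside one of
    the required directories (assumption: nested directories do not count).
    """
    path = PurePosixPath(config_path)
    if path.suffix != ".toml":
        return None
    return CONFIG_DIRS.get(path.parent)
```

For example, `command_for("configs/eval/my-rollout.toml")` returns `"osmosis eval submit"`, while a path outside both directories returns `None`.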

<Note>
  Required fields are shown uncommented. Optional fields are commented out in template files and can be omitted to use platform defaults.
</Note>

***

## Eval Config

Used by [`osmosis eval submit`](/cli/command-reference#eval-submit) to submit an evaluation run. The platform clones the workspace repository identified by the `origin` remote and runs the rollout server-side against a platform dataset.

```toml configs/eval/my-rollout.toml
[experiment]
rollout = "my-rollout"                         # Rollout directory under rollouts/
entrypoint = "main.py"                         # Entrypoint relative to rollout dir
model_path = "openai/gpt-5-mini"               # LiteLLM-style model name for the evaluation policy
dataset = "my-platform-dataset"                # Platform dataset name from `osmosis dataset list`
# commit_sha = "abc123..."                     # Pin code to a specific commit

[evaluation]
# Optional. Omit values to use platform defaults.
# limit = 200                                  # First N rows; omit for random 10% sample
# n = 1                                        # Evaluation attempts per row
# batch_size = 1                               # Rows evaluated per batch
# pass_threshold = 1.0                         # Minimum passing score
# agent_workflow_timeout_s = 450               # Agent workflow timeout per row
# grader_timeout_s = 150                       # Grader timeout per row

# [env]
# LOG_LEVEL = "INFO"                           # Non-secret literal env var

[secrets]
# Required for eval configs. Default OpenAI eval models need this.
# Use required = [] when the evaluation needs no secret refs.
required = ["OPENAI_API_KEY"]
```

### `[experiment]`

| Field        | Type  | Required | Description                                                                                |
| ------------ | ----- | -------- | ------------------------------------------------------------------------------------------ |
| `rollout`    | `str` | Yes      | Rollout directory name under `rollouts/`                                                   |
| `entrypoint` | `str` | Yes      | Python entrypoint relative to the rollout directory                                        |
| `model_path` | `str` | Yes      | LiteLLM-style model name for the evaluation policy (e.g. `openai/gpt-5-mini`)              |
| `dataset`    | `str` | Yes      | Platform dataset name from `osmosis dataset list`                                          |
| `commit_sha` | `str` | No       | Pin code to a specific commit. Defaults to the latest synced commit on the default branch. |
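
As an illustration of the required/optional split in this table, the sketch below lists the required `[experiment]` fields that are absent from a config. It assumes the TOML has already been parsed into a `dict` (for example with the standard-library `tomllib` on Python 3.11+); `missing_experiment_fields` is a hypothetical helper, not a CLI function.

```python
# Required [experiment] fields, taken from the table above.
# commit_sha is optional and therefore not listed.
REQUIRED_EXPERIMENT_FIELDS = ("rollout", "entrypoint", "model_path", "dataset")


def missing_experiment_fields(config: dict) -> list:
    """Return required [experiment] fields that are absent or not strings."""
    experiment = config.get("experiment", {})
    return [
        name
        for name in REQUIRED_EXPERIMENT_FIELDS
        if not isinstance(experiment.get(name), str)
    ]
```

An empty result means all four required fields are present as strings. The platform may apply further checks that this sketch does not model.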

### `[evaluation]`

All fields are optional. Omit values to use platform defaults.

| Field                      | Type    | Description                                                                                                               |
| -------------------------- | ------- | ------------------------------------------------------------------------------------------------------------------------- |
| `limit`                    | `int`   | Number of rows to evaluate (the first `N` rows). When omitted, the platform evaluates a random 10% sample of the dataset. |
| `n`                        | `int`   | Number of evaluation attempts per row (use values > 1 for pass\@n metrics)                                                |
| `batch_size`               | `int`   | Rows evaluated per batch                                                                                                  |
| `pass_threshold`           | `float` | Score at or above which a sample counts as passing                                                                        |
| `agent_workflow_timeout_s` | `float` | Timeout for `AgentWorkflow.run()` per row                                                                                 |
| `grader_timeout_s`         | `float` | Timeout for `Grader.grade()` per row                                                                                      |
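
To show how `n` and `pass_threshold` can interact, here is a minimal sketch of a pass\@n calculation. The page does not describe how the platform aggregates attempts, so this assumes the common convention that a row passes when any of its `n` attempts scores at or above the threshold. `row_passes` and `pass_at_n` are hypothetical names.

```python
def row_passes(scores: list, pass_threshold: float) -> bool:
    """True if any attempt for one row scores at or above the threshold."""
    return any(score >= pass_threshold for score in scores)


def pass_at_n(per_row_scores: list, pass_threshold: float) -> float:
    """Fraction of rows with at least one passing attempt.

    per_row_scores holds one list per row, each with n attempt scores.
    Assumption: this aggregation is illustrative, not the platform's
    documented metric.
    """
    if not per_row_scores:
        return 0.0
    passed = sum(row_passes(scores, pass_threshold) for scores in per_row_scores)
    return passed / len(per_row_scores)
```

With `n = 2` and `pass_threshold = 1.0`, the rows `[[0.2, 1.0], [0.5, 0.9]]` give a pass rate of `0.5` under this convention: the first row has one attempt at the threshold, the second has none.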

### `[env]` and `[secrets]` (evaluation)

Eval configs accept optional `[env]` variables and require a `[secrets]` table for the evaluation run container. Use `required = []` only when the evaluation needs no secret refs. See [`[env]` and `[secrets]`](#env-and-secrets) below for the full ruleset.

***

## Training Config

Used by [`osmosis train submit`](/cli/command-reference#train-submit) to submit a training run.

```toml configs/training/my-rollout.toml
[experiment]
rollout = "my-rollout"                         # Rollout directory under rollouts/
entrypoint = "main.py"                         # Entrypoint file name
model_path = "Qwen/Qwen3.6-35B-A3B"            # Supported base model
dataset = "my-dataset"                         # Platform dataset name
# commit_sha = "abc123..."                     # Pin code to a commit

[training]
# lr = 1e-6                                    # Learning rate
# total_epochs = 1                             # Training epochs
# n_samples_per_prompt = 8                     # Rollout samples per prompt
# rollout_batch_size = 32                      # Rollout batch size
# max_prompt_length = 8192                     # Max prompt tokens
# max_response_length = 8192                   # Max response tokens
# agent_workflow_timeout_s = 450               # Agent timeout per row
# grader_timeout_s = 150                       # Grader timeout per row

[sampling]
# rollout_temperature = 1.0                    # Sampling temperature
# rollout_top_p = 1.0                          # Top-p sampling

[checkpoints]
# eval_interval = 10                           # Evaluate every N rollout steps
# checkpoint_save_freq = 20                    # Save checkpoint every N rollout steps

# [advanced]
# Backend-specific fields. Use only when instructed by Osmosis support.

# [env]
# LOG_LEVEL = "INFO"                           # Non-secret literal env var

# [secrets]
# required = ["OPENAI_API_KEY"]                # Optional in training; if set, must include `required`
```

<Warning>
  Git Sync is the source of truth for your rollout code. The CLI reads config values from the local TOML file you pass, but rollout code comes from the synced workspace repository. Commit, push, and wait for sync before submitting code changes; set `commit_sha` when you need a specific synced revision.
</Warning>

### `[experiment]`

| Field        | Type  | Required | Description                                           |
| ------------ | ----- | -------- | ----------------------------------------------------- |
| `rollout`    | `str` | Yes      | Rollout directory name under `rollouts/`              |
| `entrypoint` | `str` | Yes      | Python entrypoint file name, usually `main.py`        |
| `model_path` | `str` | Yes      | Supported base model path                             |
| `dataset`    | `str` | Yes      | Dataset name from `osmosis dataset list`              |
| `commit_sha` | `str` | No       | Git commit SHA to fetch from the workspace repository |

### `[training]`

| Field                      | Type     | Default          | Description                          |
| -------------------------- | -------- | ---------------- | ------------------------------------ |
| `lr`                       | `float`  | platform default | Learning rate                        |
| `total_epochs`             | `int`    | platform default | Number of training epochs            |
| `n_samples_per_prompt`     | `int`    | platform default | Rollout samples generated per prompt |
| `rollout_batch_size`       | `int`    | platform default | Prompts processed per rollout batch  |
| `max_prompt_length`        | `int`    | platform default | Maximum prompt tokens                |
| `max_response_length`      | `int`    | platform default | Maximum response tokens              |
| `agent_workflow_timeout_s` | `number` | platform default | Agent rollout timeout per row        |
| `grader_timeout_s`         | `number` | platform default | Grader timeout per row               |

### `[sampling]`

| Field                 | Type     | Default          | Description                          |
| --------------------- | -------- | ---------------- | ------------------------------------ |
| `rollout_temperature` | `number` | platform default | Sampling temperature during rollouts |
| `rollout_top_p`       | `number` | platform default | Top-p sampling threshold             |

### `[checkpoints]`

| Field                  | Type  | Default          | Description                                  |
| ---------------------- | ----- | ---------------- | -------------------------------------------- |
| `eval_interval`        | `int` | platform default | Evaluate every N rollout steps               |
| `checkpoint_save_freq` | `int` | platform default | Save a LoRA checkpoint every N rollout steps |

### `[advanced]`

Optional backend-specific fields. The CLI preserves unknown keys in this section and the platform validates them server-side.

<a id="env-and-secrets" />

### `[env]` and `[secrets]`

Use these sections to inject environment variables into the rollout container during training runs or evaluation runs. The same shape applies to both training and evaluation configs.

| Section              | Values                                             | Use for                          |
| -------------------- | -------------------------------------------------- | -------------------------------- |
| `[env]`              | Literal strings stored in the config file          | Non-secret configuration         |
| `[secrets].required` | List of platform `environment_secret` record names | API keys and private credentials |

```toml
[env]
LOG_LEVEL = "INFO"

[secrets]
required = ["OPENAI_API_KEY", "DATABASE_URL"]
```

Rules:

* `[env]` keys must match `^[A-Z_][A-Z0-9_]*$`; `[secrets].required` names must match `^[A-Z][A-Z0-9_]*$`.
* The same name cannot appear in both `[env]` and `[secrets].required`.
* `[env]` keys starting with `_OSMOSIS_` are reserved by the platform and cannot be used.
* `[secrets].required` entries are record names only. The platform resolves each name to its encrypted value server-side and injects it as an env var of the same name. Secret values never appear in the config file, the API payload, or CLI output.
* Eval configs must include `[secrets]`. Use `required = []` only when the evaluation needs no secret refs.
* Training configs may omit `[secrets]`. If you include the table, it must define `required`.
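
The rules above can be approximated locally before submitting. This is a minimal sketch under stated assumptions: it operates on a config already parsed into a `dict` (for example with `tomllib` on Python 3.11+), `check_env_and_secrets` and its `kind` argument are hypothetical, and the platform's own server-side validation remains authoritative.

```python
import re

# Patterns and reserved prefix copied from the rules above.
ENV_KEY = re.compile(r"^[A-Z_][A-Z0-9_]*$")
SECRET_NAME = re.compile(r"^[A-Z][A-Z0-9_]*$")
RESERVED_PREFIX = "_OSMOSIS_"


def check_env_and_secrets(config: dict, kind: str) -> list:
    """Return rule violations for a parsed config; kind is "eval" or "training"."""
    errors = []
    env = config.get("env", {})
    secrets = config.get("secrets")

    for key in env:
        if not ENV_KEY.fullmatch(key):
            errors.append(f"[env] key {key!r} does not match {ENV_KEY.pattern}")
        elif key.startswith(RESERVED_PREFIX):
            errors.append(f"[env] key {key!r} uses the reserved prefix {RESERVED_PREFIX}")

    if secrets is None:
        # Only eval configs are required to carry a [secrets] table.
        if kind == "eval":
            errors.append("eval configs must include [secrets]")
        return errors
    if "required" not in secrets:
        errors.append("[secrets] must define required")
        return errors

    for name in secrets["required"]:
        if not SECRET_NAME.fullmatch(name):
            errors.append(f"secret name {name!r} does not match {SECRET_NAME.pattern}")
        if name in env:
            errors.append(f"{name!r} appears in both [env] and [secrets].required")
    return errors
```

An empty list means the config satisfies the rules modelled here. Note that `[env]` keys may start with an underscore while secret names may not, which is why the two patterns differ.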

<Note>
  Secrets are scoped. A **workspace** secret is shared across the workspace; a **personal** secret is private to you and overrides the workspace secret of the same name at run time. Register secrets with [`osmosis secret set`](/cli/command-reference#secret) before submitting a run that references them.
</Note>
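
The override behavior in the note can be modelled as a dictionary merge. This sketch is illustrative only: resolution happens server-side on the platform, `resolve_secrets` is a hypothetical function, the values are placeholders, and raising an error for an unregistered name is an assumption about failure behavior.

```python
def resolve_secrets(required: list, workspace: dict, personal: dict) -> dict:
    """Model which value each required secret name resolves to.

    A personal secret overrides the workspace secret of the same name.
    """
    # Later entries win, so personal values replace workspace values.
    merged = {**workspace, **personal}
    missing = [name for name in required if name not in merged]
    if missing:
        # Assumption: a run cannot reference a secret that was never registered.
        raise KeyError(f"unregistered secrets: {missing}")
    return {name: merged[name] for name in required}
```

For instance, with a workspace secret and a personal secret both named `OPENAI_API_KEY`, the personal value is the one returned for that name.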

<Tip>
  Start with only `[experiment]` (plus `[secrets]` for eval configs) and let the platform apply its defaults. Add optional fields only when you need to tune a run.
</Tip>