Command Reference

The Osmosis CLI is organized into two groups:

Workflow Commands — dataset, train, model, eval, rollout, init
Platform Commands — auth, workspace, upgrade

Run osmosis -h to see all available commands. Every sub-command supports -h / --help for detailed usage.

init

Initialize a new local Osmosis workspace directory with the standard project layout.

osmosis init <name> [--here]

Argument / Option	Type	Description
`name`	`str` (required)	Workspace name (used for directory name and config)
`--here`	flag	Initialize in current directory instead of creating a subdirectory

# Create a new workspace in a subdirectory
osmosis init my-project

# Initialize the current directory as a workspace
osmosis init my-project --here

See Workspace Overview for details on the generated directory structure.

dataset

Manage datasets — upload, list, preview, validate, and delete.

dataset upload

Upload a dataset file to the active workspace.

osmosis dataset upload <file>

Argument	Type	Description
`file`	`str` (required)	Path to the file to upload (CSV, JSONL, or Parquet; max 5 GB)

osmosis dataset upload data/train.jsonl
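
The limits in the table above (CSV, JSONL, or Parquet; max 5 GB) can be checked locally before a long upload starts. The following is a minimal sketch, not part of the CLI: `check_upload_candidate` is a hypothetical helper name, it assumes the format is recognized by file extension, and it assumes "5 GB" means 5 × 1024³ bytes, which this page does not specify.

```shell
# Hypothetical helper: check a file against the documented upload limits
# (CSV, JSONL, or Parquet; max 5 GB) before calling the CLI.
check_upload_candidate() {
  candidate="$1"
  if [ ! -f "$candidate" ]; then
    echo "not a file: $candidate" >&2
    return 1
  fi
  # Assumption: the format is inferred from the file extension.
  case "$candidate" in
    *.csv|*.jsonl|*.parquet) ;;
    *) echo "unsupported extension: $candidate" >&2; return 1 ;;
  esac
  # Assumption: "5 GB" means 5 * 1024^3 bytes; the page does not say.
  max_bytes=$((5 * 1024 * 1024 * 1024))
  # Arithmetic expansion strips the padding some `wc` builds emit.
  size_bytes=$(( $(wc -c < "$candidate") ))
  if [ "$size_bytes" -gt "$max_bytes" ]; then
    echo "file too large: $size_bytes bytes" >&2
    return 1
  fi
}
```

It could be chained in front of the upload, for example `check_upload_candidate data/train.jsonl && osmosis dataset upload data/train.jsonl`.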

dataset list

List datasets in the active workspace.

osmosis dataset list [--limit N] [--all]

Option	Type	Default	Description
`--limit`	`int`	`50`	Maximum number of datasets to show
`--all`	flag	—	Show all datasets

--all and --limit are mutually exclusive.

dataset status

Check the processing status of a dataset.

osmosis dataset status <name>

Argument	Type	Description
`name`	`str` (required)	Dataset name

dataset preview

Preview rows from a dataset.

osmosis dataset preview <name> [--rows N]

Argument / Option	Type	Default	Description
`name`	`str` (required)	—	Dataset name
`--rows`	`int`	`5`	Number of rows to show

osmosis dataset preview my-dataset --rows 10

dataset validate

Validate a dataset file locally without uploading.

osmosis dataset validate <file>

Argument	Type	Description
`file`	`str` (required)	Path to the file to validate
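
Because validation runs locally, it can act as a gate in front of `dataset upload`. A minimal sketch, assuming `osmosis dataset validate` exits with a non-zero status when validation fails (this page does not document exit codes); `validate_then_upload` is a hypothetical helper name.

```shell
# Hypothetical helper: upload only if local validation succeeds.
# Assumption: `osmosis dataset validate` exits non-zero when validation fails.
validate_then_upload() {
  dataset_file="$1"
  if ! osmosis dataset validate "$dataset_file"; then
    echo "validation failed; skipping upload of $dataset_file" >&2
    return 1
  fi
  osmosis dataset upload "$dataset_file"
}
```

Usage: `validate_then_upload data/train.jsonl`.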

dataset delete

Delete a dataset from the active workspace.

osmosis dataset delete <name> [--yes]

Argument / Option	Type	Description
`name`	`str` (required)	Dataset name
`-y`, `--yes`	flag	Skip confirmation prompt

train

Manage training runs — submit, monitor, export metrics, stop, and delete.

train submit

Submit a new training run from a TOML configuration file.

osmosis train submit <config_path> [--yes]

Argument / Option	Type	Description
`config_path`	`path` (required)	Path to training config TOML file
`-y`, `--yes`	flag	Skip confirmation prompt

osmosis train submit configs/training/default.toml

See Configuration Files for the full TOML schema.

The training run executes rollout code from the repository connected through Git Sync, not from your local workspace. Commit and push before submitting, or pin the run to a specific revision with commit_sha in your training config.
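
One way to avoid submitting against stale code is to check the local Git state first. The sketch below is an illustration only: `submit_if_pushed` is a hypothetical helper, and it assumes the synced repository follows the upstream branch of your current checkout. It refuses to submit when there are uncommitted changes or when local `HEAD` differs from the upstream tracking ref.

```shell
# Hypothetical helper: submit only when local work is committed and pushed.
# Assumption: Git Sync follows the upstream branch of the current checkout.
submit_if_pushed() {
  config_path="$1"
  # Uncommitted or untracked files would not be part of the run.
  if [ -n "$(git status --porcelain)" ]; then
    echo "uncommitted changes; commit and push first" >&2
    return 1
  fi
  # Compare local HEAD with the last known state of the upstream branch.
  local_sha="$(git rev-parse HEAD)" || return 1
  upstream_sha="$(git rev-parse '@{u}')" || return 1
  if [ "$local_sha" != "$upstream_sha" ]; then
    echo "HEAD ($local_sha) differs from upstream ($upstream_sha); push first" >&2
    return 1
  fi
  osmosis train submit "$config_path"
}
```

Usage: `submit_if_pushed configs/training/default.toml`. The upstream ref reflects the last fetch or push, so run `git fetch` first if others push to the same branch.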

train list

List training runs in the active workspace.

osmosis train list [--limit N] [--all]

Option	Type	Default	Description
`--limit`	`int`	`50`	Maximum number of runs to show
`--all`	flag	—	Show all training runs

--all and --limit are mutually exclusive.

train status

Show details of a specific training run.

osmosis train status <name>

Argument	Type	Description
`name`	`str` (required)	Training run name

train metrics

Export a training run's metrics to a JSON file. The command also displays a summary table and metric trend graphs in the terminal.

osmosis train metrics <name> [--output PATH]

Argument / Option	Type	Default	Description
`name`	`str` (required)	—	Training run name
`-o`, `--output`	`str`	`.osmosis/metrics/`	Output path. If the path ends with `/` or is an existing directory, a default filename is generated inside it. A non-`.json` extension is replaced with `.json`.

# Export to default location (.osmosis/metrics/)
osmosis train metrics my-run

# Export to a specific file
osmosis train metrics my-run -o results/my-run.json
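
The `--output` rules can be read as a small path-resolution procedure. The sketch below restates them in shell for illustration; it is not the CLI's implementation. The default filename `<run-name>.json` is an assumption (this page does not state the generated name), as is appending `.json` to a path that has no extension.

```shell
# Illustrative restatement of the --output rules; not the CLI's own code.
# Assumptions: the default filename is "<run-name>.json", and a path
# without any extension gets ".json" appended.
resolve_metrics_path() {
  run_name="$1"
  output="${2:-.osmosis/metrics/}"
  # Rule 1: a trailing "/" means "directory"; use the default filename.
  case "$output" in
    */) printf '%s%s.json\n' "$output" "$run_name"; return 0 ;;
  esac
  # Rule 2: an existing directory is treated the same way.
  if [ -d "$output" ]; then
    printf '%s/%s.json\n' "$output" "$run_name"
    return 0
  fi
  # Rule 3: otherwise the path names a file; force a .json extension.
  file_name="${output##*/}"
  case "$file_name" in
    *.json) printf '%s\n' "$output" ;;
    *.*)    printf '%s.json\n' "${output%.*}" ;;
    *)      printf '%s.json\n' "$output" ;;
  esac
}
```

Under these assumptions, `resolve_metrics_path my-run results/my-run.csv` prints `results/my-run.json`.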

train stop

Stop a running training run.

osmosis train stop <name> [--yes]

Argument / Option	Type	Description
`name`	`str` (required)	Training run name
`-y`, `--yes`	flag	Skip confirmation prompt

train delete

Delete a training run and all associated data.

osmosis train delete <name> [--yes]

Argument / Option	Type	Description
`name`	`str` (required)	Training run name
`-y`, `--yes`	flag	Skip confirmation prompt

Deleting a training run permanently removes all metrics, logs, and checkpoints. This cannot be undone.
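
Since deletion is irreversible, it may be worth exporting the metrics first. A minimal sketch, assuming `osmosis train metrics` exits non-zero when the export fails (exit codes are not documented here); `export_then_delete` is a hypothetical helper name. It only preserves metrics: logs and checkpoints are not covered by `train metrics`.

```shell
# Hypothetical helper: keep a local copy of the metrics before deleting a run.
# Assumption: `osmosis train metrics` exits non-zero if the export fails.
export_then_delete() {
  run_name="$1"
  archive_path="$2"
  osmosis train metrics "$run_name" -o "$archive_path" || return 1
  # No --yes here, so the CLI still asks for confirmation.
  osmosis train delete "$run_name"
}
```

Usage: `export_then_delete my-run results/my-run.json`.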

model

Manage models — list, delete, and upcoming deploy/export/build operations.

model list

List models in the active workspace.

osmosis model list [--limit N] [--all]

Option	Type	Default	Description
`--limit`	`int`	`50`	Maximum number of models to show
`--all`	flag	—	Show all models

model delete

Delete a model from the active workspace.

osmosis model delete <name> [--yes]

Argument / Option	Type	Description
`name`	`str` (required)	Model path (e.g. `google/gemma-2-9b-it`)
`-y`, `--yes`	flag	Skip confirmation prompt

You cannot delete a model that has active training runs depending on it. Delete those training runs first.

model deploy / model export / model build

osmosis model deploy
osmosis model export
osmosis model build

These commands are registered in the CLI as placeholders and are not yet functional.

rollout

List rollouts synced from your workspace to the Osmosis platform.

rollout list

List rollouts in the active workspace.

osmosis rollout list [--limit N] [--all]

Option	Type	Default	Description
`--limit`	`int`	`50`	Maximum number of rollouts to show
`--all`	flag	—	Show all rollouts

eval

Evaluate agents against datasets — run evaluations, use LLM-as-judge rubrics, and manage the eval cache.

eval run

Evaluate an agent against a dataset using a TOML configuration file.

osmosis eval run <config_path> [options]

Argument / Option	Type	Default	Description
`config_path`	`str` (required)	—	Path to eval TOML config file
`--fresh`	flag	—	Discard cached results and re-run all rows
`--retry-failed`	flag	—	Re-run only previously failed rows
`--limit`	`int`	all rows	Max rows to evaluate
`--offset`	`int`	`0`	Skip first N rows
`-q`, `--quiet`	flag	—	Suppress progress output
`--debug`	flag	—	Enable debug logging and execution tracing
`-o`, `--output-path`	`str`	—	Override structured output directory
`--log-samples`	flag	—	Save full conversation logs to JSONL
`--batch-size`	`int`	from config	Override concurrent batch size

# Run full evaluation
osmosis eval run configs/eval/default.toml

# Re-run only failed rows with debug output
osmosis eval run configs/eval/default.toml --retry-failed --debug
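
`--offset` and `--limit` together select a window of rows, which makes it possible to walk a large dataset in slices. The sketch below assumes the offset is applied before the limit, so `--offset 20 --limit 10` covers rows 20 through 29 (zero-based); `eval_in_chunks` is a hypothetical helper, and the row count must be supplied by the caller. How the eval cache and output files treat separate windows is not covered on this page.

```shell
# Hypothetical helper: evaluate a dataset in fixed-size windows.
# Assumption: --offset is applied before --limit.
eval_in_chunks() {
  config_path="$1"
  total_rows="$2"
  chunk_size="$3"
  # Guard against a loop that never advances.
  if [ "$chunk_size" -le 0 ]; then
    echo "chunk size must be positive" >&2
    return 1
  fi
  offset=0
  while [ "$offset" -lt "$total_rows" ]; do
    # Stop at the first failing window.
    osmosis eval run "$config_path" --offset "$offset" --limit "$chunk_size" || return 1
    offset=$((offset + chunk_size))
  done
}
```

Usage: `eval_in_chunks configs/eval/default.toml 25 10` issues three runs, at offsets 0, 10, and 20.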

See Configuration Files and Local Evaluation for more details.

eval rubric

Run LLM-as-judge evaluation: score conversations against a rubric using any LiteLLM-compatible model.

osmosis eval rubric [options]

Required options:

Option	Type	Description
`-d`, `--data`	`str`	Path to JSONL file with conversations
`-r`, `--rubric`	`str`	Rubric text (inline string) or `@file.txt` to read from a file
`--model`	`str`	Judge model in LiteLLM format (e.g. `openai/gpt-5.2`)

Optional:

Option	Type	Default	Description
`-n`, `--number`	`int`	`1`	Number of evaluation runs per record
`-o`, `--output`	`str`	—	Path to write evaluation results as JSON
`--api-key`	`str`	—	API key for the judge model
`--timeout`	`float`	—	Request timeout in seconds
`--score-min`	`float`	`0.0`	Minimum score value
`--score-max`	`float`	`1.0`	Maximum score value

# Inline rubric
osmosis eval rubric \
  -d data/conversations.jsonl \
  -r "Score the response on accuracy and completeness" \
  --model openai/gpt-5.2

# Rubric from file
osmosis eval rubric \
  -d data/conversations.jsonl \
  -r @rubrics/accuracy.txt \
  --model openai/gpt-5.2 \
  -o results/rubric-scores.json
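
To compare judges, the same data and rubric can be scored once per model, with each result written to its own file. This is a sketch using only the options listed above; `score_with_judges` and the output naming scheme (slashes in the model identifier replaced by underscores) are illustrative choices, not CLI behavior.

```shell
# Hypothetical helper: score one dataset with several judge models.
# Arguments: data file, rubric (inline text or @file), output directory,
# then one or more judge models in LiteLLM format.
score_with_judges() {
  data_path="$1"
  rubric_arg="$2"
  out_dir="$3"
  shift 3
  mkdir -p "$out_dir" || return 1
  for judge_model in "$@"; do
    # "openai/gpt-5.2" becomes "openai_gpt-5.2.json" so it is a valid filename.
    out_file="$out_dir/$(printf '%s' "$judge_model" | tr '/' '_').json"
    osmosis eval rubric -d "$data_path" -r "$rubric_arg" \
      --model "$judge_model" -o "$out_file" || return 1
  done
}
```

Usage: `score_with_judges data/conversations.jsonl @rubrics/accuracy.txt results openai/gpt-5.2`, followed by any further model identifiers.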

eval cache dir

Print the eval cache root directory path.

osmosis eval cache dir

eval cache ls

List cached evaluations with optional filters.

osmosis eval cache ls [--model STR] [--dataset STR] [--status STR]

Option	Type	Description
`--model`	`str`	Filter by model name
`--dataset`	`str`	Filter by dataset path
`--status`	`str`	Filter by status: `in_progress` or `completed`

eval cache rm

Remove cached evaluations.

osmosis eval cache rm [task_id] [options]

Argument / Option	Type	Description
`task_id`	`str` (optional)	Task ID of a specific cache entry to delete
`--all`	flag	Delete all cached evaluations
`--model`	`str`	Filter by model name
`--dataset`	`str`	Filter by dataset path
`--status`	`str`	Filter by status: `in_progress` or `completed`
`-y`, `--yes`	flag	Skip confirmation prompt

# Remove a specific cached eval
osmosis eval cache rm abc123

# Remove all cached evals for a specific model
osmosis eval cache rm --model openai/gpt-5.2 --yes
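
Because `eval cache ls` and `eval cache rm` accept the same filters, a removal can be previewed with the identical filter first. A minimal sketch with a hypothetical helper name, `prune_cache_by_status`; it leaves out `--yes` so the confirmation prompt stays in place.

```shell
# Hypothetical helper: list the matching cache entries, then remove them.
prune_cache_by_status() {
  cache_status="$1"
  # Only the two documented status values are accepted.
  case "$cache_status" in
    in_progress|completed) ;;
    *) echo "status must be in_progress or completed" >&2; return 1 ;;
  esac
  osmosis eval cache ls --status "$cache_status" || return 1
  # No --yes: the CLI's confirmation prompt is kept.
  osmosis eval cache rm --status "$cache_status"
}
```

Usage: `prune_cache_by_status in_progress`.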

auth

Manage authentication. See Installation & Authentication for full details.

Command	Description
`osmosis auth login`	Authenticate via browser OAuth or token
`osmosis auth logout`	Revoke session and clear credentials
`osmosis auth whoami`	Show current user and active workspace

workspace

Manage platform workspaces. See Installation & Authentication for full details.

Command	Description
`osmosis workspace`	Interactive TUI for browsing workspace contents
`osmosis workspace list`	List available workspaces
`osmosis workspace create <name>`	Create a new workspace
`osmosis workspace switch <name>`	Switch active workspace
`osmosis workspace delete <name>`	Delete a workspace

upgrade

Self-upgrade the CLI to the latest version published on PyPI.

osmosis upgrade

Auto-detects your install method (pip, pipx, or uv tool) and runs the appropriate upgrade command. Displays the currently installed version and the latest available version before upgrading.
