Use this path when you already have a task, dataset, or existing agent code and want an AI coding agent to help turn it into a runnable Osmosis rollout. If you are new to Osmosis and want the shortest copy-paste path, start with Run the Multiply Example instead.
This guide assumes you have completed Onboarding: your workspace repository is cloned, the CLI is installed and authenticated, and your AI coding environment is open in the workspace directory.

What Workspace Skills Do

Platform-created workspace repositories include project-local Agent Skills under .agents/skills/. Agents that support the open Agent Skills format can use those skills to move through the same loop an experienced Osmosis user would follow:
plan from dataset -> create rollout -> submit evaluation run -> debug failures -> prepare training run
The skills are not a replacement for the CLI. The agent still uses the Osmosis CLI as the source of truth for workspace checks, dataset validation, evaluation runs, and training run preflight.

When to Use This Path

| Use this path when | Run the multiply example when |
| --- | --- |
| You already know the task you want to train on | You want proof that the platform works end to end |
| You have sample data or a platform dataset | You do not want to design a dataset yet |
| You want the agent to create or adapt rollout code | You want to copy commands without making product choices |
| You are comfortable inspecting generated code and evaluation run output | You are still learning the Osmosis workflow |

Use the Skills in Your Workspace Repository

Open your platform-created workspace repository in your AI coding environment. The repository includes the workspace contract and Agent Skills alongside rollout code, configs, and data:
repository/
├── .agents/
│   └── skills/
├── .claude/
│   └── skills/
├── rollouts/
├── configs/
│   ├── eval/
│   └── training/
├── data/
├── AGENTS.md
├── CLAUDE.md
└── pyproject.toml
AGENTS.md contains the always-loaded workspace contract. .agents/skills/ contains the canonical workflow skills, and .claude/skills/<skill-name> exposes the same skills to Claude Code through symlinks back to .agents/skills/.
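The symlink arrangement described above can be checked locally. The following is a minimal sketch, assuming each skill is a directory under .agents/skills/ and that .claude/skills/ should contain one link per skill; the function name `find_unlinked_skills` is hypothetical and not part of Osmosis.

```python
from pathlib import Path


def find_unlinked_skills(repo_root: Path) -> list[str]:
    """Return names of skills in .agents/skills/ that .claude/skills/ does not resolve to."""
    canonical = repo_root / ".agents" / "skills"
    exposed = repo_root / ".claude" / "skills"
    if not canonical.is_dir():
        # Nothing to compare against; treat as "no skills found".
        return []
    problems = []
    for skill_dir in sorted(p for p in canonical.iterdir() if p.is_dir()):
        link = exposed / skill_dir.name
        # resolve() follows symlinks, so a correct link lands on the canonical directory.
        if not link.exists() or link.resolve() != skill_dir.resolve():
            problems.append(skill_dir.name)
    return problems


if __name__ == "__main__":
    unlinked = find_unlinked_skills(Path.cwd())
    print("all skills linked" if not unlinked else f"not linked: {unlinked}")
```

Run it from the repository root. An empty result means every canonical skill is reachable through .claude/skills/.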

Start in a Workspace Repository

The skills assume this repository layout for source files:
repository/
├── rollouts/
├── configs/
│   ├── eval/
│   └── training/
├── data/
└── pyproject.toml
Before asking the agent to write rollout code, confirm that the CLI can resolve the workspace:
osmosis doctor
osmosis auth whoami
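Alongside the CLI checks, the expected layout itself can be verified with a few lines of Python. This is a sketch only: the entries come from the tree shown above, and the helper name `missing_layout_entries` is hypothetical. It does not replace `osmosis doctor`, which remains the source of truth.

```python
from pathlib import Path

# Entries taken from the layout shown above; a trailing "/" marks a directory.
EXPECTED = ["rollouts/", "configs/eval/", "configs/training/", "data/", "pyproject.toml"]


def missing_layout_entries(repo_root: Path) -> list[str]:
    """Return the expected entries that are absent under repo_root."""
    missing = []
    for entry in EXPECTED:
        path = repo_root / entry
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    gaps = missing_layout_entries(Path.cwd())
    print("layout looks complete" if not gaps else f"missing: {gaps}")
```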

Ask the Agent to Plan from the Dataset

Start by describing your task and asking the agent to begin with the workspace’s planning skill. A useful first prompt is:
I want to train a model for <task> in this Osmosis workspace. Start with the `plan-training` skill: read the workspace instructions, help me settle the dataset plan, and propose the next step before creating rollouts, running evaluation runs, or submitting a training run.
The workspace skills should guide the agent to:
1. Plan training

Inspect data/, existing rollouts, and workspace config. The agent should settle the dataset schema before writing rollout code.
2. Create or adapt a rollout

Write the smallest AgentWorkflow and Grader that can load, run, and score samples. Generated files should stay under rollouts/, configs/eval/, configs/training/, and data/.
3. Submit an evaluation run

Push the rollout to the workspace repository and submit an evaluation run as the quality gate:
git push
osmosis eval submit configs/eval/<name>.toml
osmosis eval info <eval-run-name>
4. Debug until the evaluation run is clean

Fix loading, dataset, grader, dependency, and reward issues before submitting a training run. A passing evaluation run is the handoff point from rollout creation to training run readiness.
5. Prepare a training run

Once the rollout is validated, let the agent inspect the training run config and run submit-time preflight. Submit only when you are ready to start a platform training run.
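Step 2 asks that generated files stay under rollouts/, configs/eval/, configs/training/, and data/. One way to review an agent's changes against that rule is sketched below; the helper name `outside_allowed_dirs` is hypothetical, and the input is assumed to be repository-relative paths such as those printed by `git diff --name-only`.

```python
from pathlib import PurePosixPath

# Directories named in step 2 as the places generated files should live.
ALLOWED_DIRS = ("rollouts", "configs/eval", "configs/training", "data")


def outside_allowed_dirs(changed_paths: list[str]) -> list[str]:
    """Return the paths that fall outside the allowed directories."""
    offenders = []
    for raw in changed_paths:
        path = PurePosixPath(raw)
        # Reject ".." segments, since a purely lexical check cannot see through them.
        escapes = ".." in path.parts
        # is_relative_to compares whole path components, so "rollouts_old/" does not match.
        inside = any(path.is_relative_to(allowed) for allowed in ALLOWED_DIRS)
        if escapes or not inside:
            offenders.append(raw)
    return offenders
```

An empty result means every changed file sits in one of the four directories. Anything returned is worth a manual look before pushing.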
Do not skip the evaluation run gate. Run osmosis train submit only after the rollout cleanly loads, runs, and grades samples on the platform.
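The ordering of the gate can be sketched as a small wrapper that runs the commands from this guide in sequence and stops at the first failure. Two assumptions are made here and are not confirmed by this page: that the CLI signals failure through a non-zero exit code, and that nothing else is needed between the commands. The names `run_eval_gate` and `Runner` are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command and returns its exit code; injectable for dry runs.
Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    return subprocess.run(list(cmd)).returncode


def run_eval_gate(eval_config: str, run: Runner = _run) -> bool:
    """Run workspace check, push, and eval submission in order; stop on first failure."""
    steps = [
        ["osmosis", "doctor"],
        ["git", "push"],
        ["osmosis", "eval", "submit", eval_config],
    ]
    for cmd in steps:
        # Assumption: a non-zero exit code means the step failed.
        if run(cmd) != 0:
            print("stopped at:", " ".join(cmd))
            return False
    return True
```

A `True` result only means the evaluation run was submitted. Whether it passed still has to be checked with `osmosis eval info <eval-run-name>` before moving on to a training run.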

Workspace Skills

The workspace skills are organized around rollout creation stages:
| Skill | Purpose |
| --- | --- |
| plan-training | Turn a task idea or dataset into a concrete experiment plan |
| create-rollouts | Create or adapt rollout code, graders, entrypoints, and baseline evaluation configs |
| evaluate-rollouts | Submit evaluation runs, compare baselines, and inspect failures |
| debug-rollouts | Diagnose evaluation, config, dataset, dependency, or preflight failures |
| submit-training | Prepare a training run config, submit a training run, and check training run status |
You usually do not need to invoke these skills by name. Describe the outcome you want, and the agent should apply the right stage.

Next Steps

Rollout Overview

Understand the AgentWorkflow and Grader contract behind generated rollout code.

Evaluation

Validate rollouts with an evaluation run before submitting a training run.

Git Sync

Push rollout changes and let the platform sync the code version used for evaluation runs and training runs.

Training Runs

Submit and monitor a training run after the evaluation run passes.