Create Your Own Rollout

Use this path when you already have a task, dataset, or existing agent code and want an AI coding agent to help turn it into a runnable Osmosis rollout. If you are new to Osmosis and want the shortest copy-paste path, start with Run the Multiply Example instead.

This guide assumes you have completed Onboarding: your workspace repository is cloned, the CLI is installed and authenticated, and your AI coding environment is open in the workspace directory.

What Workspace Skills Do

Platform-created workspace repositories include project-local Agent Skills under .agents/skills/. Agents that support the open Agent Skills format can use those skills to move through the same loop an experienced Osmosis user would follow:

plan from dataset -> create rollout -> submit evaluation run -> debug failures -> prepare training run

The skills do not replace the CLI. The agent still uses the Osmosis CLI as the source of truth for workspace checks, dataset validation, evaluation runs, and training run preflight.

When to Use This Path

Use this path when	Run the multiply example when
You already know the task you want to train on	You want proof that the platform works end to end
You have sample data or a platform dataset	You do not want to design a dataset yet
You want the agent to create or adapt rollout code	You want to copy commands without making product choices
You are comfortable inspecting generated code and evaluation run output	You are still learning the Osmosis workflow

Use the Skills in Your Workspace Repository

Open your platform-created workspace repository in your AI coding environment. The repository includes the workspace contract and Agent Skills alongside rollout code, configs, and data:

repository/
├── .agents/
│   └── skills/
├── .claude/
│   └── skills/
├── rollouts/
├── configs/
│   ├── eval/
│   └── training/
├── data/
├── AGENTS.md
├── CLAUDE.md
└── pyproject.toml

AGENTS.md contains the always-loaded workspace contract. .agents/skills/ contains the canonical workflow skills, and .claude/skills/<skill-name> exposes the same skills to Claude Code through symlinks back to .agents/skills/.
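If an agent does not seem to see the skills, one thing worth checking is whether the entries under .claude/skills/ still resolve into .agents/skills/. The following is a minimal sketch of that check, not part of the Osmosis CLI; the function name `check_skill_links` is hypothetical and it assumes a filesystem that supports symlinks.

```python
from pathlib import Path


def check_skill_links(repo_root):
    """Return problems found with .claude/skills entries (empty list means OK)."""
    root = Path(repo_root)
    canonical = (root / ".agents" / "skills").resolve()
    claude_dir = root / ".claude" / "skills"
    if not claude_dir.is_dir():
        return [f"missing directory: {claude_dir}"]
    problems = []
    for entry in sorted(claude_dir.iterdir()):
        if not entry.is_symlink():
            problems.append(f"{entry.name}: not a symlink")
            continue
        target = entry.resolve()
        # The link should land inside .agents/skills/ and point at something real.
        if canonical not in target.parents or not target.exists():
            problems.append(f"{entry.name}: does not resolve into .agents/skills/")
    return problems
```

Calling `check_skill_links(".")` from the repository root returns an empty list when every entry is a symlink that resolves into .agents/skills/.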

Start in a Workspace Repository

The skills assume this repository layout for source files:

repository/
├── rollouts/
├── configs/
│   ├── eval/
│   └── training/
├── data/
└── pyproject.toml
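The layout above can be checked with a few lines of Python before handing the repository to an agent. This is a sketch under the assumption that the tree shown is the complete set of required entries; `missing_layout_entries` is a hypothetical helper, not an Osmosis command, and it does not replace `osmosis doctor`.

```python
from pathlib import Path

# Entries taken from the repository layout shown above.
EXPECTED_DIRS = ("rollouts", "configs/eval", "configs/training", "data")
EXPECTED_FILES = ("pyproject.toml",)


def missing_layout_entries(repo_root="."):
    """Return the expected directories and files that are absent under repo_root."""
    root = Path(repo_root)
    missing = [d + "/" for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    gaps = missing_layout_entries()
    print("layout OK" if not gaps else "missing: " + ", ".join(gaps))
```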

Before asking the agent to write rollout code, confirm that the CLI can resolve the workspace:

osmosis doctor
osmosis auth whoami

Ask the Agent to Plan from the Dataset

Start by describing your task and asking the agent to begin with the workspace’s planning skill. A useful first prompt is:

I want to train a model for <task> in this Osmosis workspace. Start with the `plan-training` skill: read the workspace instructions, help me settle the dataset plan, and propose the next step before creating rollouts, running evaluation runs, or submitting a training run.

The workspace skills should guide the agent to:

Plan training

Inspect data/, existing rollouts, and workspace config. The agent should settle the dataset schema before writing rollout code.
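One way to make "settle the dataset schema" concrete is to look for fields that are missing from some rows or that change type between rows. The sketch below assumes the dataset is stored as JSON Lines with one object per line, which this guide does not specify; the helper names and the `question`/`answer` fields in the usage note are hypothetical.

```python
import json
from collections import defaultdict


def summarize_schema(jsonl_lines):
    """Count rows and record, per field, the value types seen and how often it appears."""
    field_types = defaultdict(set)
    field_counts = defaultdict(int)
    rows = 0
    for line in jsonl_lines:
        if not line.strip():
            continue  # ignore blank lines
        record = json.loads(line)  # assumes each line is a JSON object
        rows += 1
        for key, value in record.items():
            field_types[key].add(type(value).__name__)
            field_counts[key] += 1
    return rows, dict(field_types), dict(field_counts)


def schema_issues(jsonl_lines):
    """Return human-readable problems: fields missing from rows or with mixed types."""
    rows, types, counts = summarize_schema(jsonl_lines)
    issues = []
    for field in sorted(types):
        if counts[field] != rows:
            issues.append(f"{field}: present in {counts[field]} of {rows} rows")
        if len(types[field]) > 1:
            issues.append(f"{field}: mixed types {sorted(types[field])}")
    return issues
```

For instance, three rows where `answer` is an integer once, a string once, and absent once would be reported as both a missing field and a mixed-type field, which is the kind of inconsistency to resolve before a grader depends on it.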

Create or adapt a rollout

Write the smallest AgentWorkflow and Grader that can load, run, and score samples. Generated files should stay under rollouts/, configs/eval/, configs/training/, and data/.
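When reviewing what the agent generated, it can help to flag any changed file that falls outside the four directories named above. The following sketch does that for a list of repository-relative paths; `paths_outside_workspace_dirs` is a hypothetical helper, and the input could come from something like `git diff --name-only`.

```python
from pathlib import PurePosixPath

# Directories where generated files are expected to live, per the step above.
ALLOWED_PREFIXES = ("rollouts", "configs/eval", "configs/training", "data")


def paths_outside_workspace_dirs(changed_paths):
    """Return the paths that are not located under any allowed directory."""
    stray = []
    for raw in changed_paths:
        path = PurePosixPath(raw)
        allowed = any(
            PurePosixPath(prefix) in path.parents for prefix in ALLOWED_PREFIXES
        )
        if not allowed:
            stray.append(raw)
    return stray
```

A non-empty result is not necessarily an error (for example, a dependency change in pyproject.toml), but it marks the files that deserve a closer look.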

Submit an evaluation run

Push the rollout to the workspace repository and submit an evaluation run as the quality gate:

git push
osmosis eval submit configs/eval/<name>.toml
osmosis eval info <eval-run-name>
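If you script this sequence rather than typing it, the important property is that a later command never runs after an earlier one fails. Here is a minimal sketch of that behavior using only the standard library; `run_in_order` is a hypothetical helper, and the commands passed to it are the ones listed above with their placeholders filled in by you.

```python
import subprocess


def run_in_order(commands, run=subprocess.run):
    """Run commands one at a time; stop at the first non-zero exit code.

    Returns (completed, failed): the commands that succeeded, and the command
    that failed or None if all of them succeeded.
    """
    completed = []
    for argv in commands:
        result = run(argv)
        if result.returncode != 0:
            return completed, argv
        completed.append(argv)
    return completed, None
```

Passing `[["git", "push"], ["osmosis", "eval", "submit", "configs/eval/<name>.toml"]]` would skip the submit when the push fails, so an evaluation run is never started against code that has not reached the workspace repository.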

Debug until the evaluation run is clean

Fix loading, dataset, grader, dependency, and reward issues before a training run. A passing evaluation run is the handoff point from creation to training run readiness.

Prepare a training run

Once the rollout is validated, let the agent inspect the training run config and run submit-time preflight. Submit only when you are ready to start a platform training run.

Do not skip the evaluation run gate. Run osmosis train submit only after the rollout cleanly loads, runs, and grades samples on the platform.

Workspace Skills

The workspace skills are organized around rollout creation stages:

Skill	Purpose
`plan-training`	Turn a task idea or dataset into a concrete experiment plan
`create-rollouts`	Create or adapt rollout code, graders, entrypoints, and baseline evaluation configs
`evaluate-rollouts`	Submit evaluation runs, compare baselines, and inspect failures
`debug-rollouts`	Diagnose evaluation, config, dataset, dependency, or preflight failures
`submit-training`	Prepare a training run config, submit a training run, and check training run status

You usually do not need to invoke these skills by name. Describe the outcome you want, and the agent should apply the right stage.

Next Steps

Rollout Overview

Understand the AgentWorkflow and Grader contract behind generated rollout code.

Evaluation

Validate rollouts with an evaluation run before submitting a training run.

Git Sync

Push rollout changes and let the platform sync the code version used for evaluation runs and training runs.

Training Runs

Submit and monitor a training run after the evaluation run passes.
