AgentWorkflow and Grader are the two core abstractions in the Osmosis SDK for reinforcement learning (RL) training. Together they define a rollout — the unit of agent behavior and evaluation that the training cluster executes on every training step.

Documentation Index
Fetch the complete documentation index at: https://docs.osmosis.ai/llms.txt
Use this file to discover all available pages before exploring further.
The Training Loop
RL training on Osmosis follows a four-step loop:

Prompt
The training cluster selects one row from your dataset and sends its input prompt to your AgentWorkflow. In most cases, that row just contains system_prompt, user_prompt, and ground_truth.

Rollout
Your AgentWorkflow processes the prompt — calling LLMs, using tools, executing multi-step reasoning — and produces output messages.

Grading
Your Grader evaluates the AgentWorkflow’s output against the row’s reference answer (ground_truth, exposed as ctx.label) and assigns a numerical reward (typically 0.0 to 1.0).

Update
The training cluster uses the resulting rewards to update the model’s weights, then begins the next step of the loop with a fresh prompt.

Core Abstractions
| Abstraction | Purpose | Base Class |
|---|---|---|
| AgentWorkflow | Defines agent behavior — how the model processes prompts, calls tools, and produces output | AgentWorkflow |
| Grader | Defines evaluation logic — how agent outputs are scored to produce reward signals | Grader |
Both live in your rollouts/ directory as Python classes that you subclass and implement.
Quick Example
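The sketch below illustrates the two abstractions end to end on a single dataset row. It is hypothetical: the stand-in base classes, the run/grade method names, and their signatures are assumptions made so the example is self-contained — consult the Building AgentWorkflows and Building Graders pages for the SDK's actual interfaces.

```python
class AgentWorkflow:
    """Stand-in for the Osmosis SDK base class (real import path may differ)."""

class Grader:
    """Stand-in for the Osmosis SDK base class (real import path may differ)."""

# A dataset row of the shape described above:
# system_prompt, user_prompt, and ground_truth.
row = {
    "system_prompt": "You are a terse math assistant.",
    "user_prompt": "What is 2 + 3?",
    "ground_truth": "5",
}

class ArithmeticWorkflow(AgentWorkflow):
    """Turns the row's prompt into an output (method name assumed)."""

    def run(self, system_prompt: str, user_prompt: str) -> str:
        # A real workflow would call an LLM (and possibly tools) here;
        # this toy version just evaluates the arithmetic in the prompt.
        expression = user_prompt.removeprefix("What is ").rstrip("?").strip()
        return str(eval(expression))  # toy only: never eval untrusted input

class ExactMatchGrader(Grader):
    """Scores output against the reference answer (ground_truth / ctx.label)."""

    def grade(self, output: str, label: str) -> float:
        # Reward in the typical 0.0 to 1.0 range.
        return 1.0 if output.strip() == label.strip() else 0.0

workflow = ArithmeticWorkflow()
grader = ExactMatchGrader()
output = workflow.run(row["system_prompt"], row["user_prompt"])
reward = grader.grade(output, row["ground_truth"])
print(reward)  # 1.0
```

In a real rollout you would not instantiate these classes yourself: the SDK discovers the subclasses in your entrypoint file and drives them, as described next.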
The SDK automatically discovers your AgentWorkflow and Grader subclasses from the entrypoint file. No registration or decorators are needed — just define exactly one AgentWorkflow subclass and zero or one Grader subclass in your module.

From Code to Training
Once you’ve written a rollout, the path to a live training run is three simple steps:

Evaluate locally with osmosis eval run
Run your rollout against a local dataset using your own LLM API key and check that rewards, pass rates, and agent traces look right. Cap the dataset with --limit N when you just want a quick smoke test. See Local Evaluation.

Commit and sync
Commit your rollouts/<name>/ directory and push to the default branch of your connected GitHub repo. The Osmosis platform picks up the change through Git Sync — this synced copy is what osmosis train submit actually runs, so uncommitted local edits are not included.

Submit a training run
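The training TOML passed to osmosis train submit might look like the following sketch. This is hypothetical: only commit_sha is named on this page, so every other key here is an illustrative assumption about the shape of the file, not the SDK's actual schema — see the Training Runs page for the real reference.

```toml
# Hypothetical training TOML sketch. Key names other than commit_sha are
# illustrative assumptions; check the Training Runs reference for the schema.
[rollout]
path = "rollouts/my_agent"        # the synced rollouts/<name>/ directory
entrypoint = "rollout.py"         # file containing your AgentWorkflow/Grader
commit_sha = "<pinned-revision>"  # optional: pin a commit for reproducibility
```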
Run osmosis train submit with a training TOML that points at your rollout and entrypoint. The platform provisions GPUs, deploys your rollout from the synced commit, and runs RL training for you. Pin a specific revision with commit_sha if you need reproducibility.

Next Steps
Building AgentWorkflows
Implement the AgentWorkflow class to define your agent behavior.
Building Graders
Implement the Grader class to define reward signals for training.
Local Evaluation
Evaluate your rollout with osmosis eval run — or use it as a smoke test before submitting a training run.

Training Runs
Submit a training run once your rollout passes local eval.