Core Concepts
| Concept | Description |
|---|---|
| Workspace | Top-level container for your team. Holds projects, members, API keys, and integrations. |
| Project | A training context within a workspace. Contains training runs, reward functions, tools, and datasets. |
| Training Run | A single RL training session. Configured with a base model, dataset, reward functions, and tools. |
| Reward Function | Python function that scores LLM outputs deterministically. Returns a float. |
| Reward Rubric | Natural language criteria evaluated by an LLM judge during training. |
| MCP Tools | Functions the agent can call during rollouts (calculators, search, code execution, etc.). |
| Rollout Server | An HTTP server you build that implements your custom agent loop. The training cluster sends rollout requests to your server, which orchestrates tool calls and LLM inference, then returns trajectories and rewards. Built with the Python SDK’s RolloutAgentLoop API. |
| Checkpoint | A saved model state during training. Can be merged and exported to Hugging Face. |
Architecture
Osmosis supports two rollout modes. Choose one depending on where you want the agent loop to run during training.Option A: Local Rollout
Agent loop runs inside the training cluster. Push reward functions, rubrics, and MCP tools to GitHub — the platform syncs and runs everything. No server to deploy.Option B: Remote Rollout
Agent loop runs on your own server. You only need to implement two functions —get_tools() (define available tools) and run() (agent loop logic) — by subclassing RolloutAgentLoop. The Python SDK automatically creates a trainer-compatible HTTP server that handles all protocol details (/v1/rollout/init, /v1/chat/completions, /v1/rollout/completed). Full control over agent logic and tool execution.
Feature Summary
Training Runs
Configure and launch RL training with custom models, datasets, and rewards.
Monitoring
Real-time metrics, training logs, and checkpoint management.
GitHub Integration
Connect repositories for automatic sync of tools, rewards, and rubrics.
Reward Rubrics
Configure LLM providers for reward rubric evaluation.
Workspace & Team
Manage members, API keys, and integrations.
Getting Started
Sign up, create a project, and launch your first training run.