Osmosis Platform is the central hub for managing reinforcement learning (RL) training of LLMs. It handles GPU provisioning, training orchestration, metrics collection, and model management, so you can focus on defining agent behavior and evaluation logic.

Core Concepts

| Concept | Description |
| --- | --- |
| Workspace | Top-level container for your team. Holds projects, members, API keys, and integrations. |
| Project | A training context within a workspace. Contains training runs, reward functions, tools, and datasets. |
| Training Run | A single RL training session. Configured with a base model, dataset, reward functions, and tools. |
| Reward Function | A Python function that scores LLM outputs deterministically and returns a float (see the example below this table). |
| Reward Rubric | Natural-language criteria evaluated by an LLM judge during training. |
| MCP Tools | Functions the agent can call during rollouts (calculators, search, code execution, etc.). |
| Rollout Server | An HTTP server you build that implements your custom agent loop. The training cluster sends rollout requests to your server, which orchestrates tool calls and LLM inference, then returns trajectories and rewards. Built with the Python SDK's RolloutAgentLoop API. |
| Checkpoint | A saved model state during training. Can be merged and exported to Hugging Face. |
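Because a reward function is deterministic Python returning a float, it can be as simple as string matching. The sketch below is illustrative only; the signature shown here (a completion plus the expected answer) is an assumption, not the SDK's documented interface.

```python
import re


def exact_answer_reward(completion: str, expected: str) -> float:
    """Deterministically score a completion against an expected answer.

    Hypothetical signature: the interface Osmosis actually expects for
    reward functions may differ.
    """
    # Full credit when the output ends with a line like "Answer: 42"
    # that matches the expected string exactly.
    match = re.search(r"Answer:\s*(.+?)\s*$", completion.strip())
    if match and match.group(1) == expected:
        return 1.0
    # Partial credit when the expected answer appears anywhere in the text.
    if expected in completion:
        return 0.5
    return 0.0
```

Because no LLM sits in the scoring loop, identical outputs always receive identical scores; criteria that require judgment belong in a reward rubric instead.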

Architecture

Osmosis supports two rollout modes. Choose one depending on where you want the agent loop to run during training.

Option A: Local Rollout

The agent loop runs inside the training cluster. Push reward functions, rubrics, and MCP tools to GitHub; the platform syncs and runs everything. No server to deploy.
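As a sketch of what a synced MCP tool can look like, the snippet below uses FastMCP from the official MCP Python SDK to expose two calculator tools. Whether Osmosis expects FastMCP specifically, and how tool files should be laid out in the GitHub repo, are assumptions; check the platform docs for its conventions.

```python
from mcp.server.fastmcp import FastMCP

# Illustrative tool server; Osmosis's conventions for tools synced
# from GitHub may differ.
mcp = FastMCP("calculator")


@mcp.tool()
def add(a: float, b: float) -> float:
    """Add two numbers and return the sum."""
    return a + b


@mcp.tool()
def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b


if __name__ == "__main__":
    mcp.run()
```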

Option B: Remote Rollout

The agent loop runs on your own server. You only need to implement two methods, get_tools() (define available tools) and run() (agent loop logic), by subclassing RolloutAgentLoop. The Python SDK automatically creates a trainer-compatible HTTP server that handles all protocol details (/v1/rollout/init, /v1/chat/completions, /v1/rollout/completed). You keep full control over agent logic and tool execution.
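A minimal sketch of such a server follows. Only the class name RolloutAgentLoop and the two method names come from this page; the import path, method signatures, and helpers like self.chat(), self.execute_tool(), and self.score() are hypothetical stand-ins for whatever the Python SDK actually provides.

```python
from osmosis_ai import RolloutAgentLoop  # hypothetical import path


class CalculatorAgent(RolloutAgentLoop):
    def get_tools(self):
        # Advertise the tools the agent may call during a rollout.
        # The tool schema shown here is an assumption.
        return [
            {
                "name": "add",
                "description": "Add two numbers.",
                "parameters": {"a": "number", "b": "number"},
            }
        ]

    def run(self, request):
        # Agent loop: alternate LLM inference and tool execution until the
        # model stops requesting tools, then hand back the trajectory.
        messages = list(request.messages)
        while True:
            # self.chat() stands in for the SDK call that routes inference
            # through the trainer (the /v1/chat/completions leg).
            reply = self.chat(messages)
            messages.append(reply)
            if not reply.tool_calls:
                break
            for call in reply.tool_calls:
                # self.execute_tool() is a hypothetical helper that runs
                # one tool call and wraps the result as a message.
                messages.append(self.execute_tool(call))
        # Return the trajectory and its reward to the trainer;
        # self.score() is a placeholder for your reward logic.
        return {"messages": messages, "reward": self.score(messages)}
```

The SDK wraps the subclass in the HTTP server itself, so /v1/rollout/init, /v1/chat/completions, and /v1/rollout/completed never have to be implemented by hand.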

Feature Summary