Skip to main content
When Local Rollout’s MCP-based workflow isn’t flexible enough, Remote Rollout gives you full control. You host an HTTP server that implements your agent logic — multi-step reasoning, custom environments, external APIs — while the training cluster handles LLM inference and trajectory collection.
Prerequisites: You need an Osmosis Platform account and the SDK installed (pip install osmosis-ai[server]). Authenticate with osmosis login, then register your rollout server URL on platform.osmosis.ai when ready to train.

What is Remote Rollout?

Remote Rollout separates agent trajectory generation from training infrastructure:

Training Cluster

Hosts LLM inference (/v1/chat/completions) and receives rollout results (/v1/rollout/completed)

RolloutServer (Your Code)

Implements agent loop with tools, delegates protocol handling to the SDK
This architecture allows you to:
  • Define custom tools - Implement any tools your agent needs (calculators, web search, code execution, etc.)
  • Control agent logic - Build sophisticated agent loops with custom reasoning
  • Collect training data - Automatically gather trajectories for reinforcement learning
  • Scale independently - Run multiple agent servers without modifying training infrastructure

How It Works

Protocol Flow

  1. Init Rollout (/v1/rollout/init):
    • Training cluster sends initial messages and parameters
    • SDK calls your get_tools() method
    • Returns 202 Accepted with InitResponse containing the tools list
    • This tells the training cluster what tools are available for this rollout
  2. Agent Loop (run() method):
    • Your code alternates between LLM calls and tool execution
    • Uses ctx.chat() to call the training cluster’s LLM endpoint
  3. Complete (/v1/rollout/completed):
    • Send final trajectory back to training
    • Include reward only if platform is configured to compute reward in remote rollout

Example Repository

We provide a complete example repository you can use as a starting point:

osmosis-remote-rollout-example

Full calculator agent example with tools, rewards, and test datasets
The example includes:
  • Complete agent loop implementation
  • Tool definitions and execution
  • Example reward computation logic
  • Test dataset in JSONL format
  • CLI usage examples
Ready to build your own agent? Follow the Quick Start guide to get up and running in 5 minutes.

Next Steps

Quick Start

Build your first agent in 5 minutes

Agent Loop Guide

Deep dive into RolloutAgentLoop implementation