RolloutAgentLoop base class and best practices for implementing custom agents.
Required vs Optional Features
Before diving in, here’s a quick summary of what’s required and what’s optional:Required
| Feature | Description |
|---|---|
name attribute | Unique identifier for your agent |
get_tools(request) method | Returns tools list when /v1/rollout/init is called. The training cluster needs this to know what tools are available. |
run(ctx) method | Executes your agent loop logic |
Optional
| Feature | Description |
|---|---|
ctx.log_event() | Debug logging - only writes if --log flag is enabled |
ctx.record_tool_call() | Metrics tracking for analytics |
get_last_assistant_content() | Helper function for logs/reward - not part of SDK |
compute_reward_from_messages() | Only required if platform is configured to compute reward in remote rollout |
RolloutAgentLoop Base Class
Every agent must inherit fromRolloutAgentLoop and implement two required methods:
Required Attributes
| Attribute | Type | Description |
|---|---|---|
name | string | Unique identifier for your agent |
Required Methods
| Method | Description |
|---|---|
get_tools(request) | Return list of tools in OpenAI function format. Called when /v1/rollout/init is received, the returned tools are included in the response to the training cluster. |
run(ctx) | Execute the agent loop and return result |
When the training cluster sends a request to
/v1/rollout/init, the SDK automatically calls your get_tools() method and returns the tools list in the InitResponse. This tells the training cluster what tools are available for this rollout.RolloutContext
Thectx parameter provides everything needed to run your agent:
Properties
| Property | Type | Description |
|---|---|---|
ctx.request | RolloutRequest | Original request with messages and params |
ctx.tools | list | Tools returned by get_tools() |
Methods
| Method | Description |
|---|---|
ctx.chat(messages, **kwargs) | Call the LLM |
ctx.complete(messages, finish_reason, reward) | Return successful result |
ctx.error(message) | Return error result |
ctx.record_tool_call(latency_ms) | Track tool execution metrics |
ctx.log_event(event_name, **data) | Log debug events (when logging enabled) |
RolloutRequest
The request contains everything from the training cluster:| Field | Type | Description |
|---|---|---|
messages | list | Initial conversation messages |
max_turns | int | Maximum agent turns allowed |
max_tokens_total | int | Token limit for entire rollout |
completion_params | dict | LLM parameters (temperature, etc.) |
metadata | dict | Custom metadata (e.g., ground_truth) |
Implementing Tools
Tool Schema Format
Tools use OpenAI’s function calling format:Tool Execution Helpers
The SDK provides utilities for executing tools:Parallel Tool Execution
Execute multiple tools concurrently:Complete Agent Example
Here’s a full agent implementation with multiple tools. Note the distinction between required and optional features:Server Configuration
create_app() Parameters
Thecreate_app() function accepts several optional parameters for fine-tuning server behavior:
| Parameter | Type | Default | Description |
|---|---|---|---|
agent_loop | RolloutAgentLoop | Required | Your agent loop implementation |
max_concurrent | int | None | None | Maximum concurrent rollouts (None = unlimited) |
record_ttl_seconds | float | None | 3600.0 | TTL for rollout records in seconds |
settings | RolloutSettings | None | None | Custom configuration settings |
debug_dir | str | None | None | Directory for debug logging |
on_startup | Callable[[], Awaitable[None]] | None | None | Async startup callback |
on_shutdown | Callable[[], Awaitable[None]] | None | None | Async shutdown callback |
RolloutSettings Configuration
For advanced configuration, useRolloutSettings with RolloutClientSettings:
Environment Variables
All client settings can be configured via environment variables:| Environment Variable | Description | Default |
|---|---|---|
OSMOSIS_ROLLOUT_CLIENT_TIMEOUT_SECONDS | HTTP timeout | 300.0 |
OSMOSIS_ROLLOUT_CLIENT_MAX_RETRIES | Max retries | 3 |
OSMOSIS_ROLLOUT_CLIENT_COMPLETE_ROLLOUT_RETRIES | Retries for /completed | 2 |
OSMOSIS_ROLLOUT_CLIENT_RETRY_BASE_DELAY | Initial retry delay | 1.0 |
OSMOSIS_ROLLOUT_CLIENT_RETRY_MAX_DELAY | Max retry delay | 30.0 |
OSMOSIS_ROLLOUT_CLIENT_MAX_CONNECTIONS | Connection pool size | 100 |
OSMOSIS_ROLLOUT_CLIENT_MAX_KEEPALIVE_CONNECTIONS | Keep-alive connections | 20 |
The
get_last_assistant_content helper function (seen in some examples) is purely optional - it’s only useful for logging or reward computation. It’s not required by the SDK.Handling Rewards
Reward computation in Remote Rollout is conditional - whether you need to return a reward depends on your platform configuration:| Platform Configuration | Reward Requirement |
|---|---|
| Reward computed on platform side | Return None - no reward needed |
| Reward computed in remote rollout | Must return a float value |
When Reward is Required
If your platform is configured to compute rewards in the remote rollout server, yourrun() method must return a reward value via ctx.complete():
When Reward is Optional
If reward computation is handled elsewhere (e.g., on the platform side), you can simply returnNone:
Using @osmosis_reward Decorator
For complex reward computation, use the@osmosis_reward decorator:
Always include
**kwargs in reward functions for platform compatibility.Helper Functions (Optional)
The following helper functions are purely optional - they’re useful for logging and reward computation but not required by the SDK:Extracting Solutions
Common pattern for extracting answers:Debug Logging
Usectx.log_event() to trace execution:
Error Handling
Return errors gracefully:Best Practices
Validate Early
Validate Early
Always run
osmosis validate before deploying to catch issues early.Handle Token Limits
Handle Token Limits
Check
ctx.request.max_tokens_total and break early if exceeded.Track Metrics
Track Metrics
Use
ctx.record_tool_call() to track tool execution for analytics.Test Locally
Test Locally
Use
osmosis test with --interactive to debug agent behavior.Log Strategically
Log Strategically
Use
ctx.log_event() at key points (pre-LLM, post-tool, completion).