API Reference

Complete reference for the osmosis-ai Python SDK. The SDK provides two main capabilities:

Reward & Evaluation

Decorators and functions for scoring LLM outputs using local rules or LLM-based rubrics

Remote Rollout

Build custom agent loops for training infrastructure integration

Reward API

Decorators

@osmosis_reward

Decorator for local reward functions that compute scores without API calls. Use this for deterministic evaluation logic (exact match, regex, keyword matching). Signature:
@osmosis_reward
def function_name(
    solution_str: str,
    ground_truth: str,
    extra_info: dict = None,
    **kwargs
) -> float
Parameters:
  • solution_str (str, required) - Text to evaluate
  • ground_truth (str, required) - Reference answer
  • extra_info (dict, optional) - Additional context
  • **kwargs (required) - Reserved for forward compatibility with future SDK arguments
Returns: float - Score value
Example:
from osmosis_ai import osmosis_reward

@osmosis_reward
def exact_match(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    return 1.0 if solution_str.strip() == ground_truth.strip() else 0.0
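
Local reward functions can encode any deterministic check. As a minimal sketch (the function name and matching logic here are illustrative, not part of the SDK), a regex-based variant of the same pattern:
from osmosis_ai import osmosis_reward
import re

@osmosis_reward
def contains_answer(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    # Award full credit if the ground truth appears as a whole phrase
    # anywhere in the solution, ignoring surrounding whitespace.
    pattern = r"\b" + re.escape(ground_truth.strip()) + r"\b"
    return 1.0 if re.search(pattern, solution_str) else 0.0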

@osmosis_rubric

Decorator for LLM-based evaluation functions. Use this for subjective evaluation that requires semantic understanding (helpfulness, tone, quality). Signature:
@osmosis_rubric
def function_name(
    solution_str: str,
    ground_truth: str | None,
    extra_info: dict,
    **kwargs
) -> float
Parameters:
  • solution_str (str, required) - Text to evaluate
  • ground_truth (str | None, required) - Reference answer (can be None)
  • extra_info (dict, required) - Configuration and context
  • **kwargs (required) - Reserved for forward compatibility with future SDK arguments
Returns: float - Score value
Example:
from osmosis_ai import osmosis_rubric, evaluate_rubric

@osmosis_rubric
def quality_check(solution_str: str, ground_truth: str | None, extra_info: dict, **kwargs) -> float:
    return evaluate_rubric(
        rubric="Evaluate response quality",
        solution_str=solution_str,
        model_info={"provider": "openai", "model": "gpt-5"},
        ground_truth=ground_truth
    )
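
Because extra_info is a required dict here, a natural (but not prescribed) pattern is to let callers override the evaluation model through it. A hedged sketch, assuming the caller passes an optional "model_info" key:
from osmosis_ai import osmosis_rubric, evaluate_rubric

@osmosis_rubric
def tone_check(solution_str: str, ground_truth: str | None, extra_info: dict, **kwargs) -> float:
    # "model_info" in extra_info is an illustrative convention, not an SDK requirement.
    model_info = extra_info.get("model_info") or {"provider": "openai", "model": "gpt-5"}
    return evaluate_rubric(
        rubric="Rate how professional the tone of the response is.",
        solution_str=solution_str,
        model_info=model_info,
        ground_truth=ground_truth
    )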

Rubric Evaluation

evaluate_rubric()

Evaluate text using an LLM-based rubric. This is the core function for LLM-powered evaluation. Signature:
def evaluate_rubric(
    rubric: str,
    solution_str: str,
    model_info: dict,
    ground_truth: str | None = None,
    original_input: str | None = None,
    metadata: dict | None = None,
    score_min: float = 0.0,
    score_max: float = 1.0,
    timeout: int | None = None,
    return_details: bool = False
) -> float | dict
Parameters:
  • rubric (str, required) - Natural language evaluation criteria
  • solution_str (str, required) - Text to evaluate
  • model_info (dict, required) - LLM provider configuration
  • ground_truth (str, optional) - Reference answer
  • original_input (str, optional) - Original user query
  • metadata (dict, optional) - Additional context
  • score_min (float, optional) - Minimum score (default: 0.0)
  • score_max (float, optional) - Maximum score (default: 1.0)
  • timeout (int, optional) - Request timeout in seconds
  • return_details (bool, optional) - Return full response (default: False)
model_info Structure:
{
    "provider": "openai",           # Required
    "model": "gpt-5",         # Required
    "api_key": "sk-...",            # Optional
    "api_key_env": "OPENAI_API_KEY", # Optional
    "timeout": 30                   # Optional
}
Returns:
  • float - Score (when return_details=False)
  • dict - Full response with score, explanation, raw payload (when return_details=True)
Example:
from osmosis_ai import evaluate_rubric

score = evaluate_rubric(
    rubric="Evaluate how helpful the response is.",
    solution_str="Click 'Forgot Password' to reset.",
    model_info={"provider": "openai", "model": "gpt-5"}
)
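
The score range and timeout are adjustable through the documented parameters; for instance, to grade on a 0-10 scale with a 60-second limit:
from osmosis_ai import evaluate_rubric

score = evaluate_rubric(
    rubric="Rate factual accuracy from 0 (wrong) to 10 (fully correct).",
    solution_str="Water boils at 100 degrees Celsius at sea level.",
    model_info={"provider": "openai", "model": "gpt-5"},
    score_min=0.0,
    score_max=10.0,
    timeout=60
)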

Exceptions

MissingAPIKeyError

Raised when an API key is not found for a provider.
from osmosis_ai import MissingAPIKeyError

try:
    score = evaluate_rubric(...)
except MissingAPIKeyError as e:
    print(f"API key not found: {e}")

ProviderRequestError

Raised when a provider request fails.
from osmosis_ai import ProviderRequestError

try:
    score = evaluate_rubric(...)
except ProviderRequestError as e:
    print(f"Provider error: {e}")

ModelNotFoundError

Raised when a specified model is not available (subclass of ProviderRequestError).
from osmosis_ai import ModelNotFoundError

try:
    score = evaluate_rubric(...)
except ModelNotFoundError as e:
    print(f"Model not found: {e}")

Types

ModelInfo (TypedDict)

from osmosis_ai.rubric_types import ModelInfo

model_info: ModelInfo = {
    "provider": "openai",
    "model": "gpt-4o",
    "api_key_env": "OPENAI_API_KEY",
    "timeout": 30
}

RewardRubricRunResult (TypedDict)

Returned when return_details=True:
from osmosis_ai.rubric_types import RewardRubricRunResult

result: RewardRubricRunResult = {
    "score": 0.85,              # float
    "explanation": "...",       # str
    "raw": {...}                # Any - raw LLM response
}
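
For example, requesting details from evaluate_rubric and reading the typed fields:
from osmosis_ai import evaluate_rubric

result = evaluate_rubric(
    rubric="Evaluate how helpful the response is.",
    solution_str="Click 'Forgot Password' to reset.",
    model_info={"provider": "openai", "model": "gpt-5"},
    return_details=True
)
print(result["score"])        # e.g. 0.85
print(result["explanation"])  # the model's reasoning for the score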


Complete Reward Example

from osmosis_ai import osmosis_reward, osmosis_rubric, evaluate_rubric
from dotenv import load_dotenv

load_dotenv()

# Local reward function
@osmosis_reward
def exact_match(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    return 1.0 if solution_str.strip() == ground_truth.strip() else 0.0

# Remote rubric evaluator
@osmosis_rubric
def semantic_eval(solution_str: str, ground_truth: str | None, extra_info: dict, **kwargs) -> float:
    return evaluate_rubric(
        rubric="Compare semantic similarity (0-1 scale)",
        solution_str=solution_str,
        ground_truth=ground_truth,
        model_info={"provider": "openai", "model": "gpt-5"}
    )

# Usage
solution = "The capital of France is Paris"
truth = "Paris is France's capital"

local_score = exact_match(solution, truth)
semantic_score = semantic_eval(solution, truth, {})

print(f"Exact match: {local_score}")      # 0.0
print(f"Semantic: {semantic_score}")      # ~1.0

Remote Rollout API

Build custom agent loops that integrate with Osmosis training infrastructure. Your agent runs as an HTTP server while the training cluster handles LLM inference and trajectory collection.

Quick Overview

  • RolloutAgentLoop - Base class for implementing agents
  • RolloutContext - Execution context with chat(), complete(), error() methods
  • RolloutRequest - Initial request with messages, parameters, and metadata
  • create_app() - Factory function to create FastAPI server

Minimal Example

from osmosis_ai.rollout import RolloutAgentLoop, RolloutContext, RolloutResult, RolloutRequest, create_app

class MyAgent(RolloutAgentLoop):
    name = "my_agent"

    def get_tools(self, request: RolloutRequest) -> list:
        return [{"type": "function", "function": {"name": "search", ...}}]

    async def run(self, ctx: RolloutContext) -> RolloutResult:
        messages = list(ctx.request.messages)
        result = await ctx.chat(messages)
        messages.append(result.message)
        return ctx.complete(messages)

app = create_app(MyAgent())
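
A typical agent extends this into a loop that executes tool calls until the model stops requesting them. The sketch below is an assumption-laden illustration: the tool schema is completed in OpenAI function-calling style, and the shapes returned by get_tool_call_info and expected by create_tool_result are guesses from their names; consult the Remote Rollout docs for the real signatures:
from osmosis_ai.rollout import RolloutAgentLoop, RolloutContext, RolloutResult, RolloutRequest, create_app
from osmosis_ai.rollout.tools import get_tool_call_info, create_tool_result

def run_search(query: str) -> str:
    # Hypothetical tool implementation; replace with real logic.
    return f"results for {query!r}"

class ToolAgent(RolloutAgentLoop):
    name = "tool_agent"

    def get_tools(self, request: RolloutRequest) -> list:
        # OpenAI-style function schema, completed here for illustration.
        return [{
            "type": "function",
            "function": {
                "name": "search",
                "description": "Search for information.",
                "parameters": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            },
        }]

    async def run(self, ctx: RolloutContext) -> RolloutResult:
        messages = list(ctx.request.messages)
        for _ in range(8):  # defensively cap the number of turns
            result = await ctx.chat(messages)
            messages.append(result.message)
            tool_calls = getattr(result.message, "tool_calls", None)
            if not tool_calls:
                break  # model produced a final answer
            for call in tool_calls:
                # Assumed helper shapes: get_tool_call_info extracts the call's
                # name/arguments; create_tool_result wraps the output as a tool
                # message. Verify both against the module documentation.
                info = get_tool_call_info(call)
                output = run_search(info.arguments.get("query", "")) if info.name == "search" else ""
                messages.append(create_tool_result(call, output))
        return ctx.complete(messages)

app = create_app(ToolAgent())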

Utility Modules

  • osmosis_ai.rollout.tools - Tool execution helpers (get_tool_call_info, create_tool_result, etc.)
  • osmosis_ai.rollout.messages - Message utilities (get_message_content, is_assistant_message, etc.)
  • osmosis_ai.rollout.network - Network utilities (detect_public_ip, validate_ipv4, etc.)
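
As one illustrative use of the message helpers (the signatures here are assumed from the names, not documented above), pulling the last assistant reply out of a finished trajectory:
from osmosis_ai.rollout.messages import get_message_content, is_assistant_message

def last_assistant_text(messages: list) -> str | None:
    # Assumed: is_assistant_message(msg) -> bool, get_message_content(msg) -> str.
    for message in reversed(messages):
        if is_assistant_message(message):
            return get_message_content(message)
    return None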

Remote Rollout Documentation

For complete Remote Rollout API documentation, guides, and examples, see the dedicated Remote Rollout section.
