

When you submit a training run with osmosis train submit, Osmosis provisions and runs your rollout for you — you don’t pick or configure a backend. This page is an SDK-level reference for the open source osmosis-ai package, useful when you want to drive rollouts yourself (e.g. custom local harnesses or self-hosted experimentation).
An execution backend determines where an AgentWorkflow and Grader run when you orchestrate them through the SDK directly. You can execute everything in the current Python process for simplicity, or isolate each execution inside a Docker container.

ExecutionBackend Interface

All backends implement the ExecutionBackend abstract class:
```python
class ExecutionBackend(ABC):
    async def execute(
        self,
        request: ExecutionRequest,
        on_workflow_complete: ResultCallback,
        on_grader_complete: ResultCallback | None = None,
    ) -> None: ...

    @property
    def max_concurrency(self) -> int: ...  # 0 = no limit

    def health(self) -> dict[str, Any]: ...  # default: {"status": "ok"}
```
| Method | Description |
| --- | --- |
| `execute()` | Runs an AgentWorkflow (and optionally a Grader) for a single request |
| `max_concurrency` | Maximum parallel executions. 0 means no limit. |
| `health()` | Returns backend health status. Default implementation returns `{"status": "ok"}`. |

LocalBackend

Executes your AgentWorkflow and Grader directly in the current Python process. No additional infrastructure required.
```python
from osmosis_ai.rollout.backend import LocalBackend

backend = LocalBackend(
    workflow=MyWorkflow,
    grader=MyGrader,
)
```
Constructor parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| `workflow` | class or `"module:attr"` string | Your AgentWorkflow subclass or a colon-separated import path (e.g. `"my_module:MyWorkflow"`) |
| `workflow_config` | `AgentWorkflowConfig \| None` | Optional config passed to the workflow |
| `grader` | class or `"module:attr"` string | Your Grader subclass or a colon-separated import path |
| `grader_config` | `GraderConfig \| None` | Optional config passed to the grader |
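The `"module:attr"` form follows the common colon-separated import-path convention. A hypothetical resolver (illustrative, not the SDK's internal code) would behave like this:

```python
import importlib

def resolve(target):
    """Accept either a class object or a "module:attr" import path."""
    if isinstance(target, str):
        module_name, _, attr_name = target.partition(":")
        module = importlib.import_module(module_name)
        return getattr(module, attr_name)
    return target

# A string resolves to the named attribute; a class passes through unchanged.
ordered_dict_cls = resolve("collections:OrderedDict")
```

The string form is handy when the workflow class lives in a module that should only be imported inside the execution environment.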
Execution flow:

1. Receive request: the backend receives an ExecutionRequest containing the input prompt and reference answer for one dataset row.
2. Run workflow: creates an AgentWorkflowContext, establishes a RolloutContext, and calls workflow.run(ctx).
3. Collect samples: collects all RolloutSample objects from the RolloutContext after the workflow completes.
4. Run grader: creates a GraderContext with the collected samples and that row's reference answer, then calls grader.grade(ctx).
5. Return results: returns an ExecutionResult containing the samples and their assigned rewards.
Concurrency is controlled via AgentWorkflowConfig.concurrency.max_concurrent (default: 4). The backend uses a ConcurrencyLimiter to cap parallel workflow executions.

Error categorization: the LocalBackend maps exceptions to structured error types. TimeoutError becomes TIMEOUT, ValueError and TypeError become VALIDATION_ERROR, and all other exceptions become AGENT_ERROR.
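The exception-to-error-type mapping can be sketched as follows. The category names match the description above, but the function itself is illustrative, not the backend's actual code:

```python
def categorize_error(exc: BaseException) -> str:
    """Map an exception to one of the structured error categories."""
    if isinstance(exc, TimeoutError):
        return "TIMEOUT"
    if isinstance(exc, (ValueError, TypeError)):
        return "VALIDATION_ERROR"
    return "AGENT_ERROR"
```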

HarborBackend

Executes your AgentWorkflow and Grader inside Docker containers via the harbor package. Each execution gets its own isolated container with an independent filesystem and process space.
```python
from osmosis_ai.rollout.backend import HarborBackend

backend = HarborBackend(
    orchestrator=orchestrator,
    task_dir="/path/to/task",
    user_code_dir="/path/to/code",
    workflow=MyWorkflow,
    grader=MyGrader,
)
```
Key characteristics:
  • Pre-builds a Docker image at construction time
  • Each execution runs in a fresh container
  • HarborAgentWorkflowContext extends AgentWorkflowContext with an environment object for container interaction — environment.exec(), environment.upload_file(), etc.
HarborBackend shells out to Docker when it is constructed, so Docker must be installed and running on whichever host instantiates it. If you are submitting a training run through osmosis train submit, this does not apply — the platform manages execution for you.

Comparison

| | LocalBackend | HarborBackend |
| --- | --- | --- |
| Execution environment | Current Python process | Docker container |
| Isolation | None (shares process memory) | Full (isolated filesystem and process) |
| Startup speed | Instant | Requires Docker image build |
| Dependency management | Shared Python environment | Independent per container |
| Debugging | Direct (breakpoints, print statements) | Container logs, trace files |
| Best for | Development, eval, simple deployments | Production, complex agents |

Choosing a Backend

When driving the SDK directly, start with LocalBackend — it has no infrastructure dependencies and is easiest to debug. Reach for HarborBackend only when you need filesystem/process isolation or want each rollout to manage its own Python dependencies.
osmosis eval run uses LocalBackend internally. If you embed osmosis-ai in your own harness, you can wire up either backend yourself.

Next Steps

Local Evaluation

Evaluate your rollout locally with osmosis eval run, or use it as a pre-training smoke test.

Strands Integration

Use the AWS Strands agent framework with Osmosis for training.