Most users do not configure execution backends from the CLI.
osmosis eval submit and osmosis train submit both hand execution to the Osmosis platform; the rollout entrypoint constructs the backend on the server side. This page is an SDK-level guide for users embedding the open-source osmosis-ai package in custom harnesses or self-hosted experiments.
Execution backends determine where AgentWorkflow and Grader run when you orchestrate rollouts through the SDK directly.
| Backend | Runs where | Best for |
|---|---|---|
| LocalBackend | Current Python process | Fast local development, custom eval harnesses, debugging |
| HarborBackend | Harbor-managed trial environments | Per-trial isolation, task environments, dependency separation |
Backend Responsibilities
Every backend has the same core responsibilities:
1. Receive an execution request. The request contains the input prompt for one dataset row and, if grading is enabled, the reference label.
2. Run the workflow. The backend creates an AgentWorkflowContext, installs a RolloutContext, and calls AgentWorkflow.run(ctx).
3. Collect samples. Agent integrations such as OsmosisStrandsAgent and OsmosisAgent register sample sources on the rollout context. The backend collects those samples after the workflow finishes.
4. Run the grader. If a grader and label are available, the backend creates a GraderContext and calls Grader.grade(ctx).

ExecutionBackend Interface
All backends implement this shape:

| Method | Description |
|---|---|
| execute() | Runs an AgentWorkflow and optionally a Grader for one request |
| max_concurrency | Maximum parallel executions. 0 means no limit. |
| health() | Returns backend health information |
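The shape in the table, together with the four responsibilities listed earlier, can be sketched in plain Python. This is an illustration only, not the SDK's code: ExecutionRequest, ExecutionResult, BackendShape, and ToyBackend are hypothetical names, the workflow and grader are reduced to plain callables, and treating execute() as a coroutine is an assumption.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Optional, Protocol


@dataclass
class ExecutionRequest:
    """Hypothetical request: the prompt for one dataset row plus an optional label."""
    prompt: str
    label: Optional[str] = None


@dataclass
class ExecutionResult:
    """Hypothetical result: collected samples plus an optional score."""
    samples: List[str] = field(default_factory=list)
    score: Optional[float] = None


class BackendShape(Protocol):
    """Illustrative stand-in for the three members in the table (not the SDK class)."""
    max_concurrency: int  # 0 means no limit

    async def execute(self, request: ExecutionRequest) -> ExecutionResult: ...

    def health(self) -> dict: ...


class ToyBackend:
    """Minimal in-process backend that walks through the four responsibilities."""
    max_concurrency = 0

    def __init__(self, workflow, grader=None):
        self.workflow = workflow  # async callable: (prompt, samples) -> None
        self.grader = grader      # callable: (samples, label) -> float

    async def execute(self, request: ExecutionRequest) -> ExecutionResult:
        samples: List[str] = []                       # stand-in for registered sample sources
        await self.workflow(request.prompt, samples)  # run the workflow
        result = ExecutionResult(samples=list(samples))  # collect samples after it finishes
        # Grade only when both a grader and a label are available.
        if self.grader is not None and request.label is not None:
            result.score = self.grader(result.samples, request.label)
        return result

    def health(self) -> dict:
        return {"status": "ok", "max_concurrency": self.max_concurrency}


async def upper_workflow(prompt, samples):
    """Toy workflow: records one sample derived from the prompt."""
    samples.append(prompt.upper())


def exact_match(samples, label):
    """Toy grader: 1.0 if the last sample equals the label, else 0.0."""
    return 1.0 if samples and samples[-1] == label else 0.0


if __name__ == "__main__":
    backend = ToyBackend(upper_workflow, exact_match)
    print(asyncio.run(backend.execute(ExecutionRequest("hi", "HI"))))
```

The grader step is skipped when the request carries no label, which mirrors the "if a grader and label are available" condition.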
LocalBackend
LocalBackend executes your workflow and grader directly in the current Python process.
| Parameter | Type | Description |
|---|---|---|
| workflow | class or "module:attr" string | AgentWorkflow subclass or import path |
| workflow_config | AgentWorkflowConfig \| None | Optional config passed to the workflow |
| grader | class or "module:attr" string | Optional Grader subclass or import path |
| grader_config | GraderConfig \| None | Optional config passed to the grader |
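The workflow and grader parameters accept either a class or a "module:attr" string. How the SDK resolves that string is not documented here; the following is a minimal sketch of one common approach using importlib, with resolve_target as a hypothetical helper name.

```python
import importlib


def resolve_target(target):
    """Return `target` unchanged if it is not a string, else import 'module:attr'.

    Hypothetical helper for illustration; the SDK's own resolution logic may differ.
    """
    if not isinstance(target, str):
        return target  # already a class or other object
    module_name, sep, attr_path = target.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:attr', got {target!r}")
    obj = importlib.import_module(module_name)
    # Walk dotted attributes so 'pkg.mod:Outer.Inner' also resolves.
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return obj
```

Passing a string keeps the harness configuration serializable, while passing the class directly is convenient in a notebook or debugger session.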
Use LocalBackend when:
- You want the shortest debug loop.
- You need breakpoints, stack traces, or simple print debugging.
- Your rollout can share the current Python environment.
- You are building a custom harness or eval runner around the SDK.
LocalBackend uses AgentWorkflowConfig.concurrency.max_concurrent to limit parallel workflow executions. If no workflow config is provided, the default concurrency is 4.
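A concurrency cap like this is commonly implemented with a semaphore. The sketch below assumes an asyncio-based runner, which the page does not confirm; run_limited and DEFAULT_CONCURRENCY are hypothetical names, and only the default of 4 comes from the text above.

```python
import asyncio

DEFAULT_CONCURRENCY = 4  # default stated above when no workflow config is provided


async def run_limited(jobs, max_concurrent=DEFAULT_CONCURRENCY):
    """Run zero-argument coroutine factories with at most `max_concurrent` in flight."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1 in this sketch")
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job):
        # Each job waits here until one of the slots is free.
        async with semaphore:
            return await job()

    # gather preserves the order of `jobs` in the returned list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

A harness would call it as `asyncio.run(run_limited(jobs))`, where each job is a function that returns a fresh coroutine for one execution request.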
Local Error Categories
LocalBackend maps workflow and grader exceptions into structured categories:
| Exception | Category |
|---|---|
| TimeoutError | TIMEOUT |
| ValueError, TypeError, AssertionError | VALIDATION_ERROR |
| Other exceptions | AGENT_ERROR |
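The mapping in the table can be written as a small function. This is a sketch of the table, not the SDK's implementation: categorize is a hypothetical name, and matching with isinstance (so that subclasses inherit their parent's category) is an assumption.

```python
def categorize(exc: BaseException) -> str:
    """Map an exception to one of the category names from the table."""
    if isinstance(exc, TimeoutError):
        return "TIMEOUT"
    if isinstance(exc, (ValueError, TypeError, AssertionError)):
        return "VALIDATION_ERROR"
    # Everything else falls through to the catch-all category.
    return "AGENT_ERROR"
```

Under this assumption a subclass such as UnicodeDecodeError, which derives from ValueError, would be reported as VALIDATION_ERROR.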
HarborBackend
HarborBackend runs workflows inside Harbor-managed trial environments. For platform-facing Harbor rollouts, use the Daytona-backed path used by the workspace template; the managed platform does not currently support Docker-backed Harbor execution.
| Parameter | Description |
|---|---|
| orchestrator | Harbor TrialQueue used to run trials |
| task_dir | Task directory used to build the Harbor environment |
| user_code_dir | Directory containing rollout code to copy into the Harbor workspace |
| workflow | AgentWorkflow class or import path |
| workflow_config | Optional workflow config class/object import path |
| grader | Optional Grader class or import path |
| grader_config | Optional grader config class/object import path |
| environment_config | Optional Harbor environment configuration; platform-facing Harbor rollouts should use Daytona |
| prebuild_local_image | Local Docker runtime option for SDK experiments; not supported by the managed platform |
| cleanup_successful_trials | Whether to remove successful trial artifacts after completion |
Harbor Runtime Notes
HarborBackend runs the workflow inside the Harbor-managed trial environment, but the workflow still receives the standard AgentWorkflowContext. There is no separate ctx.environment object in rollout code. If your workflow needs files, tools, or processes inside the trial environment, package them in the Harbor task environment and access them through normal Python code from within run().
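As an illustration of "normal Python code from within run()", the sketch below reads a file that is assumed to have been packaged into the task environment. ReadFixtureWorkflow is a hypothetical stand-in rather than a real AgentWorkflow subclass, the fixture path is made up for the demo, and the real run() may be a coroutine; the point is only that the file is reached through the standard library and ctx is not consulted for it.

```python
import tempfile
from pathlib import Path


class ReadFixtureWorkflow:
    """Stand-in for an AgentWorkflow subclass; the real base class is omitted on purpose."""

    def __init__(self, fixture_path):
        # Hypothetical location of a file shipped with the task environment.
        self.fixture_path = Path(fixture_path)

    def run(self, ctx):
        # Plain standard-library file access; nothing is looked up on ctx.
        text = self.fixture_path.read_text(encoding="utf-8")
        return text.splitlines()


def demo():
    """Create a throwaway fixture file and run the workflow against it."""
    with tempfile.TemporaryDirectory() as tmp:
        fixture = Path(tmp) / "fixture.txt"
        fixture.write_text("alpha\nbeta\n", encoding="utf-8")
        return ReadFixtureWorkflow(fixture).run(ctx=None)
```

The same pattern applies to spawning tools with subprocess or writing scratch files: the code runs inside the trial environment, so ordinary paths and processes refer to that environment.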
Comparison
| Dimension | LocalBackend | HarborBackend |
|---|---|---|
| Execution environment | Current Python process | Harbor trial environment |
| Isolation | Shared process and filesystem | Isolated process and filesystem per trial |
| Startup cost | Minimal | Prepares Harbor task environment |
| Dependencies | Current Python environment | Harbor task environment |
| Debugging | Direct debugger, stack traces, print output | Harbor logs and trial artifacts |
| Typical user | SDK harness author, eval tooling | Self-hosted experiments needing isolation |
Choosing a Backend
Reach for HarborBackend when:
- Untrusted or messy rollout code should not share the host process.
- Tools write files or spawn processes that should be isolated per trial.
- You need a reproducible task environment per rollout execution.
- You are experimenting with Harbor outside the managed Osmosis training path.
Relationship to Eval and Training
osmosis eval submit and osmosis train submit both run the rollout server-side and do not expose these SDK backends as user-facing options. The entrypoint constructs the backend on the platform when the rollout server starts. These SDK-level backends only matter when you embed osmosis-ai in your own harness.
Next Steps
- Evaluation: Submit an evaluation run before a training run.
- Building AgentWorkflows: Learn how workflows register samples through supported integrations.
- Strands Integration: Use Strands Agents inside an Osmosis workflow.
- OpenAI Agents Integration: Use the OpenAI Agents SDK inside an Osmosis workflow.