Datasets
Datasets provide the training prompts and ground truth that drive RL training. Each row in a dataset becomes a training example that the model learns from.
Dataset Format
Osmosis accepts datasets in JSONL, CSV, or Parquet format, up to 5 GB per file.
Required Columns
| Column | Description |
|---|---|
| system_prompt | The system prompt provided to the model for this example. |
| user_prompt | The user prompt or question the model must respond to. |
| ground_truth | The expected correct answer or reference output. The platform UI also accepts label as an alias for this column. |
Optional Columns
| Column | Description |
|---|---|
| metadata | Arbitrary JSON metadata attached to each example. Accessible in your Grader via extra_info. |
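The required columns listed above translate into a simple pre-upload check for JSONL files. The following is a minimal sketch in plain Python, not an Osmosis tool: the function name `validate_jsonl_lines` is hypothetical, and it only verifies that each line parses as a JSON object containing the three required columns. It does not handle the `label` alias or the CSV and Parquet formats.

```python
import json

# Required column names, taken from the table above.
REQUIRED_COLUMNS = ("system_prompt", "user_prompt", "ground_truth")


def validate_jsonl_lines(lines):
    """Return a list of (line_number, message) problems found in JSONL lines.

    Hypothetical helper for illustration; `lines` is any iterable of strings,
    such as an open text file.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(row, dict):
            problems.append((number, "row is not a JSON object"))
            continue
        missing = [col for col in REQUIRED_COLUMNS if col not in row]
        if missing:
            problems.append((number, "missing columns: " + ", ".join(missing)))
    return problems
```

To check a file, pass an open file handle, for example `validate_jsonl_lines(open(path, encoding="utf-8"))`, where `path` is your own dataset file. An empty list means no problems were found by this particular check.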
Example JSONL
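As an illustration of the format, the sketch below builds two rows in Python and serializes them as JSONL, one JSON object per line. The prompts, answers, and the `difficulty` metadata key are made-up sample values, and representing `metadata` as a nested JSON object and `ground_truth` as a string are assumptions rather than documented requirements.

```python
import json

# Hypothetical example rows; all values here are invented for illustration.
rows = [
    {
        "system_prompt": "You are a careful math tutor.",
        "user_prompt": "What is 12 * 7?",
        "ground_truth": "84",
    },
    {
        "system_prompt": "You are a careful math tutor.",
        "user_prompt": "What is the square root of 81?",
        "ground_truth": "9",
        # Optional column; assumed here to be a nested JSON object.
        "metadata": {"difficulty": "easy"},
    },
]

# JSONL is one JSON object per line, with no enclosing array or trailing commas.
jsonl_text = "\n".join(json.dumps(row) for row in rows) + "\n"
print(jsonl_text, end="")
```

The first printed line is `{"system_prompt": "You are a careful math tutor.", "user_prompt": "What is 12 * 7?", "ground_truth": "84"}`. Writing `jsonl_text` to a file with a `.jsonl` extension yields a dataset with the three required columns on every row.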
Uploading a Dataset
After upload, each dataset reports one of the following statuses:
| Status | Description |
|---|---|
| pending | Upload received, waiting to be processed. |
| processing | Dataset is being validated and indexed. |
| uploaded | Dataset is ready for use in training runs. |
| error | Processing failed — check column names and file format. |
| cancelled | Upload was cancelled before processing completed. |
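When scripting around uploads, it can help to treat the statuses in the table above as a small, fixed set. The sketch below is illustrative only: the `DatasetStatus` enum and the helper names are hypothetical and not part of any Osmosis SDK, and grouping `pending` and `processing` as "in progress" is inferred from the descriptions in the table.

```python
from enum import Enum


class DatasetStatus(str, Enum):
    """Status values copied from the table above (hypothetical enum)."""

    PENDING = "pending"
    PROCESSING = "processing"
    UPLOADED = "uploaded"
    ERROR = "error"
    CANCELLED = "cancelled"


# Assumption: these two statuses mean work is still ongoing.
IN_PROGRESS = {DatasetStatus.PENDING, DatasetStatus.PROCESSING}


def is_finished(status: str) -> bool:
    """True once the dataset has left the pending/processing states."""
    return DatasetStatus(status) not in IN_PROGRESS


def is_ready(status: str) -> bool:
    """True only when the dataset can be used in training runs."""
    return DatasetStatus(status) is DatasetStatus.UPLOADED
```

Passing an unknown status string raises `ValueError`, which surfaces unexpected values early instead of silently treating them as ready.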
Validating Locally
Before uploading, validate your dataset locally to catch format issues early.
Previewing a Dataset
Preview the first few rows of an uploaded dataset.
Managing Datasets
Models
Supported Base Models
Osmosis uses models imported from Hugging Face as the starting point for training. We currently support:
| Model | Description |
|---|---|
| Qwen/Qwen3.5-35B-A3B | Qwen 3.5 35B with 3B active parameters (MoE) |
| Qwen/Qwen3.5-122B-A10B | Qwen 3.5 122B with 10B active parameters (MoE) |
The list of supported models is expanding. Check the platform dashboard or run osmosis model list for the latest available models.
Model Management
Private Models
To use private models from Hugging Face, configure your Hugging Face access token on the Secrets page in your workspace settings. This allows the platform to pull gated or private models during training. See Monitoring & Settings for details on managing secrets.
Next Steps
Training Runs
Submit and manage training runs using your datasets and models.
Monitoring & Settings
Track training progress and configure workspace settings.