Follow these recommendations to keep your Osmosis-synced repository reliable, and refer to the troubleshooting section if something isn’t working as expected.

Testing

Write Unit Tests

Create comprehensive tests for your functions:
# tests/test_reward_functions.py
import pytest
from reward_fn.compute_reward import numbers_match_reward

def test_exact_match():
    """Test exact numerical match"""
    score = numbers_match_reward("#### 42", "42")
    assert score == 1.0

def test_close_match():
    """Test near-match within epsilon"""
    score = numbers_match_reward("#### 42.0000001", "42")
    assert score == 1.0

def test_mismatch():
    """Test completely different values"""
    score = numbers_match_reward("#### 100", "42")
    assert score == 0.0

def test_invalid_format():
    """Test handling of invalid input format"""
    score = numbers_match_reward("no number here", "42")
    assert score == 0.0

def test_missing_solution():
    """Test handling of empty solution"""
    score = numbers_match_reward("", "42")
    assert score == 0.0

@pytest.mark.parametrize("solution,ground_truth,expected", [
    ("#### 1", "1", 1.0),
    ("#### 0", "0", 1.0),
    ("#### -5", "-5", 1.0),
    ("#### 3.14159", "3.14159", 1.0),
])
def test_various_numbers(solution, ground_truth, expected):
    """Test various number formats"""
    score = numbers_match_reward(solution, ground_truth)
    assert score == expected
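
For reference, here is a minimal sketch of a function that would satisfy the tests above. It is not the reference implementation: the `#### <number>` answer format is inferred from the test inputs, the `epsilon` parameter and its default of `1e-6` are assumptions, and the `@osmosis_reward` decorator is omitted so the snippet runs without the SDK installed.

```python
import math
import re

# Assumed answer format: the final answer follows a "####" marker.
_ANSWER_RE = re.compile(r"####\s*(-?\d+(?:\.\d+)?)")


def numbers_match_reward(solution_str, ground_truth, extra_info=None,
                         epsilon=1e-6, **kwargs) -> float:
    """Return 1.0 if the number after '####' matches ground_truth, else 0.0."""
    match = _ANSWER_RE.search(solution_str or "")
    if match is None:
        # No "#### <number>" marker found (covers empty and malformed input).
        return 0.0
    try:
        predicted = float(match.group(1))
        expected = float(ground_truth)
    except (TypeError, ValueError):
        # ground_truth was missing or not numeric.
        return 0.0
    # Absolute tolerance only; epsilon is an assumed default.
    matched = math.isclose(predicted, expected, rel_tol=0.0, abs_tol=epsilon)
    return 1.0 if matched else 0.0
```

Returning `0.0` for every failure path, rather than raising, keeps a single malformed sample from interrupting evaluation.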

CI/CD Integration

GitHub Actions Workflow

Create .github/workflows/test.yml:
name: Test and Validate

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e .
          pip install pytest pytest-cov

      - name: Run tests
        run: |
          pytest tests/ -v --cov=. --cov-report=term-missing

      - name: Lint code
        run: |
          pip install ruff
          ruff check .

      - name: Type check
        run: |
          pip install mypy
          mypy mcp/ reward_fn/ reward_rubric/

      - name: Test MCP server
        run: |
          python mcp/main.py &
          sleep 5
          python mcp/test/test.py
          pkill -f "python mcp/main.py"
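
The fixed `sleep 5` in the last step can be flaky on a slow runner. One possible alternative, assuming the MCP server listens on a TCP port, is to poll until the port accepts connections. The helper below is a hypothetical sketch; `wait_for_port` is not part of any SDK, and the host and port you pass depend on how your server is configured.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.25) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A test script could call this helper before sending its first request and exit with a non-zero status if it returns `False`, so the CI step fails with a clear cause instead of a connection error.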

Monitoring and Debugging

Add Logging

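import logging

# Assumes the osmosis-ai Python SDK is installed
from osmosis_ai import osmosis_reward

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@osmosis_reward
def logged_reward(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    """Reward function with logging"""
    # Log only a prefix so long solutions don't flood the logs
    logger.info("Evaluating solution: %s...", solution_str[:50])

    try:
        # compute_score stands in for your own scoring logic
        score = compute_score(solution_str, ground_truth)
        logger.info("Computed score: %s", score)
        return score
    except Exception:
        # logger.exception records the full traceback, not just the message
        logger.exception("Error computing score")
        return 0.0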

Troubleshooting

Reward Function Issues

Problem: Reward functions return unexpected scores.
Solutions:
  • Test locally with sample inputs
  • Add print statements or logging
  • Verify input format matches expectations
  • Check error handling catches all edge cases
  • Ensure return type is float
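
Several of these checks can be automated with a small local harness. The sketch below is illustrative only: `check_reward_fn` is a hypothetical helper, not an SDK function, and it assumes the reward function can be called with just a solution string and a ground truth.

```python
import math


def check_reward_fn(reward_fn, samples) -> list:
    """Run reward_fn over (solution_str, ground_truth) pairs and list problems."""
    problems = []
    for solution_str, ground_truth in samples:
        try:
            score = reward_fn(solution_str, ground_truth)
        except Exception as exc:
            # An uncaught exception means error handling missed an edge case.
            problems.append(
                f"{solution_str!r}: raised {type(exc).__name__}: {exc}"
            )
            continue
        if not isinstance(score, float):
            # int and bool results are reported too, since they are not float.
            problems.append(
                f"{solution_str!r}: returned {type(score).__name__}, expected float"
            )
        elif math.isnan(score):
            problems.append(f"{solution_str!r}: returned NaN")
    return problems
```

Running it over a handful of inputs, including an empty string and a malformed answer, returns an empty list when the function behaves as expected.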

Rubric Evaluation Issues

Problem: Rubric scores are inconsistent, or evaluation raises errors.
Solutions:
  • Verify API key is set correctly
  • Check API key has sufficient credits/quota
  • Test with simpler rubric first
  • Add error handling around evaluate_rubric call
  • Use return_details=True to see evaluation reasoning
  • Verify model name is correct for provider
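
Error handling around the rubric call might look like the sketch below. To avoid guessing at the SDK's signature, the wrapper takes the evaluation callable as a parameter and forwards keyword arguments unchanged. `safe_rubric_score`, its retry count, backoff, and fallback score are all hypothetical, and the wrapper assumes the callable returns a plain number (that is, it is called without `return_details=True`).

```python
import logging
import time

logger = logging.getLogger(__name__)


def safe_rubric_score(evaluate, *, retries: int = 2, backoff: float = 1.0,
                      fallback: float = 0.0, **kwargs) -> float:
    """Call evaluate(**kwargs), retrying on failure; return fallback if all fail."""
    for attempt in range(retries + 1):
        try:
            return float(evaluate(**kwargs))
        except Exception:
            # Record the traceback so API key, quota, or model-name errors are visible.
            logger.exception(
                "Rubric evaluation failed (attempt %d of %d)",
                attempt + 1, retries + 1,
            )
            if attempt < retries:
                # Exponential backoff between attempts: backoff, 2*backoff, ...
                time.sleep(backoff * 2 ** attempt)
    return fallback
```

In practice you would pass `evaluate_rubric` as the first argument, followed by whatever keyword arguments your rubric call already uses. Retrying helps with transient provider errors, but a wrong API key or model name will fail on every attempt, so check the logged traceback first.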

Import Errors

Problem: ModuleNotFoundError or import failures
Solutions:
  • Ensure all directories have __init__.py files
  • Verify imports use correct paths
  • Check dependencies are installed: pip install -e .
  • Use absolute imports from package root
  • Verify virtual environment is activated
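
The first check in this list is easy to script. The following sketch walks a repository and reports every directory on the path to a `.py` file that lacks an `__init__.py`. `find_missing_init` and its default skip list are hypothetical, and the repository root itself is deliberately not checked.

```python
from pathlib import Path


def find_missing_init(root, skip=(".git", ".venv", "venv", "__pycache__")) -> list:
    """Return directories under root that hold Python code but no __init__.py."""
    root = Path(root)
    candidates = set()
    for py_file in root.rglob("*.py"):
        rel_parts = py_file.parent.relative_to(root).parts
        if any(part in skip for part in rel_parts):
            continue
        # Every directory between root and the file must be a package.
        for depth in range(1, len(rel_parts) + 1):
            candidates.add(root.joinpath(*rel_parts[:depth]))
    return sorted(d for d in candidates if not (d / "__init__.py").is_file())
```

Calling `find_missing_init(".")` from the repository root prints nothing to fix when the returned list is empty. Directories such as `tests/` may show up as well; whether they need an `__init__.py` depends on how your test runner is configured.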

Next Steps

Example Repository

Study the complete reference implementation

Python SDK

Learn more about the Python SDK