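Best Practices

Follow these recommendations to keep your Osmosis-synced repository reliable; if you run into problems, see the Troubleshooting section.

Testing

Writing unit tests

Create comprehensive tests for your functions: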

# tests/test_reward_functions.py
import pytest
from reward_fn.compute_reward import numbers_match_reward

def test_exact_match():
    """Test exact numerical match"""
    score = numbers_match_reward("#### 42", "42")
    assert score == 1.0

def test_close_match():
    """Test near-match within epsilon"""
    score = numbers_match_reward("#### 42.0000001", "42")
    assert score == 1.0

def test_mismatch():
    """Test completely different values"""
    score = numbers_match_reward("#### 100", "42")
    assert score == 0.0

def test_invalid_format():
    """Test handling of invalid input format"""
    score = numbers_match_reward("no number here", "42")
    assert score == 0.0

def test_missing_solution():
    """Test handling of empty solution"""
    score = numbers_match_reward("", "42")
    assert score == 0.0

@pytest.mark.parametrize("solution,ground_truth,expected", [
    ("#### 1", "1", 1.0),
    ("#### 0", "0", 1.0),
    ("#### -5", "-5", 1.0),
    ("#### 3.14159", "3.14159", 1.0),
])
def test_various_numbers(solution, ground_truth, expected):
    """Test various number formats"""
    score = numbers_match_reward(solution, ground_truth)
    assert score == expected
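The tests above imply a contract for `numbers_match_reward`: pull the number that follows `####`, compare it to the ground truth within a small tolerance, and return 0.0 for anything that cannot be parsed. The following is a minimal sketch of a function satisfying that contract. The name `numbers_match_reward_sketch`, the regular expression, and the `epsilon` default are assumptions made for illustration; the real implementation in `reward_fn/compute_reward.py` may differ.

```python
import math
import re

# Hypothetical pattern: an optionally signed integer or decimal after "####".
_ANSWER_RE = re.compile(r"####\s*(-?\d+(?:\.\d+)?)")


def numbers_match_reward_sketch(
    solution_str: str, ground_truth: str, epsilon: float = 1e-6
) -> float:
    """Return 1.0 if the number after '####' is within epsilon of ground_truth, else 0.0."""
    match = _ANSWER_RE.search(solution_str or "")
    if match is None:
        # No "#### <number>" marker: treat as an invalid format.
        return 0.0
    try:
        predicted = float(match.group(1))
        expected = float(ground_truth)
    except (TypeError, ValueError):
        # Unparseable ground truth also scores zero instead of raising.
        return 0.0
    return 1.0 if math.isclose(predicted, expected, abs_tol=epsilon) else 0.0
```

Using an absolute tolerance keeps the comparison meaningful when the expected value is 0, where a purely relative tolerance would never match.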

CI/CD Integration

GitHub Actions workflow

Create .github/workflows/test.yml:

name: Test and Validate

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e .
          pip install pytest pytest-cov

      - name: Run tests
        run: |
          pytest tests/ -v --cov=. --cov-report=term-missing

      - name: Lint code
        run: |
          pip install ruff
          ruff check .

      - name: Type check
        run: |
          pip install mypy
          mypy mcp/ reward_fn/ reward_rubric/

      - name: Test MCP server
        run: |
          python mcp/main.py &
          sleep 5
          python mcp/test/test.py
          pkill -f "python mcp/main.py"
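The fixed `sleep 5` in the "Test MCP server" step can be too short on a slow runner and wastes time on a fast one. One possible alternative is to poll until the server accepts connections. The sketch below assumes the MCP server listens on a TCP port; the helper name `wait_for_port` and its parameters are hypothetical, and the actual host and port depend on how `mcp/main.py` is configured.

```python
import socket
import time


def wait_for_port(
    host: str, port: int, timeout: float = 30.0, interval: float = 0.2
) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the server is accepting connections.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not ready yet (refused or timed out): wait and retry.
            time.sleep(interval)
    return False
```

A test script could call this before sending its first request and exit with an error when it returns False, which makes a server that failed to start show up as a clear failure.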

Monitoring and Debugging

Adding logging

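import logging

# Import path assumed from the osmosis-ai SDK; adjust it if your setup differs
from osmosis_ai import osmosis_reward

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@osmosis_reward
def logged_reward(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    """Reward function with logging"""
    # Log only a prefix so long solutions do not flood the logs
    logger.info("Evaluating solution: %s...", solution_str[:50])

    try:
        # compute_score stands in for your own scoring logic
        score = compute_score(solution_str, ground_truth)
        logger.info("Computed score: %s", score)
        return score
    except Exception:
        # logger.exception records the traceback along with the message
        logger.exception("Error computing score")
        return 0.0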

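Troubleshooting

Reward function issues

Problem: The reward function returns unexpected scores.
Solution:

Test locally with sample inputs
Add print statements or logging
Verify that the input format matches what the function expects
Check that error handling covers all edge cases
Make sure the return type is float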
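The last two checks in the list above (edge-case handling and a float return type) can be enforced in one place. The sketch below is one way to do that, not part of the Osmosis SDK: `ensure_float_score` is a hypothetical decorator, and the clamp to the range 0.0 to 1.0 is an assumption based on the scores used in the unit tests. It is meant for a plain scoring helper rather than for the function decorated with `@osmosis_reward`.

```python
import math
from functools import wraps
from typing import Callable


def ensure_float_score(fn: Callable[..., object]) -> Callable[..., float]:
    """Wrap a scoring helper so it always returns a finite float in [0.0, 1.0]."""

    @wraps(fn)
    def wrapper(*args, **kwargs) -> float:
        try:
            # Coerce ints, numeric strings, etc. to float.
            score = float(fn(*args, **kwargs))
        except (TypeError, ValueError):
            # None or other non-numeric results score zero.
            return 0.0
        if math.isnan(score) or math.isinf(score):
            return 0.0
        # Clamp to the assumed valid score range.
        return min(1.0, max(0.0, score))

    return wrapper
```

Other exception types are deliberately left to propagate so that genuine bugs in the scoring logic stay visible during local testing.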

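Rubric evaluation issues

Problem: Rubric scores are inconsistent, or evaluation raises errors.
Solution:

Verify that the API key is set correctly
Check that the API key has sufficient credits/quota
Test with a simpler rubric first
Add error handling around evaluate_rubric calls
Use return_details=True to inspect the evaluation reasoning
Verify that the model name is valid for the provider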
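Two of the steps above, checking the API key and wrapping the evaluation call in error handling, can be combined in a small helper. The sketch below does not assume the exact signature of `evaluate_rubric`: it takes the evaluator as a parameter and forwards whatever keyword arguments you already pass. The helper name `safe_rubric_score`, the `api_key_env` parameter, and the `fallback` value are all hypothetical.

```python
import logging
import os
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


def safe_rubric_score(
    evaluate: Callable[..., Any],
    api_key_env: str,
    fallback: Optional[float] = None,
    **rubric_kwargs: Any,
) -> Any:
    """Call evaluate(**rubric_kwargs) after a key check, returning fallback on failure."""
    if not os.environ.get(api_key_env):
        # Fail fast with a clear message instead of an opaque provider error.
        logger.error("Environment variable %s is not set", api_key_env)
        return fallback
    try:
        return evaluate(**rubric_kwargs)
    except Exception:
        # Log the traceback so quota, auth, and model-name errors are visible.
        logger.exception("Rubric evaluation failed")
        return fallback
```

In practice you would pass `evaluate_rubric` as `evaluate`, the name of the environment variable that holds your provider key as `api_key_env`, and your usual arguments as keywords, adding `return_details=True` when you want to inspect the reasoning.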

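Import errors

Problem: ModuleNotFoundError or failed imports.
Solution:

Make sure every directory has an __init__.py file
Verify that imports use the correct paths
Check that dependencies are installed: pip install -e .
Use absolute imports from the package root
Verify that the virtual environment is activated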
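The first check in the list above can be automated. The sketch below walks a repository and reports directories that contain Python files but no `__init__.py`. The function name `find_missing_init` is hypothetical, and skipping hidden directories and `__pycache__` is an assumption about what you want to ignore.

```python
from pathlib import Path


def find_missing_init(root: str) -> list[str]:
    """List directories under root that contain .py files but no __init__.py."""
    missing = []
    for directory in sorted(p for p in Path(root).rglob("*") if p.is_dir()):
        parts = directory.relative_to(root).parts
        # Skip hidden directories (such as .venv or .git) and bytecode caches.
        if any(part.startswith(".") or part == "__pycache__" for part in parts):
            continue
        has_python = any(directory.glob("*.py"))
        if has_python and not (directory / "__init__.py").exists():
            missing.append(directory.relative_to(root).as_posix())
    return missing
```

Treat the result as a list of hints: some directories, such as a `tests/` folder, may intentionally have no `__init__.py`.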

Next steps

Example repository

Python SDK

Local Rollout