Custom Graders
Write your own grading functions to implement custom evaluation logic.
Basic Structure
```python
from letta_evals.decorators import grader
from letta_evals.models import GradeResult, Sample


@grader
def my_custom_grader(sample: Sample, submission: str) -> GradeResult:
    """Custom grading logic."""
    # Your evaluation logic (calculate_score is a placeholder for your own function)
    score = calculate_score(submission, sample.ground_truth)

    # Ensure score is between 0.0 and 1.0
    score = max(0.0, min(1.0, score))

    return GradeResult(
        score=score,
        rationale=f"Score based on custom logic: {score}"
    )
```

Example: JSON Validation
```python
import json

from letta_evals.decorators import grader
from letta_evals.models import GradeResult, Sample


@grader
def valid_json(sample: Sample, submission: str) -> GradeResult:
    """Check if submission is valid JSON."""
    try:
        json.loads(submission)
        return GradeResult(score=1.0, rationale="Valid JSON")
    except json.JSONDecodeError as e:
        return GradeResult(score=0.0, rationale=f"Invalid JSON: {e}")
```

Registration
Custom graders are automatically registered when you import them in your suite’s setup script or custom evaluators file.
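
The import requirement makes sense if the `@grader` decorator records each function at definition time. How letta_evals does this internally is not shown here, so the following is only a generic sketch of decorator-based registration in Python. `GRADER_REGISTRY`, `register_grader`, and `always_pass` are hypothetical names chosen for illustration and are not part of the library.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a function's name to the function itself.
GRADER_REGISTRY: Dict[str, Callable[..., object]] = {}


def register_grader(func: Callable[..., object]) -> Callable[..., object]:
    """Record func under its own name and return it unchanged."""
    GRADER_REGISTRY[func.__name__] = func
    return func


@register_grader
def always_pass(sample: object, submission: str) -> float:
    """Toy grader that gives every submission full marks."""
    return 1.0


# The decorator ran when the `def` statement above executed, so a lookup
# by name (as a config entry might perform) now succeeds.
looked_up = GRADER_REGISTRY["always_pass"]
```

If the library works along these lines, a module that is never imported never runs its decorators, and a function name referenced in the configuration would not resolve.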
Configuration
```yaml
graders:
  my_metric:
    kind: tool
    function: my_custom_grader  # Your function name
    extractor: last_assistant
```

Next Steps
- Tool Graders - Built-in grading functions
- Graders Concept - Understanding graders
- Example Custom Graders - See examples in the letta-evals repo