Multi-Turn Conversations
Multi-turn conversations allow you to test how agents handle context across multiple exchanges.
Why Use Multi-Turn?
Section titled “Why Use Multi-Turn?”Multi-turn conversations enable testing that single-turn prompts cannot:
- Memory storage: Verify agents persist information to memory blocks
- Tool call sequences: Test multi-step workflows
- Context retention: Ensure agents remember details from earlier
- State evolution: Track how agent state changes across interactions
- Conversational coherence: Test if agents maintain context appropriately
Format
Section titled “Format”Single-Turn (Default)
Section titled “Single-Turn (Default)”{ "input": "What is the capital of France?", "ground_truth": "Paris"}Multi-Turn
Section titled “Multi-Turn”{ "input": [ "My name is Alice", "What's my name?" ], "ground_truth": "Alice"}The agent processes each input in sequence, with state carrying over between turns.
Example 1: Memory Recall Testing
Section titled “Example 1: Memory Recall Testing”Test if the agent remembers information across turns:
{ "input": [ "Remember that my favorite color is blue", "What's my favorite color?" ], "ground_truth": "blue"}Suite configuration:
graders: response_check: kind: tool function: contains extractor: last_assistant # Check the agent's responseExample 2: Memory Correction Testing
Section titled “Example 2: Memory Correction Testing”Test if the agent correctly updates memory when users correct themselves:
{ "input": [ "Please remember that I like bananas.", "Actually, sorry, I meant I like apples." ], "ground_truth": "apples"}Suite configuration:
graders: memory_check: kind: tool function: contains extractor: memory_block extractor_config: block_label: human # Check the actual memory block, not just the responseWhen to Test Memory Blocks vs. Responses
Section titled “When to Test Memory Blocks vs. Responses”Use last_assistant or all_assistant extractors when:
- Testing what the agent says in conversation
- Verifying response content and phrasing
- Checking conversational coherence
Use memory_block extractor when:
- Verifying information was actually stored in memory
- Testing memory updates and corrections
- Validating persistent state changes
- Ensuring the agent’s internal state is correct
See the multiturn-memory-block-extractor example for a complete working implementation.
Next Steps
Section titled “Next Steps”- Datasets - Creating test datasets
- Extractors - Extracting from trajectories
- Targets - Agent lifecycle and testing behavior