Multi-Turn Agent CoT (`agent_cot_multi_turn`)¶

Multi-turn agent Chain of Thought datasets train models to maintain context and reasoning across multiple conversation exchanges while using tools progressively. This format captures how intelligent agents build upon previous interactions and adapt their tool usage based on evolving user needs.

When to Use Multi-Turn Agent CoT¶

Multi-turn agent CoT is ideal for: - Conversational tool usage where context from previous turns influences current decisions - Progressive information gathering where each user response adds constraints or details - Complex workflows that naturally span multiple interactions - Context-aware assistance where the agent remembers previous tool results

Key Differences from Single-Turn¶

Aspect	Single-Turn	Multi-Turn
Scope	Complete task in one response	Evolving task across conversation
Context	Self-contained	Builds on previous exchanges
Tool Usage	Planned upfront	Adapts based on user feedback
Complexity	Linear reasoning	Non-linear, responsive reasoning

Format Structure¶

The AgentCoTMultiTurn schema captures the complete conversational flow:

{
  "messages": [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
  ],
  "tool_planning_trace": [
    {reasoning steps across all turns}
  ],
  "tool_execution_trace": [
    {all tool executions throughout conversation}
  ],
  "reasoning_summary": "Overall strategy across the conversation"
}

Detailed Schema Breakdown¶

`messages` (list)¶

Complete conversation history showing natural flow between user and assistant.

Example:

"messages": [
  {"role": "user", "content": "What's the weather like today?"},
  {"role": "assistant", "content": "I'd be happy to check the weather for you! However, I need to know your location first. Which city would you like me to check?"},
  {"role": "user", "content": "I'm in San Francisco"},
  {"role": "assistant", "content": "Let me get the current weather for San Francisco..."},
  {"role": "user", "content": "Is it good for a picnic?"},
  {"role": "assistant", "content": "Based on the weather I just retrieved, let me give you a recommendation for picnic conditions..."}
]

`tool_planning_trace` (list)¶

Reasoning steps that span across conversation turns, showing how planning evolves:

"tool_planning_trace": [
  {
    "step_number": 1,
    "reasoning": "User wants weather but didn't specify location - need to ask for clarification",
    "selected_tool": null,
    "parameter_reasoning": {},
    "expected_result": "User will provide location information"
  },
  {
    "step_number": 2,
    "reasoning": "Now have location (San Francisco) - can get weather data",
    "selected_tool": {weather_tool_definition},
    "parameter_reasoning": {"location": "San Francisco provided by user in second message"},
    "expected_result": "Current weather conditions for San Francisco"
  },
  {
    "step_number": 3,
    "reasoning": "User asking about picnic suitability - need to interpret weather data for outdoor activity",
    "selected_tool": null,
    "parameter_reasoning": {},
    "expected_result": "Analysis of weather suitability for picnic activities"
  }
]

`tool_execution_trace` (list)¶

All tool calls made throughout the conversation with their context:

"tool_execution_trace": [
  {
    "function": "get_weather",
    "arguments": {"location": "San Francisco"},
    "reasoning": "Getting current weather for San Francisco as requested",
    "result": "San Francisco: 22°C, sunny, light breeze from west, 0% chance of rain",
    "turn_number": 2
  }
]

`reasoning_summary` (string)¶

High-level description of the reasoning strategy across the entire conversation:

Example: - "Progressive information gathering: started with incomplete query, gathered location through conversation, retrieved weather data, then interpreted results for specific use case (picnic suitability). Demonstrates context maintenance and adaptive tool usage."

Multi-Turn Patterns¶

Progressive Information Gathering¶

{
  "messages": [
    {"role": "user", "content": "Book me a restaurant reservation"},
    {"role": "assistant", "content": "I'd be happy to help with that! I'll need a few details: What type of cuisine do you prefer, how many people, what date and time, and which city?"},
    {"role": "user", "content": "Italian food for 2 people tomorrow at 7 PM in Boston"},
    {"role": "assistant", "content": "Perfect! Let me search for available Italian restaurants in Boston for tomorrow evening..."},
    {"role": "user", "content": "Something upscale but not too expensive"},
    {"role": "assistant", "content": "I'll refine the search to focus on mid-to-upscale Italian restaurants..."}
  ],
  "tool_planning_trace": [
    {
      "step_number": 1,
      "reasoning": "User wants restaurant reservation but provided incomplete information - need cuisine, party size, date/time, location",
      "selected_tool": null
    },
    {
      "step_number": 2,
      "reasoning": "Now have all basic info - can search for Italian restaurants in Boston",
      "selected_tool": {search_tool},
      "parameter_reasoning": {"query": "Italian restaurants Boston reservations tomorrow 7 PM"}
    },
    {
      "step_number": 3,
      "reasoning": "User added preference for upscale but not expensive - need to filter results",
      "selected_tool": {search_tool},
      "parameter_reasoning": {"query": "mid-range upscale Italian restaurants Boston", "price_range": "$$-$$$"}
    }
  ]
}

Context-Dependent Tool Usage¶

{
  "messages": [
    {"role": "user", "content": "What's 15% of 247?"},
    {"role": "assistant", "content": "Let me calculate that for you..."},
    {"role": "user", "content": "And what about 20%?"},
    {"role": "assistant", "content": "I'll calculate 20% of 247 as well..."},
    {"role": "user", "content": "Which one should I use for a tip?"}
  ],
  "tool_planning_trace": [
    {
      "step_number": 1,
      "reasoning": "Direct calculation request for 15% of 247",
      "selected_tool": {calculator_tool}
    },
    {
      "step_number": 2,
      "reasoning": "Follow-up calculation for 20% of same number (247) - context maintained",
      "selected_tool": {calculator_tool}
    },
    {
      "step_number": 3,
      "reasoning": "User asking for tip advice - need to provide guidance based on previous calculations",
      "selected_tool": null,
      "expected_result": "Tip recommendation based on context and social norms"
    }
  ]
}

{
  "messages": [
    {"role": "user", "content": "Find flights to Europe"},
    {"role": "assistant", "content": "I'd be happy to help find flights to Europe! Which specific city or country are you interested in, and what are your departure details?"},
    {"role": "user", "content": "Paris, leaving from New York next month"},
    {"role": "assistant", "content": "Let me search for flights from New York to Paris for next month..."},
    {"role": "user", "content": "Actually, I prefer direct flights under $800"},
    {"role": "assistant", "content": "I'll refine the search to show only direct flights under $800..."}
  ]
}

Configuration Example¶

YAML Configuration¶

dataset_system_prompt: "You are a conversational AI assistant that maintains context and uses tools progressively across multiple turns."

data_engine:
  generation_system_prompt: "Generate realistic multi-turn conversations showing progressive tool usage and context maintenance."
  provider: "openai"
  model: "gpt-4o-mini"
  conversation_type: "agent_cot_multi_turn"
  available_tools:
    - "get_weather"
    - "search_web"
    - "book_restaurant"
    - "calculator"
  max_tools_per_query: 2

dataset:
  creation:
    num_steps: 15
    batch_size: 3
    sys_msg: true  # Multi-turn supports system messages
  save_as: "multi_turn_agent.jsonl"

Python API¶

from deepfabric import DataSetGenerator

engine = DataSetGenerator(
    generation_system_prompt="Create multi-turn conversations with progressive tool usage.",
    provider="openai",
    model_name="gpt-4o-mini",
    conversation_type="agent_cot_multi_turn",
    available_tools=["get_weather", "search_web", "book_restaurant"],
    max_tools_per_query=2,
)

dataset = engine.create_data(
    num_steps=10,
    batch_size=2,
    sys_msg=True
)

Best Practices¶

Context Maintenance¶

Reference previous exchanges in reasoning steps
Build upon prior information rather than starting fresh
Acknowledge user clarifications explicitly

Progressive Tool Usage¶

Start with available information and identify gaps
Use follow-up questions to gather missing details
Adapt tool selection based on new user input

Natural Conversation Flow¶

Maintain conversational tone throughout exchanges
Ask clarifying questions when information is incomplete
Provide interim updates during multi-step processes

Training Benefits¶

Models trained on multi-turn agent CoT data demonstrate: - Better context retention across conversation turns - More natural information gathering through clarifying questions - Adaptive tool usage based on evolving user needs - Improved conversation management for complex workflows - Enhanced user experience through progressive assistance

Common Conversation Patterns¶

Information Gathering → Tool Usage → Follow-up¶

User makes vague request
Assistant asks for specifics
User provides details
Assistant uses tools with complete information
User asks follow-up questions
Assistant adapts based on tool results and context

Iterative Refinement¶

User makes broad request
Assistant provides initial results using tools
User refines requirements
Assistant adjusts tool usage and filters results
Process continues until user is satisfied

Context-Building Workflows¶

Each turn adds information to the context
Tool usage becomes more targeted with each exchange
Final responses synthesize all previous interactions
Complex workflows completed through natural conversation

This format creates training data for agents that can engage in sophisticated, context-aware conversations while using tools intelligently and progressively.

Multi-Turn Agent CoT (agent_cot_multi_turn)¶

When to Use Multi-Turn Agent CoT¶

Key Differences from Single-Turn¶

Format Structure¶

Detailed Schema Breakdown¶

messages (list)¶

tool_planning_trace (list)¶

tool_execution_trace (list)¶

reasoning_summary (string)¶

Multi-Turn Patterns¶

Progressive Information Gathering¶

Context-Dependent Tool Usage¶

Iterative Refinement¶

Configuration Example¶

YAML Configuration¶

Python API¶

Best Practices¶

Context Maintenance¶

Progressive Tool Usage¶

Natural Conversation Flow¶

Training Benefits¶

Common Conversation Patterns¶

Information Gathering → Tool Usage → Follow-up¶

Iterative Refinement¶

Context-Building Workflows¶

Multi-Turn Agent CoT (`agent_cot_multi_turn`)¶

`messages` (list)¶

`tool_planning_trace` (list)¶

`tool_execution_trace` (list)¶

`reasoning_summary` (string)¶