Agent Tool-Calling Instruction Formats

Agent tool-calling datasets are designed for training language models to effectively reason about and use external tools. Unlike traditional conversation datasets, agent tool-calling formats capture the complete reasoning process: why tools are selected, how parameters are constructed, and what results mean in context.

Why Agent Tool-Calling Datasets?

Advanced Reasoning with Tools

Traditional tool-calling datasets often show only the final function calls, without the reasoning that led to them. Agent chain-of-thought (CoT) datasets teach models to:

- Analyze user intent and identify what information or actions are needed
- Select appropriate tools from the available options, with clear reasoning
- Construct parameters systematically from user input and context
- Interpret tool results and synthesize them into helpful responses
- Chain multiple tools together for complex multi-step tasks

Enhanced Model Training for MCP Integration

Models trained on agent tool-calling data are better prepared for:

- Model Context Protocol (MCP) server integration
- Function-calling APIs across different providers
- Multi-step workflows that require tool chaining
- Parameter validation and error handling
- Tool selection from large tool catalogs

Structured vs Natural Language Approaches

Agent tool-calling datasets use structured generation with Pydantic schemas rather than relying solely on natural language prompts. This provides:

- Type-safe tool definitions with automatic validation
- Consistent parameter formats across different tools
- Reusable tool libraries that can be shared across projects
- Better training efficiency through structured reasoning traces
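To make the idea of a type-safe, reusable tool definition concrete, here is a minimal sketch. It uses the standard library's dataclasses as a stand-in for Pydantic so that it runs without extra dependencies. `ToolParameter`, `ToolDefinition`, and `validate` are hypothetical names chosen for illustration; they are not DeepFabric's actual schema classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolParameter:
    """One named, typed parameter of a tool (hypothetical schema)."""
    name: str
    py_type: type
    required: bool = True


@dataclass(frozen=True)
class ToolDefinition:
    """A reusable tool description that can check call arguments."""
    name: str
    description: str
    parameters: tuple = ()

    def validate(self, arguments: dict) -> list:
        """Return a list of problems; an empty list means the call is valid."""
        problems = []
        known = {param.name for param in self.parameters}
        for param in self.parameters:
            if param.name not in arguments:
                if param.required:
                    problems.append(f"missing required parameter: {param.name}")
            elif not isinstance(arguments[param.name], param.py_type):
                problems.append(f"{param.name} should be {param.py_type.__name__}")
        # Arguments the tool does not declare are reported too.
        for extra in sorted(set(arguments) - known):
            problems.append(f"unknown parameter: {extra}")
        return problems


# The same definition can be shared by every dataset that uses this tool.
get_weather = ToolDefinition(
    name="get_weather",
    description="Get current weather for a location",
    parameters=(ToolParameter("location", str),),
)

print(get_weather.validate({"location": "Paris"}))
print(get_weather.validate({"city": "Paris"}))
```

With Pydantic, the field declarations would carry the type checks and the manual `validate` method would largely disappear; the sketch only shows why a schema catches malformed calls that a free-text prompt would let through.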

Agent CoT Format Types

DeepFabric supports two specialized agent tool-calling formats:

1. Single-Turn Agent CoT (`agent_cot_tools`)

Best for: Training models to handle complete tasks with tool usage in a single interaction.

{
  "question": "What's the weather like in Paris and how does it compare to New York?",
  "available_tools": [...],
  "initial_analysis": "I need to get weather data for two cities and compare them.",
  "tool_planning": [
    {
      "step_number": 1,
      "reasoning": "Need current weather for Paris",
      "selected_tool": {...},
      "parameter_reasoning": {"location": "Paris specified by user"},
      "expected_result": "Current weather conditions in Paris"
    },
    {
      "step_number": 2,
      "reasoning": "Need current weather for New York to compare",
      "selected_tool": {...},
      "parameter_reasoning": {"location": "New York for comparison"},
      "expected_result": "Current weather conditions in New York"
    }
  ],
  "tool_executions": [
    {
      "function": "get_weather",
      "arguments": {"location": "Paris"},
      "reasoning": "Getting Paris weather as requested",
      "result": "Paris: 18°C, partly cloudy, 60% humidity"
    },
    {
      "function": "get_weather",
      "arguments": {"location": "New York"},
      "reasoning": "Getting NYC weather for comparison",
      "result": "New York: 22°C, sunny, 45% humidity"
    }
  ],
  "result_synthesis": "Compared both cities' weather data to provide comprehensive answer",
  "final_answer": "Paris is currently 18°C and partly cloudy with 60% humidity, while New York is warmer at 22°C with sunny skies and lower humidity at 45%. New York has better weather today."
}
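A record like the one above can be sanity-checked before training. The following is a rough sketch, not part of DeepFabric: `check_single_turn_record` is a hypothetical helper, and the rule that every planned step has exactly one execution is an assumption read off the sample rather than a documented guarantee. The `sample` dict is a trimmed copy of the record with the elided fields left out.

```python
REQUIRED_EXECUTION_KEYS = {"function", "arguments", "reasoning", "result"}


def check_single_turn_record(record: dict) -> list:
    """Return a list of structural problems found in one record."""
    problems = []
    planning = record.get("tool_planning", [])
    executions = record.get("tool_executions", [])

    # Planning steps are numbered 1, 2, ... in the sample.
    step_numbers = [step.get("step_number") for step in planning]
    if step_numbers != list(range(1, len(planning) + 1)):
        problems.append(f"step numbers are not consecutive from 1: {step_numbers}")

    # Assumption: one execution per planned step, as in the sample.
    if len(planning) != len(executions):
        problems.append(
            f"{len(planning)} planned steps but {len(executions)} executions"
        )

    for index, execution in enumerate(executions, start=1):
        missing = REQUIRED_EXECUTION_KEYS - set(execution)
        if missing:
            problems.append(f"execution {index} is missing: {sorted(missing)}")

    if not record.get("final_answer"):
        problems.append("final_answer is empty")
    return problems


sample = {
    "question": "What's the weather like in Paris and how does it compare to New York?",
    "tool_planning": [
        {"step_number": 1, "reasoning": "Need current weather for Paris"},
        {"step_number": 2, "reasoning": "Need current weather for New York to compare"},
    ],
    "tool_executions": [
        {
            "function": "get_weather",
            "arguments": {"location": "Paris"},
            "reasoning": "Getting Paris weather as requested",
            "result": "Paris: 18°C, partly cloudy, 60% humidity",
        },
        {
            "function": "get_weather",
            "arguments": {"location": "New York"},
            "reasoning": "Getting NYC weather for comparison",
            "result": "New York: 22°C, sunny, 45% humidity",
        },
    ],
    "final_answer": "Paris is 18°C and partly cloudy; New York is 22°C and sunny.",
}

print(check_single_turn_record(sample))
```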

2. Multi-Turn Agent CoT (`agent_cot_multi_turn`)

Best for: Training conversational agents that maintain context and reasoning across multiple exchanges.

{
  "messages": [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "content": "Let me check the current weather in Paris for you..."},
    {"role": "user", "content": "How about New York?"},
    {"role": "assistant", "content": "I'll get the New York weather to compare with Paris..."},
    {"role": "user", "content": "Which city is better for outdoor activities today?"}
  ],
  "tool_planning_trace": [...],
  "tool_execution_trace": [...],
  "reasoning_summary": "Progressive weather comparison leading to activity recommendation"
}
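In the sample above the `messages` list alternates strictly between `user` and `assistant`, starting with `user`. The sketch below checks that pattern. Treat the alternation rule as an assumption drawn from this one example: real records may include other roles (for instance a system message), and `roles_alternate` is a hypothetical helper, not a DeepFabric function.

```python
def roles_alternate(messages: list) -> bool:
    """Return True if roles go user, assistant, user, ... from the first message."""
    expected = "user"
    for message in messages:
        if message.get("role") != expected:
            return False
        # Flip the expected role for the next message.
        expected = "assistant" if expected == "user" else "user"
    return True


messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "content": "Let me check the current weather in Paris for you..."},
    {"role": "user", "content": "How about New York?"},
    {"role": "assistant", "content": "I'll get the New York weather to compare with Paris..."},
    {"role": "user", "content": "Which city is better for outdoor activities today?"},
]

print(roles_alternate(messages))
```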

Key Differences from Traditional CoT

| Aspect | Traditional CoT | Agent Tool-Calling CoT |
|---|---|---|
| Focus | Problem reasoning | Tool reasoning + problem solving |
| Structure | Natural language steps | Structured tool planning + execution |
| Tools | Hardcoded in prompts | Dynamic, user-definable schemas |
| Validation | Text-based | Type-safe with Pydantic |
| Reusability | Low (prompt-specific) | High (schema-based) |
| Training Efficiency | Moderate | Higher (structured traces) |

Quick Start Guide

1. Choose Your Agent Format

| Format | Use Case | Best For |
|---|---|---|
| Single-Turn | Complete task resolution | Tool reasoning, parameter construction |
| Multi-Turn | Conversational tool usage | Context maintenance, progressive reasoning |

2. Basic Configuration

# agent-tool-calling.yaml
dataset_system_prompt: "You are an AI assistant with access to various tools. Always explain your reasoning when selecting and using tools."

topic_tree:
  topic_prompt: "Real-world scenarios requiring tool usage"
  provider: "openai"
  model: "gpt-4o-mini"
  degree: 3
  depth: 2

data_engine:
  generation_system_prompt: "You excel at reasoning about tool selection and parameter construction."
  provider: "openai"
  model: "gpt-4o-mini"
  conversation_type: "agent_cot_tools"  # or agent_cot_multi_turn
  available_tools: ["get_weather", "search_web", "calculator"]
  max_tools_per_query: 2

dataset:
  creation:
    num_steps: 10
    batch_size: 2
  save_as: "agent_tool_dataset.jsonl"
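Before running a generation job it can help to check the `data_engine` values for obvious mismatches. The sketch below mirrors the YAML above as a plain dict so it needs no YAML parser. `check_data_engine` and `estimated_sample_count` are hypothetical helpers; the checks are heuristics rather than rules enforced by DeepFabric, and the sample-count formula assumes that each step produces one full batch.

```python
VALID_CONVERSATION_TYPES = {"agent_cot_tools", "agent_cot_multi_turn"}


def check_data_engine(data_engine: dict) -> list:
    """Return a list of likely configuration problems."""
    problems = []
    conversation_type = data_engine.get("conversation_type")
    if conversation_type not in VALID_CONVERSATION_TYPES:
        problems.append(f"unsupported conversation_type: {conversation_type!r}")

    tools = data_engine.get("available_tools", [])
    max_tools = data_engine.get("max_tools_per_query", 0)
    if max_tools < 1:
        problems.append("max_tools_per_query should be at least 1")
    elif max_tools > len(tools):
        # Heuristic: a query cannot use more distinct tools than are listed.
        problems.append(
            f"max_tools_per_query is {max_tools} but only {len(tools)} tools are listed"
        )
    return problems


def estimated_sample_count(num_steps: int, batch_size: int) -> int:
    """Assumption: every step yields one full batch of samples."""
    return num_steps * batch_size


data_engine = {
    "conversation_type": "agent_cot_tools",
    "available_tools": ["get_weather", "search_web", "calculator"],
    "max_tools_per_query": 2,
}

print(check_data_engine(data_engine))
print(estimated_sample_count(num_steps=10, batch_size=2))
```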

3. Generate Your Dataset

# Using CLI
deepfabric start agent-tool-calling.yaml

# Using Python
python examples/agent_tool_calling.py
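Once generation finishes, the output file can be inspected with a few lines of standard-library Python. This is a sketch under two assumptions: the file is JSON Lines (one JSON object per line, as the `.jsonl` extension in the configuration suggests), and single-turn records carry a `tool_executions` list like the sample shown earlier. `load_jsonl` and `tool_usage` are hypothetical helper names.

```python
import json
from collections import Counter
from pathlib import Path


def load_jsonl(path) -> list:
    """Read one JSON object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def tool_usage(records: list) -> Counter:
    """Count how often each function name appears in tool_executions."""
    counts = Counter()
    for record in records:
        for execution in record.get("tool_executions", []):
            counts[execution.get("function", "<unknown>")] += 1
    return counts


if __name__ == "__main__":
    dataset_path = Path("agent_tool_dataset.jsonl")
    if dataset_path.exists():
        records = load_jsonl(dataset_path)
        print(f"{len(records)} records")
        for name, count in tool_usage(records).most_common():
            print(f"{name}: {count}")
```

A heavily skewed tool count is a quick signal that the topic prompt or tool list may need adjusting before a larger run.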

What's Next?

Format Deep Dives

- Single-Turn Agent CoT: Complete task resolution with tools
- Multi-Turn Agent CoT: Conversational tool usage

Tutorials

- Getting Started: First agent tool-calling dataset

Configuration

- YAML Configuration: Agent-specific parameters
- Python API: Programmatic usage

Agent tool-calling datasets bridge the gap between reasoning and action, creating training data that teaches models not just what to do, but how to think about doing it.