Agent Tool-Calling Examples

This section provides complete, runnable examples for generating agent tool-calling datasets. These examples demonstrate both programmatic and configuration-driven approaches to creating training data that teaches models systematic tool usage and reasoning.

Overview

Agent tool-calling datasets train models to:

  • Reason systematically about tool selection
  • Construct parameters accurately from user input
  • Explain their thinking at each step
  • Synthesize results into helpful responses
  • Maintain context across conversation turns

Quick Start Example

Basic Agent Tool-Calling

The simplest way to get started with agent tool-calling datasets:

Configuration File (agent-quickstart.yaml):

dataset_system_prompt: "You are an AI assistant that explains your reasoning when using tools."

topic_tree:
  topic_prompt: "Daily tasks requiring tool usage: weather, calculations, searches"
  provider: "openai"
  model: "gpt-4o-mini"
  depth: 2
  degree: 3

data_engine:
  generation_system_prompt: "Focus on WHY tools are selected and HOW parameters are constructed."
  provider: "openai"
  model: "gpt-4o-mini"
  conversation_type: "agent_cot_tools"
  available_tools: ["get_weather", "calculator", "search_web"]
  max_tools_per_query: 2

dataset:
  creation:
    num_steps: 10
    batch_size: 2
  save_as: "agent_quickstart.jsonl"

Generate the dataset:

deepfabric start agent-quickstart.yaml

Expected output structure:

{
  "question": "What's the weather in Tokyo and what's 20% of the temperature?",
  "available_tools": [...],
  "initial_analysis": "User wants weather data for Tokyo and a percentage calculation.",
  "tool_planning": [
    {
      "step_number": 1,
      "reasoning": "Need current weather for Tokyo to get temperature",
      "selected_tool": {...},
      "parameter_reasoning": {"location": "Tokyo specified by user"},
      "expected_result": "Current weather conditions including temperature"
    }
  ],
  "tool_executions": [...],
  "result_synthesis": "Combined weather data with percentage calculation",
  "final_answer": "Tokyo is currently 25°C with sunny skies. 20% of 25°C is 5°C."
}
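
Once the run completes, you can load the raw JSONL and spot-check a few samples; a minimal sketch using only the standard library, assuming the output file and field names shown above:

import json

# Read the generated dataset (one JSON object per line)
with open("agent_quickstart.jsonl") as f:
    samples = [json.loads(line) for line in f]

for sample in samples[:3]:
    print(sample["question"])
    print(f"  planning steps: {len(sample.get('tool_planning', []))}")
    print(f"  final answer:   {sample.get('final_answer', '')[:80]}")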

Complete Examples

1. Professional Scenarios with Custom Tools

Use case: Business productivity scenarios with booking and analysis tools.

Custom tools definition (business_tools.yaml):

tools:
  - name: "book_meeting_room"
    description: "Reserve a meeting room"
    parameters:
      - name: "room"
        type: "str"
        description: "Room name or number"
        required: true
      - name: "duration"
        type: "int"
        description: "Duration in minutes"
        required: true
      - name: "date"
        type: "str"
        description: "Date in YYYY-MM-DD format"
        required: true
    returns: "Meeting room reservation confirmation"
    category: "productivity"

  - name: "analyze_sales_data"
    description: "Analyze sales performance metrics"
    parameters:
      - name: "period"
        type: "str"
        description: "Time period (week, month, quarter)"
        required: true
      - name: "region"
        type: "str"
        description: "Geographic region"
        required: false
        default: "all"
    returns: "Sales analysis report with key metrics"
    category: "analytics"

Configuration (business_agent.yaml):

dataset_system_prompt: |
  You are a business productivity AI assistant with access to meeting, scheduling,
  and analytics tools. Always explain your reasoning when selecting tools.

topic_tree:
  topic_prompt: "Business productivity scenarios: meeting coordination, sales analysis, resource planning"
  provider: "openai"
  model: "gpt-4o-mini"
  temperature: 0.7
  depth: 3
  degree: 3

data_engine:
  generation_system_prompt: |
    Create realistic business scenarios requiring systematic tool usage.
    Focus on professional workflows and decision-making processes.

  provider: "openai"
  model: "gpt-4o"
  temperature: 0.8
  conversation_type: "agent_cot_tools"

  # Mix default and custom tools
  available_tools:
    - "get_weather"
    - "calculator"
    - "book_meeting_room"
    - "analyze_sales_data"

  tool_registry_path: "business_tools.yaml"
  max_tools_per_query: 3

dataset:
  creation:
    num_steps: 25
    batch_size: 5
    sys_msg: false
  save_as: "business_agent_dataset.jsonl"

  # Apply formatters for different training formats
  formatters:
    - name: "tool_calling"
      template: "builtin://tool_calling"
      output: "business_agent_formatted.jsonl"
      config:
        system_prompt: "You are a business productivity function calling AI."
        include_tools_in_system: true

Generate:

deepfabric start business_agent.yaml
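
To confirm the custom business tools are actually being exercised, you can tally tool usage across the raw dataset; a minimal sketch, assuming each entry in tool_executions carries a "function" field (the same field the validation helper in example 3 relies on):

import json
from collections import Counter

tool_counts = Counter()
with open("business_agent_dataset.jsonl") as f:
    for line in f:
        sample = json.loads(line)
        # Count which tools were executed in each sample
        for execution in sample.get("tool_executions", []):
            tool_counts[execution.get("function", "unknown")] += 1

# Expect book_meeting_room and analyze_sales_data to appear alongside the defaults
print(tool_counts.most_common())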

2. Multi-Turn Conversational Agent

Use case: Progressive information gathering and context-aware tool usage.

Configuration (conversational_agent.yaml):

dataset_system_prompt: "You are a conversational AI that maintains context and uses tools progressively."

topic_tree:
  topic_prompt: "Multi-step assistance scenarios: travel planning, event coordination, research tasks"
  provider: "openai"
  model: "gpt-4o-mini"
  depth: 3
  degree: 4

data_engine:
  generation_system_prompt: |
    Create realistic multi-turn conversations where the agent gathers information
    progressively and adapts tool usage based on user responses.

  provider: "openai"
  model: "gpt-4o"
  conversation_type: "agent_cot_multi_turn"

  available_tools:
    - "get_weather"
    - "search_web"
    - "book_restaurant"
    - "calculator"
    - "get_directions"

  max_tools_per_query: 2
  temperature: 0.8

dataset:
  creation:
    num_steps: 15
    batch_size: 3
    sys_msg: true  # Multi-turn supports system messages
  save_as: "conversational_agent.jsonl"

Expected multi-turn output:

{
  "messages": [
    {"role": "user", "content": "I need help planning a dinner"},
    {"role": "assistant", "content": "I'd be happy to help! What type of cuisine are you interested in, and how many people?"},
    {"role": "user", "content": "Italian food for 4 people tomorrow in Boston"},
    {"role": "assistant", "content": "Great! Let me search for Italian restaurants in Boston..."},
    {"role": "user", "content": "What's the weather like? Should we consider outdoor seating?"}
  ],
  "tool_planning_trace": [
    {
      "step_number": 1,
      "reasoning": "User wants restaurant help but gave incomplete info - need more details",
      "selected_tool": null
    },
    {
      "step_number": 2,
      "reasoning": "Now have location, cuisine, party size - can search for restaurants",
      "selected_tool": {...}
    },
    {
      "step_number": 3,
      "reasoning": "User asking about weather for outdoor seating decision",
      "selected_tool": {...}
    }
  ],
  "tool_execution_trace": [...],
  "reasoning_summary": "Progressive information gathering leading to restaurant recommendation with weather consideration"
}
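
A quick way to sanity-check this structure is to count user turns and tool-selecting planning steps per sample; a minimal sketch, assuming the messages and tool_planning_trace fields shown above:

import json

with open("conversational_agent.jsonl") as f:
    for sample in map(json.loads, f):
        user_turns = sum(1 for m in sample["messages"] if m["role"] == "user")
        tool_steps = sum(
            1
            for step in sample.get("tool_planning_trace", [])
            if step.get("selected_tool") is not None
        )
        # Good multi-turn samples gather information across several user turns
        print(f"user turns: {user_turns}, tool-selecting steps: {tool_steps}")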

3. Programmatic Generation with Custom Logic

Use case: Full programmatic control with custom validation and filtering.

Python script (advanced_agent_generation.py):

import asyncio
from deepfabric import DataSetGenerator
from deepfabric.dataset import Dataset
from deepfabric.tree import Tree
from deepfabric.schemas import ToolDefinition, ToolParameter

async def generate_advanced_agent_dataset():
    """Generate agent dataset with custom tools and validation."""

    # Define domain-specific custom tools
    fitness_tool = ToolDefinition(
        name="create_workout_plan",
        description="Generate personalized workout plan",
        parameters=[
            ToolParameter(
                name="fitness_level",
                type="str",
                description="beginner/intermediate/advanced",
                required=True
            ),
            ToolParameter(
                name="goals",
                type="list",
                description="List of fitness goals",
                required=True
            ),
            ToolParameter(
                name="duration",
                type="int",
                description="Workout duration in minutes",
                required=False,
                default=30
            )
        ],
        returns="Personalized workout plan with exercises",
        category="fitness"
    )

    nutrition_tool = ToolDefinition(
        name="analyze_nutrition",
        description="Analyze nutritional content of foods",
        parameters=[
            ToolParameter(
                name="foods",
                type="list",
                description="List of foods to analyze",
                required=True
            ),
            ToolParameter(
                name="portion_sizes",
                type="list",
                description="Portion sizes for each food",
                required=False,
                default=[]
            )
        ],
        returns="Nutritional analysis with calories and macros",
        category="health"
    )

    # Create topic tree for fitness domain
    tree = Tree(
        topic_prompt="Fitness and wellness scenarios: workout planning, nutrition analysis, health tracking",
        provider="openai",
        model_name="gpt-4o-mini",
        degree=4,
        depth=3,
        temperature=0.7
    )

    topics = await tree.generate()
    print(f"Generated {len(topics)} fitness topics")

    # Create generator with custom tools
    generator = DataSetGenerator(
        generation_system_prompt="You are a fitness and wellness AI with specialized tools for workout and nutrition planning.",
        provider="openai",
        model_name="gpt-4o",
        conversation_type="agent_cot_tools",
        available_tools=[
            "calculator",
            "search_web",
            "create_workout_plan",
            "analyze_nutrition"
        ],
        custom_tools=[
            fitness_tool.model_dump(),
            nutrition_tool.model_dump()
        ],
        max_tools_per_query=3,
        temperature=0.8,
        topics=topics
    )

    # Generate samples with validation
    valid_samples = []
    total_attempts = 0
    max_attempts = 50

    while len(valid_samples) < 20 and total_attempts < max_attempts:
        batch_samples = await generator.generate()

        for sample in batch_samples:
            if validate_fitness_sample(sample):
                valid_samples.append(sample)
                print(f"Valid sample {len(valid_samples)}: {sample['question'][:50]}...")

        total_attempts += 1

    # Create final dataset
    dataset = Dataset.from_list(valid_samples)
    dataset.save("fitness_agent_dataset.jsonl")

    # Apply fitness-specific formatting
    formatter_config = {
        "name": "fitness_tool_calling",
        "template": "builtin://tool_calling",
        "output": "fitness_agent_formatted.jsonl",
        "config": {
            "system_prompt": "You are a fitness AI with workout and nutrition tools.",
            "include_tools_in_system": True
        }
    }

    formatted = dataset.apply_formatters([formatter_config])

    print(f"Generated {len(dataset)} validated fitness agent samples")
    return dataset

def validate_fitness_sample(sample):
    """Validate fitness-specific sample quality."""
    # Check tool usage is fitness-related
    tool_names = [execution["function"] for execution in sample.get("tool_executions", [])]
    fitness_tools = ["create_workout_plan", "analyze_nutrition", "calculator"]

    has_fitness_tool = any(tool in fitness_tools for tool in tool_names)
    has_reasoning = len(sample.get("tool_planning", [])) > 0
    is_fitness_topic = "fitness" in sample.get("question", "").lower() or "workout" in sample.get("question", "").lower()

    return has_fitness_tool and has_reasoning and is_fitness_topic

# Run the generation
if __name__ == "__main__":
    dataset = asyncio.run(generate_advanced_agent_dataset())

Output Format Examples

Tool-Calling Format Output

When using the builtin://tool_calling formatter, samples are converted to function calling format:

{
  "messages": [
    {
      "role": "system",
      "content": "You are a function calling AI model. You have access to the following functions:\n\n<tools>\nget_weather(location: str) → Weather data\ncalculator(expression: str) → Calculation result\n</tools>"
    },
    {
      "role": "user",
      "content": "What's the weather in Paris and what's 15% of the temperature?"
    },
    {
      "role": "assistant",
      "content": "<think>User wants weather for Paris and a percentage calculation. I'll get the weather first, then calculate 15% of the temperature.</think>\n\n<tool_call>\n{\"name\": \"get_weather\", \"arguments\": {\"location\": \"Paris\"}}\n</tool_call>"
    },
    {
      "role": "tool",
      "content": "<tool_response>\nParis: 18°C, partly cloudy, 65% humidity\n</tool_response>"
    },
    {
      "role": "assistant",
      "content": "<tool_call>\n{\"name\": \"calculator\", \"arguments\": {\"expression\": \"18 * 0.15\"}}\n</tool_call>"
    },
    {
      "role": "tool",
      "content": "<tool_response>\n2.7\n</tool_response>"
    },
    {
      "role": "assistant",
      "content": "The current weather in Paris is 18°C with partly cloudy conditions and 65% humidity. 15% of the current temperature (18°C) equals 2.7°C."
    }
  ]
}
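
If you need to post-process this format, the <tool_call> payloads can be recovered from the assistant messages; a minimal sketch, assuming the tag layout shown above:

import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(messages):
    """Return the parsed {"name": ..., "arguments": ...} payloads from assistant turns."""
    calls = []
    for message in messages:
        if message["role"] != "assistant":
            continue
        for payload in TOOL_CALL_RE.findall(message["content"]):
            calls.append(json.loads(payload))
    return calls

# Example: extract_tool_calls(sample["messages"]) yields the get_weather and calculator calls above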

Best Practices

Topic Selection

  • Be specific about tool usage requirements in topic prompts
  • Include domain context that naturally requires tools
  • Mix complexity levels for varied training scenarios

Tool Configuration

  • Start with defaults, then add domain-specific custom tools
  • Limit tools per query to maintain focus (2-4 tools maximum)
  • Group related tools by category for better organization

Quality Control

  • Validate tool usage - ensure samples actually use tools meaningfully (a filter sketch follows this list)
  • Check reasoning quality - tool planning should be logical and detailed
  • Review parameter construction - arguments should be based on user input
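
These checks can be automated with a small filter; a minimal sketch, assuming the agent_cot_tools fields shown in the quick start (tool_planning, tool_executions, final_answer) and purely illustrative thresholds:

import json

def passes_quality_checks(sample: dict) -> bool:
    """Heuristic quality gate for agent_cot_tools samples."""
    planning = sample.get("tool_planning", [])
    executions = sample.get("tool_executions", [])
    # Tools must actually be used, and every planning step needs a rationale
    if not executions or not planning:
        return False
    if any(len(step.get("reasoning", "")) < 20 for step in planning):
        return False
    # Parameter reasoning should tie arguments back to the user's request
    if any(not step.get("parameter_reasoning") for step in planning if step.get("selected_tool")):
        return False
    return len(sample.get("final_answer", "")) > 0

with open("agent_quickstart.jsonl") as f:
    samples = [json.loads(line) for line in f]
kept = [s for s in samples if passes_quality_checks(s)]
print(f"kept {len(kept)}/{len(samples)} samples")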

Production Considerations

  • Use cost-effective models (gpt-4o-mini) for large datasets
  • Batch processing for efficiency
  • Incremental validation to catch issues early
  • Multiple formatters for different training needs (see the sketch below)
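
For the last point, the same dataset can be run through more than one formatter pass; a minimal sketch reusing the from_list and apply_formatters calls from example 3 and the builtin://tool_calling template (the formatter names, output paths, and config variations are illustrative):

import json
from deepfabric.dataset import Dataset

with open("business_agent_dataset.jsonl") as f:
    samples = [json.loads(line) for line in f]

dataset = Dataset.from_list(samples)

# Two passes over the same data: one embeds tool definitions in the system
# prompt, the other leaves them out for runtimes that inject tools separately.
dataset.apply_formatters([
    {
        "name": "tool_calling_with_tools",
        "template": "builtin://tool_calling",
        "output": "business_with_tools.jsonl",
        "config": {"include_tools_in_system": True},
    },
    {
        "name": "tool_calling_plain",
        "template": "builtin://tool_calling",
        "output": "business_plain.jsonl",
        "config": {"include_tools_in_system": False},
    },
])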

These examples provide complete, production-ready templates for generating high-quality agent tool-calling datasets that effectively train models in systematic tool usage and reasoning.