Reasoning Styles Guide
DeepFabric's Chain of Thought implementation supports three distinct reasoning styles that optimize the generated reasoning patterns for different domains and problem types. Understanding when and how to use each style is crucial for creating high-quality CoT datasets.
Overview of Reasoning Styles
Style | Best For | Characteristics | Example Domains |
---|---|---|---|
Mathematical | Numerical problems | Step-by-step calculations, verification | Math, physics, chemistry |
Logical | Structured reasoning | Premise-conclusion chains, formal logic | CS algorithms, proofs, analysis |
General | Mixed domains | Flexible, adaptive reasoning | General Q&A, interdisciplinary problems |
Mathematical Reasoning Style
When to Use
- Primary use cases: Math word problems, physics calculations, financial problems
- Problem characteristics: Involve numerical computation, measurements, quantitative analysis
- Learning objectives: Teaching systematic calculation methods and verification
Reasoning Characteristics
Mathematical reasoning emphasizes:
- Sequential calculations: Each step builds on the previous
- Explicit arithmetic: Shows mathematical operations clearly
- Unit tracking: Maintains dimensional analysis
- Verification steps: Double-checks calculations when possible
- Precision focus: Exact values and proper rounding
Example: Mathematical Style
{
  "question": "A rectangular garden is 15 feet long and 8 feet wide. If fencing costs $4.50 per foot, how much will it cost to fence the entire perimeter?",
  "chain_of_thought": "To find the cost of fencing, I need to calculate the perimeter first. For a rectangle, perimeter = 2 × (length + width). Length = 15 feet, width = 8 feet. Perimeter = 2 × (15 + 8) = 2 × 23 = 46 feet. Now I can calculate the total cost: Cost = perimeter × price per foot = 46 feet × $4.50/foot = $207.00. Let me verify: 46 × $4.50 = $207.00 ✓",
  "final_answer": "$207.00"
}
Key features of mathematical style:
- Shows formula: perimeter = 2 × (length + width)
- Explicit substitution: 2 × (15 + 8)
- Step-by-step calculation: 2 × 23 = 46
- Unit tracking: 46 feet × $4.50/foot
- Verification: 46 × $4.50 = $207.00 ✓
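The arithmetic in this sample can be reproduced independently, which is a useful habit when reviewing generated mathematical samples. The following is a minimal sketch rather than part of DeepFabric; the function name `fencing_cost` is hypothetical, and `Decimal` is used only so that currency values stay exact.

```python
from decimal import Decimal


def fencing_cost(length_ft, width_ft, price_per_ft):
    """Return (perimeter in feet, total cost in dollars) for a rectangular fence."""
    # Formula from the sample: perimeter = 2 × (length + width)
    perimeter_ft = 2 * (length_ft + width_ft)
    # Unit tracking: feet × dollars/foot = dollars
    total_cost = perimeter_ft * price_per_ft
    return perimeter_ft, total_cost


perimeter, cost = fencing_cost(Decimal("15"), Decimal("8"), Decimal("4.50"))
print(f"Perimeter: {perimeter} feet")
print(f"Cost: ${cost:.2f}")
```

Both values agree with the sample's chain of thought (46 feet and $207.00).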
Configuration for Mathematical Style
# Optimized for mathematical reasoning
data_engine:
  conversation_type: "cot_freetext"
  reasoning_style: "mathematical"
  temperature: 0.2 # Lower for calculation consistency
  generation_system_prompt: "You are a mathematics tutor who shows detailed step-by-step calculations with proper mathematical notation and verification."

# Python API configuration
from deepfabric import DataSetGenerator

generator = DataSetGenerator(
    generation_system_prompt="You are a math teacher who demonstrates careful calculation steps with verification.",
    conversation_type="cot_freetext",
    reasoning_style="mathematical",
    temperature=0.2,  # Consistent calculations
)
Logical Reasoning Style
When to Use
- Primary use cases: Algorithm analysis, logical proofs, decision trees, systematic analysis
- Problem characteristics: Require formal reasoning, rule application, structured argumentation
- Learning objectives: Teaching systematic thinking and logical progression
Reasoning Characteristics
Logical reasoning emphasizes:
- Premise identification: Clear statement of given conditions
- Rule application: Explicit use of logical principles
- Systematic progression: Following formal reasoning patterns
- Conclusion validation: Ensuring logical consistency
- Structure awareness: Organizing thoughts hierarchically
Example: Logical Style
{
  "question": "In a binary search tree, if we want to find the minimum element, what is the optimal strategy and what is the time complexity?",
  "chain_of_thought": "To find the minimum element in a binary search tree, I need to apply the BST property: for any node, all elements in the left subtree are smaller than the node's value. Given this property, the minimum element must be the leftmost node in the tree. The optimal strategy is: 1) Start at the root node, 2) While the current node has a left child, move to the left child, 3) When we reach a node with no left child, that node contains the minimum value. For time complexity analysis: In each step, we move one level down the tree, and we only traverse the left path. The maximum number of steps equals the height of the tree. Therefore, time complexity is O(h) where h is the height of the tree, which is O(log n) for a balanced BST and O(n) for a degenerate tree.",
  "final_answer": "Follow the left path from root to the leftmost node. Time complexity: O(h) where h is tree height."
}
Key features of logical style:
- Property identification: "BST property: all elements in left subtree are smaller"
- Systematic steps: "1) Start at root, 2) Move left, 3) Stop when no left child"
- Logical deduction: "minimum element must be the leftmost node"
- Complexity analysis: "O(h) where h is height"
- Formal reasoning structure
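The three-step strategy in the sample's chain of thought translates directly into code. The sketch below is illustrative only: the `Node` class and `find_min` function are hypothetical names, not DeepFabric APIs. It also counts the left moves to show that the work is bounded by the tree height.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def find_min(root: Node) -> tuple[int, int]:
    """Return (minimum value, number of left moves) for a non-empty BST."""
    current = root          # 1) Start at the root node
    moves = 0
    while current.left is not None:
        current = current.left   # 2) Move to the left child while one exists
        moves += 1
    return current.value, moves  # 3) No left child: this node holds the minimum


# A small balanced tree and a degenerate (left-leaning) tree
balanced = Node(8, Node(3, Node(1), Node(6)), Node(10))
degenerate = Node(4, Node(3, Node(2, Node(1))))

print(find_min(balanced))
print(find_min(degenerate))
```

The number of moves never exceeds the height h of the tree, which matches the O(h) bound stated in the sample.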
Configuration for Logical Style
# Optimized for logical reasoning
data_engine:
  conversation_type: "cot_structured" # Often pairs well with structured format
  reasoning_style: "logical"
  temperature: 0.3 # Balanced consistency and flexibility
  generation_system_prompt: "You are a computer science instructor who demonstrates systematic logical reasoning with clear premise-conclusion chains."

# Python API with logical reasoning
from deepfabric import DataSetGenerator

generator = DataSetGenerator(
    generation_system_prompt="You are a CS professor who teaches systematic problem-solving with formal logical reasoning.",
    conversation_type="cot_structured",
    reasoning_style="logical",
    temperature=0.3,
)
General Reasoning Style
When to Use
- Primary use cases: Interdisciplinary problems, creative problem-solving, general Q&A
- Problem characteristics: Mixed domains, require adaptive thinking, no single dominant reasoning pattern
- Learning objectives: Teaching flexible thinking and domain adaptation
Reasoning Characteristics
General reasoning emphasizes:
- Adaptive approach: Changes reasoning style based on problem type
- Contextual awareness: Considers domain-specific factors
- Flexible structure: Not bound to mathematical or logical templates
- Natural flow: Follows intuitive reasoning patterns
- Integrated thinking: Combines multiple reasoning approaches
Example: General Style
{
  "question": "A company is deciding between two marketing strategies: Strategy A costs $50,000 upfront with a 70% success rate, while Strategy B costs $80,000 upfront with a 90% success rate. Both strategies, if successful, generate $200,000 in revenue. Which strategy should the company choose?",
  "chain_of_thought": "This is a decision problem that involves both mathematical calculation and business judgment. Let me calculate the expected value for each strategy. Strategy A: Expected revenue = $200,000 × 0.70 = $140,000. Net expected value = $140,000 - $50,000 = $90,000. Strategy B: Expected revenue = $200,000 × 0.90 = $180,000. Net expected value = $180,000 - $80,000 = $100,000. From a purely mathematical perspective, Strategy B has a higher expected value ($100,000 vs $90,000). However, we should also consider risk tolerance and cash flow. Strategy A requires less upfront investment and might be better for a cash-constrained company. Strategy B has higher expected returns but also higher upfront risk. The choice depends on the company's risk profile and available capital.",
  "final_answer": "Strategy B has higher expected value ($100,000 vs $90,000), but the choice depends on the company's risk tolerance and cash flow situation."
}
Key features of general style:
- Multi-faceted analysis: combines math and business considerations
- Adaptive reasoning: switches between calculation and qualitative analysis
- Context awareness: considers company's financial situation
- Balanced conclusion: acknowledges multiple factors
- Natural language flow: not constrained by formal templates
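The quantitative half of this sample is a plain expected-value comparison, which can be checked in a few lines. This is a minimal sketch with a hypothetical helper name (`net_expected_value`); it covers only the calculation, not the qualitative factors (risk tolerance, cash flow) that the general style also weighs.

```python
def net_expected_value(upfront_cost, success_rate, revenue_if_successful):
    """Expected revenue minus the upfront cost."""
    return revenue_if_successful * success_rate - upfront_cost


strategies = {
    "A": net_expected_value(50_000, 0.70, 200_000),
    "B": net_expected_value(80_000, 0.90, 200_000),
}

# Highest net expected value wins on the numbers alone
best = max(strategies, key=strategies.get)

for name, value in strategies.items():
    print(f"Strategy {name}: net expected value ${value:,.0f}")
print(f"Higher expected value: Strategy {best}")
```

The result agrees with the sample: Strategy B leads by $10,000 in expectation, while needing $30,000 more upfront.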
Configuration for General Style
# Optimized for general reasoning
data_engine:
  conversation_type: "cot_hybrid" # Often works well with hybrid format
  reasoning_style: "general"
  temperature: 0.4 # Higher creativity for adaptive thinking
  generation_system_prompt: "You are an expert who adapts your reasoning approach to the specific problem, combining mathematical, logical, and intuitive thinking as needed."

# Python API with general reasoning
from deepfabric import DataSetGenerator

generator = DataSetGenerator(
    generation_system_prompt="You are a problem-solving expert who uses the most appropriate reasoning approach for each unique situation.",
    conversation_type="cot_hybrid",
    reasoning_style="general",
    temperature=0.4,  # More flexibility
)
Comparative Analysis
Style Selection Decision Tree
Is your problem primarily...
├─ Numerical/Quantitative?
│ ├─ Calculations, measurements, formulas
│ └─ → Use MATHEMATICAL style
│
├─ Logical/Structural?
│ ├─ Algorithms, proofs, systematic analysis
│ └─ → Use LOGICAL style
│
└─ Mixed/Interdisciplinary?
  ├─ Multiple domains, creative problem-solving
  └─ → Use GENERAL style
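The decision tree can be paired with the settings used in the per-style configuration sections of this guide. The lookup below is only a sketch: the category labels (`"numerical"`, `"logical"`, `"mixed"`) and the function `recommend_settings` are hypothetical, while the style names, conversation types, and temperatures are copied from the configuration examples in this guide.

```python
# Values taken from this guide's configuration examples; keys are illustrative labels.
RECOMMENDED_SETTINGS = {
    "numerical": {
        "reasoning_style": "mathematical",
        "conversation_type": "cot_freetext",
        "temperature": 0.2,
    },
    "logical": {
        "reasoning_style": "logical",
        "conversation_type": "cot_structured",
        "temperature": 0.3,
    },
    "mixed": {
        "reasoning_style": "general",
        "conversation_type": "cot_hybrid",
        "temperature": 0.4,
    },
}


def recommend_settings(problem_kind: str) -> dict:
    """Return a copy of the suggested settings; unknown kinds fall back to 'mixed'."""
    key = problem_kind.strip().lower()
    return dict(RECOMMENDED_SETTINGS.get(key, RECOMMENDED_SETTINGS["mixed"]))


print(recommend_settings("Numerical"))
print(recommend_settings("something else"))
```

Falling back to the general style mirrors the last branch of the tree, where no single reasoning pattern dominates.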
Advanced Usage Patterns
Mixing Styles Within a Dataset
For comprehensive datasets, you can generate samples with different reasoning styles:
from collections import Counter

from deepfabric import DataSetGenerator
from deepfabric.dataset import Dataset
from deepfabric.tree import Tree


def create_mixed_style_dataset():
    """Create dataset with multiple reasoning styles."""
    # Define style-specific configurations
    styles = {
        "mathematical": {
            "topic_prompt": "Mathematical word problems requiring calculations",
            "reasoning_style": "mathematical",
            "temperature": 0.2,
            "samples": 20
        },
        "logical": {
            "topic_prompt": "Algorithm and logical reasoning problems",
            "reasoning_style": "logical",
            "temperature": 0.3,
            "samples": 15
        },
        "general": {
            "topic_prompt": "Interdisciplinary problems requiring adaptive thinking",
            "reasoning_style": "general",
            "temperature": 0.4,
            "samples": 15
        }
    }

    all_samples = []
    for style_name, config in styles.items():
        print(f"Generating {style_name} reasoning samples...")

        # Create style-specific topic tree
        tree = Tree(
            topic_prompt=config["topic_prompt"],
            provider="openai",
            model_name="gpt-4o-mini",
            degree=3,
            depth=2,
            temperature=config["temperature"]
        )
        for event in tree.build():
            if event['event'] == 'build_complete':
                print(f"  Topics: {event['total_paths']}")

        # Generate with style-specific settings
        generator = DataSetGenerator(
            provider="openai",
            model_name="gpt-4o-mini",
            conversation_type="cot_freetext",
            reasoning_style=config["reasoning_style"],
            temperature=config["temperature"]
        )
        dataset = generator.create_data(
            num_steps=config["samples"],
            batch_size=1,
            topic_model=tree,
            sys_msg=False
        )

        # Tag samples with reasoning style
        for sample in dataset.samples:
            sample["reasoning_style"] = style_name
        all_samples.extend(dataset.samples)
        print(f"  Generated {len(dataset.samples)} samples")

    # Save combined dataset
    combined = Dataset()
    combined.samples = all_samples
    combined.save("mixed_reasoning_styles.jsonl")
    print(f"\nTotal samples: {len(all_samples)}")

    # Analyze style distribution
    style_counts = Counter(sample["reasoning_style"] for sample in all_samples)
    for style, count in style_counts.items():
        print(f"  {style}: {count} samples")


# Usage
create_mixed_style_dataset()
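Once a mixed dataset has been saved, it is often convenient to split it back out by style, for example to train or evaluate on one style at a time. The sketch below uses only the standard library and rests on two assumptions that should be checked against your DeepFabric version: that the saved file is JSON Lines (one JSON object per line), and that the `reasoning_style` tag added above survives the save. The function name `group_samples_by_style` is hypothetical.

```python
import json
from collections import defaultdict
from typing import Iterable


def group_samples_by_style(lines: Iterable[str]) -> dict[str, list[dict]]:
    """Group JSON Lines records by their 'reasoning_style' tag.

    Assumes one JSON object per line; records without the tag go to 'untagged'.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Skip blank lines
        sample = json.loads(line)
        groups[sample.get("reasoning_style", "untagged")].append(sample)
    return dict(groups)


# Small in-memory demonstration
demo_lines = [
    '{"question": "q1", "reasoning_style": "mathematical"}',
    '{"question": "q2", "reasoning_style": "logical"}',
    '{"question": "q3", "reasoning_style": "mathematical"}',
    '',
    '{"question": "q4"}',
]
for style, samples in group_samples_by_style(demo_lines).items():
    print(f"{style}: {len(samples)} samples")
```

Because the function accepts any iterable of lines, an open file handle for `mixed_reasoning_styles.jsonl` can be passed to it directly.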
Dynamic Style Selection
For advanced use cases, you can implement dynamic style selection based on problem characteristics:
import re

MATH_KEYWORDS = ['calculate', 'cost', 'price', 'area', 'volume', 'distance', 'speed', 'time', 'percent', 'fraction']
MATH_SYMBOLS = '+×÷=<>%$'  # '-' is left out: hyphens in ordinary prose are not arithmetic
LOGIC_KEYWORDS = ['algorithm', 'if', 'then', 'prove', 'logic', 'tree', 'graph', 'sort', 'search', 'complexity',
                  'array', 'list', 'node', 'function', 'recursive', 'iterate']


def count_keyword_hits(text: str, keywords: list[str]) -> int:
    """Count keywords that begin a word in text, so 'if' does not match 'scientific'."""
    return sum(1 for keyword in keywords if re.search(rf"\b{re.escape(keyword)}", text))


def select_reasoning_style(problem_text: str) -> str:
    """Automatically select reasoning style based on problem content."""
    problem_lower = problem_text.lower()

    # Mathematical indicators: keywords, plus one point for any arithmetic symbol
    math_score = count_keyword_hits(problem_lower, MATH_KEYWORDS)
    if any(char in problem_text for char in MATH_SYMBOLS):
        math_score += 1

    # Logical indicators: reasoning keywords and CS terms
    logic_score = count_keyword_hits(problem_lower, LOGIC_KEYWORDS)

    # Compare the evidence instead of returning on the first match; otherwise
    # "time complexity" would be labelled mathematical because of 'time'.
    if math_score > logic_score:
        return "mathematical"
    if logic_score > math_score:
        return "logical"
    # Default to general for mixed or unclear domains (ties included)
    return "general"


# Example usage
problems = [
    "Calculate the area of a rectangle with length 5m and width 3m",  # → mathematical
    "Explain how binary search works and analyze its time complexity",  # → logical
    "Should a company invest in renewable energy given environmental and economic factors?"  # → general
]
for problem in problems:
    style = select_reasoning_style(problem)
    print(f"Problem: {problem[:50]}...")
    print(f"Recommended style: {style}\n")
Style-Specific Prompt Engineering
Mathematical Style Prompts
# Enhanced prompts for mathematical reasoning
mathematical_prompts = {
    "generation_system_prompt": """You are a mathematics teacher who demonstrates problems with:
- Clear step-by-step calculations
- Proper mathematical notation
- Unit tracking and dimensional analysis
- Verification steps when possible
- Exact arithmetic with explanations""",
    "instructions": """Create mathematical word problems that require:
- Multi-step numerical calculations
- Real-world applications of mathematical concepts
- Clear quantitative relationships
- Opportunities for verification and checking"""
}
Logical Style Prompts
# Enhanced prompts for logical reasoning
logical_prompts = {
    "generation_system_prompt": """You are a computer science instructor who demonstrates:
- Systematic logical progression
- Clear premise-conclusion relationships
- Formal reasoning patterns
- Structured analysis methods
- Rule-based decision making""",
    "instructions": """Create problems that require:
- Systematic logical analysis
- Algorithm understanding and application
- Formal reasoning and proof techniques
- Structured problem decomposition"""
}
General Style Prompts
# Enhanced prompts for general reasoning
general_prompts = {
    "generation_system_prompt": """You are an expert problem solver who:
- Adapts reasoning approach to the problem domain
- Combines multiple types of thinking (mathematical, logical, creative)
- Considers contextual factors and real-world constraints
- Uses natural, intuitive reasoning patterns
- Balances analytical and creative thinking""",
    "instructions": """Create diverse problems that require:
- Adaptive thinking across different domains
- Integration of multiple perspectives
- Consideration of practical constraints
- Creative and flexible problem-solving approaches"""
}
Quality Evaluation by Style
Style-Specific Quality Metrics
import re


def evaluate_reasoning_style_quality(sample: dict, expected_style: str) -> dict:
    """Evaluate how well a sample matches its intended reasoning style."""
    reasoning = sample.get('chain_of_thought', '').lower()

    def mentions(*terms: str) -> bool:
        # Match at the start of a word so that 'if' does not fire on 'verify'
        return any(re.search(rf"\b{re.escape(term)}", reasoning) for term in terms)

    # Mathematical style indicators
    math_indicators = [
        any(char in reasoning for char in '+-×÷='),  # Mathematical symbols
        mentions('calculate', 'multiply', 'divide', 'add', 'subtract'),
        mentions('verify', 'check'),  # Verification
        mentions('step', 'first', 'then', 'next'),  # Sequential steps
    ]

    # Logical style indicators
    logic_indicators = [
        mentions('therefore', 'because', 'since', 'given'),  # Logical connectors
        mentions('property', 'rule', 'principle', 'definition'),  # Formal concepts
        mentions('if', 'then', 'when', 'condition'),  # Conditional logic
        reasoning.count('.') >= 3,  # Structured sentences
    ]

    # General style indicators
    general_indicators = [
        mentions('consider', 'factor', 'aspect', 'perspective'),  # Multi-faceted thinking
        mentions('however', 'but', 'although', 'while'),  # Balanced reasoning
        mentions('context', 'situation', 'case', 'scenario'),  # Contextual awareness
        len(reasoning.split()) > 50,  # Substantial explanation
    ]

    # Each score is the fraction of that style's indicators found in the text
    scores = {
        'mathematical': sum(math_indicators) / len(math_indicators),
        'logical': sum(logic_indicators) / len(logic_indicators),
        'general': sum(general_indicators) / len(general_indicators),
    }
    predicted_style = max(scores, key=scores.get)
    return {
        'scores': scores,
        'predicted_style': predicted_style,
        'confidence': scores[predicted_style],
        'matches_expected': predicted_style == expected_style
    }


# Usage
sample = {
    'chain_of_thought': 'To calculate the area, I need to multiply length × width. Area = 5m × 3m = 15m². Let me verify: 5 × 3 = 15 ✓'
}
evaluation = evaluate_reasoning_style_quality(sample, 'mathematical')
print(f"Predicted style: {evaluation['predicted_style']}")
print(f"Confidence: {evaluation['confidence']:.2f}")
print(f"Matches expected: {evaluation['matches_expected']}")
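To apply a per-sample check like this across a whole dataset, the results can be aggregated into a match rate per style. The helper below is a sketch under stated assumptions: `style_match_rates` is a hypothetical name, each sample is assumed to carry a `reasoning_style` tag (as in the mixed-dataset example), and the scorer is passed in as a callable that returns a dict with a `matches_expected` key, which is the shape returned by the evaluation function above.

```python
from collections import defaultdict
from typing import Callable


def style_match_rates(
    samples: list[dict],
    scorer: Callable[[dict, str], dict],
) -> dict[str, float]:
    """Return, for each tagged style, the fraction of samples the scorer accepts."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for sample in samples:
        expected = sample["reasoning_style"]
        result = scorer(sample, expected)
        totals[expected] += 1
        matches[expected] += bool(result["matches_expected"])
    return {style: matches[style] / totals[style] for style in totals}


# Stand-in scorer for demonstration only: accepts samples containing '='
def demo_scorer(sample: dict, expected_style: str) -> dict:
    return {"matches_expected": "=" in sample["chain_of_thought"]}


demo_samples = [
    {"reasoning_style": "mathematical", "chain_of_thought": "2 + 2 = 4"},
    {"reasoning_style": "mathematical", "chain_of_thought": "The answer is four."},
    {"reasoning_style": "logical", "chain_of_thought": "Given a = b, b follows."},
]
print(style_match_rates(demo_samples, demo_scorer))
```

A low rate for one style suggests revisiting that style's system prompt or temperature rather than discarding individual samples.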
Troubleshooting Style Issues
Common Problems and Solutions
Problem: Mathematical style not showing calculations
Symptoms: Reasoning lacks explicit arithmetic steps
Solution: Adjust prompts and temperature
# More explicit mathematical prompting
generator = DataSetGenerator(
    generation_system_prompt="You are a math teacher who ALWAYS shows every calculation step with arithmetic operations clearly visible (like 15 + 8 = 23).",
    reasoning_style="mathematical",
    temperature=0.1  # Very low for calculation consistency
)
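To spot this symptom in generated data, one option is to scan each chain of thought for explicit steps of the form `a op b = c` and check their arithmetic. The sketch below is not part of DeepFabric; `find_arithmetic_steps` is a hypothetical helper, and its pattern only recognises simple two-operand steps with optional `$` signs, so steps that carry units or parentheses are skipped rather than flagged.

```python
import re

# Matches simple steps such as "2 × 23 = 46" or "46 × $4.50 = $207.00"
STEP_PATTERN = re.compile(
    r"\$?(\d+(?:\.\d+)?)\s*([+\-×*÷/])\s*\$?(\d+(?:\.\d+)?)\s*=\s*\$?(\d+(?:\.\d+)?)"
)

OPERATIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "×": lambda a, b: a * b,
    "*": lambda a, b: a * b,
    "÷": lambda a, b: a / b,
    "/": lambda a, b: a / b,
}


def find_arithmetic_steps(chain_of_thought: str) -> list[dict]:
    """Return each explicit 'a op b = c' step and whether its arithmetic holds."""
    steps = []
    for match in STEP_PATTERN.finditer(chain_of_thought):
        left, op, right, stated = match.groups()
        a, b, stated_value = float(left), float(right), float(stated)
        if op in "÷/" and b == 0:
            continue  # Skip division by zero instead of raising
        computed = OPERATIONS[op](a, b)
        steps.append({
            "text": match.group(0),
            "correct": abs(computed - stated_value) < 1e-6,
        })
    return steps


reasoning = (
    "Perimeter = 2 × (15 + 8) = 2 × 23 = 46 feet. "
    "Let me verify: 46 × $4.50 = $207.00 ✓"
)
for step in find_arithmetic_steps(reasoning):
    print(step)
```

A sample that yields an empty list shows no explicit arithmetic at all, and any step marked incorrect is worth a manual review.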
Problem: Logical style too informal
Symptoms: Reasoning lacks formal structure and logical connectors
Solution: Emphasize systematic reasoning
# More formal logical prompting
generator = DataSetGenerator(
    generation_system_prompt="You are a computer science professor who demonstrates formal logical reasoning with clear premises, systematic analysis, and structured conclusions.",
    reasoning_style="logical",
    temperature=0.2
)
Problem: General style too vague
Symptoms: Reasoning lacks depth and specific insights
Solution: Encourage multi-faceted analysis
# More comprehensive general reasoning
generator = DataSetGenerator(
    generation_system_prompt="You are an expert analyst who examines problems from multiple angles, considering quantitative factors, logical relationships, and practical constraints.",
    reasoning_style="general",
    temperature=0.35
)
Conclusion
Reasoning styles let you create domain-appropriate Chain of Thought datasets. Each style has a distinct strength:
- Mathematical style excels at numerical problems requiring step-by-step calculations
- Logical style shines for algorithmic and systematic reasoning tasks
- General style adapts flexibly to interdisciplinary and creative problems
Matching the style to the problem domain should improve the quality and consistency of the CoT datasets you use to train reasoning-capable language models.
Next Steps
- Experiment with styles: Try different styles on the same problem type to see the differences
- Mixed datasets: Create datasets combining multiple reasoning styles for comprehensive training
- Custom styles: Consider extending DeepFabric to support domain-specific reasoning patterns
- Evaluation: Develop metrics to assess reasoning quality within each style
- Model training: Fine-tune models on style-specific datasets and compare performance