Routing and Parallelization Patterns for AI Agents

Sequential chains work well when tasks have a clear order, but real-world problems often require more flexibility. Sometimes you need to route tasks to different specialists based on their content. Other times, multiple agents should work simultaneously on different aspects of a problem. In this post, I’ll cover two powerful patterns: routing for intelligent task dispatch and parallelization for concurrent processing.

The Routing Pattern

Think of routing like a sophisticated mail sorting facility. Instead of one person handling every type of mail, specialized systems ensure each piece reaches the right destination quickly.

flowchart TD
    I[Input] --> R{Router}
    R -->|Type A| A1[Specialist A]
    R -->|Type B| A2[Specialist B]
    R -->|Type C| A3[Specialist C]
    A1 --> O[Output]
    A2 --> O
    A3 --> O

    style R fill:#fff3e0
    style A1 fill:#e3f2fd
    style A2 fill:#e8f5e9
    style A3 fill:#fce4ec

Why Route?

Routing offers four key benefits:

  1. Task Specialization: Agents optimized for specific tasks perform better than generalists
  2. Resource Optimization: Route simple tasks to faster, cheaper models and complex ones to more powerful models (see the sketch after this list)
  3. Flexibility: Handle diverse request types through a single entry point
  4. Scalability: Add new specialists without restructuring the entire system
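
To make benefit 2 concrete, here is a minimal sketch of cost-based routing. The length threshold is a deliberately crude stand-in for a real complexity classifier, and the model names are only examples:

from openai import OpenAI

client = OpenAI()

def route_by_cost(query: str) -> str:
    """Send short, routine queries to a cheap model and longer,
    open-ended ones to a more capable one."""
    # Crude proxy for complexity; swap in a real classifier as needed
    model = "gpt-4o-mini" if len(query) < 200 else "gpt-4o"
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content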

Two Stages of Routing

Every routing workflow has two core stages:

Stage 1: Classification

Analyze the incoming task to determine its type, category, or complexity.

Rule-based classification uses programmatic logic:

def classify_by_rules(message: str) -> str:
    message_lower = message.lower()

    if any(word in message_lower for word in ["refund", "return", "money back"]):
        return "billing"
    elif any(word in message_lower for word in ["broken", "not working", "bug"]):
        return "technical"
    elif any(word in message_lower for word in ["how to", "tutorial", "guide"]):
        return "documentation"
    else:
        return "general"

LLM-based classification handles nuance better:

def classify_with_llm(message: str) -> str:
    system_prompt = """
    Classify the customer message into one category:
    - billing: Payment, refunds, subscription issues
    - technical: Bugs, errors, technical problems
    - documentation: How-to questions, feature explanations
    - general: Everything else

    Respond with only the category name.
    """

    return call_llm(system_prompt, message, temperature=0)

LLM classification excels when:

  • Categories have fuzzy boundaries
  • Context matters for classification
  • New categories emerge over time
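
One practical caveat: an LLM classifier occasionally returns text outside the expected label set (extra words, different casing). A small normalization guard keeps the dispatch stage safe; this wrapper is my addition around the classify_with_llm function above:

VALID_CATEGORIES = {"billing", "technical", "documentation", "general"}

def classify_safely(message: str) -> str:
    """Normalize the LLM's answer; fall back to 'general' if it
    isn't a known category."""
    category = classify_with_llm(message).strip().lower()
    return category if category in VALID_CATEGORIES else "general"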

Stage 2: Task Dispatch

Direct the classified input to the appropriate specialist:

def dispatch_to_agent(category: str, message: str) -> str:
    agents = {
        "billing": billing_agent,
        "technical": technical_agent,
        "documentation": docs_agent,
        "general": general_agent
    }

    agent = agents.get(category, general_agent)
    return agent(message)

Complete Routing Implementation

from openai import OpenAI

client = OpenAI()

def call_llm(system_prompt: str, user_prompt: str, temperature: float = 0.7) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        temperature=temperature
    )
    return response.choices[0].message.content


def router_agent(query: str) -> str:
    """Classify and route to appropriate specialist"""

    # Stage 1: Classification
    classification_prompt = """
    You are a task router. Analyze the query and select the best agent:

    - ResearchAgent: Factual questions, information gathering
    - AnalysisAgent: Data interpretation, comparisons, evaluations
    - CreativeAgent: Writing, brainstorming, content creation

    Respond with only the agent name.
    """

    agent_choice = call_llm(classification_prompt, query, temperature=0)

    # Stage 2: Dispatch
    if "Research" in agent_choice:
        return research_agent(query)
    elif "Analysis" in agent_choice:
        return analysis_agent(query)
    elif "Creative" in agent_choice:
        return creative_agent(query)
    else:
        return general_agent(query)


def research_agent(query: str) -> str:
    system_prompt = """
    You are a research specialist. Provide factual, well-sourced
    information. Be thorough but concise.
    """
    return call_llm(system_prompt, query)


def analysis_agent(query: str) -> str:
    system_prompt = """
    You are a data analyst. Evaluate information critically,
    identify patterns, and provide structured insights.
    """
    return call_llm(system_prompt, query)


def creative_agent(query: str) -> str:
    system_prompt = """
    You are a creative specialist. Generate engaging, original
    content with a fresh perspective.
    """
    return call_llm(system_prompt, query, temperature=0.9)


def general_agent(query: str) -> str:
    # Fallback specialist so the router's else branch is covered
    system_prompt = "You are a helpful general-purpose assistant."
    return call_llm(system_prompt, query)

Advanced: Routing with Orchestration

Sometimes the router needs to gather information from multiple sources before final dispatch:

flowchart TD
    Q[Query] --> R{Router}
    R -->|Pricing Query| P[Product Research]
    R -->|Pricing Query| C[Customer Analysis]
    P --> S[Pricing Strategist]
    C --> S
    S --> O[Response]

    style R fill:#fff3e0
    style P fill:#e3f2fd
    style C fill:#e8f5e9
    style S fill:#fce4ec

The router identifies that pricing queries need context from both product and customer data before the pricing specialist can respond.
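
A minimal sketch of that orchestration step, assuming hypothetical product_research_agent, customer_analysis_agent, and pricing_strategist_agent helpers built on call_llm like the specialists above:

def handle_pricing_query(query: str) -> str:
    """Gather context from two sources, then hand the enriched
    query to the pricing specialist."""
    # These lookups are independent, so they could also run in
    # parallel (see the next section)
    product_context = product_research_agent(query)
    customer_context = customer_analysis_agent(query)

    enriched_query = f"""
    Query: {query}

    Product research:
    {product_context}

    Customer analysis:
    {customer_context}
    """
    return pricing_strategist_agent(enriched_query)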


The Parallelization Pattern

When subtasks don’t depend on each other, run them simultaneously. Parallelization follows a scatter-gather pattern: distribute work to multiple agents, then consolidate results.

flowchart TD
    I[Input] --> S[Scatter]
    S --> A1[Agent 1]
    S --> A2[Agent 2]
    S --> A3[Agent 3]
    A1 --> G[Gather]
    A2 --> G
    A3 --> G
    G --> O[Output]

    style S fill:#e3f2fd
    style G fill:#e8f5e9
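
The sketches in this section rely on a parallel_execute helper. It isn't a library function, so here is one minimal way to build it on Python's standard concurrent.futures module: it takes a list of zero-argument callables and returns their results in the order the tasks were given.

from concurrent.futures import ThreadPoolExecutor

def parallel_execute(tasks: list) -> list:
    """Run zero-argument callables concurrently and return their
    results in input order."""
    with ThreadPoolExecutor(max_workers=max(len(tasks), 1)) as executor:
        futures = [executor.submit(task) for task in tasks]
        return [future.result() for future in futures]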

The Golden Rule: Independence

Parallelization only works when subtasks are independent:

  • Agent A shouldn’t need to wait for Agent B’s output
  • If there’s a dependency, use sequential chaining instead

Task Decomposition Strategies

Three ways to split work for parallel processing:

1. Sectioning (Sharding)

Split large inputs into chunks processed simultaneously:

def parallel_summarize(long_document: str) -> str:
    # Split into sections
    sections = split_into_sections(long_document)

    # Summarize each in parallel
    summaries = parallel_execute([
        lambda s=section: summarize_agent(s)
        for section in sections
    ])

    # Combine
    return combine_summaries(summaries)

Use case: Summarizing a 100-page report by processing each chapter simultaneously.
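
The helpers above (split_into_sections, summarize_agent, combine_summaries) are left undefined; here is one plausible minimal version of each, reusing the call_llm wrapper from earlier:

def split_into_sections(document: str, max_chars: int = 4000) -> list:
    """Naive chunking on paragraph boundaries; production systems
    often split on chapters or token counts instead."""
    sections, current = [], ""
    for paragraph in document.split("\n\n"):
        if current and len(current) + len(paragraph) > max_chars:
            sections.append(current)
            current = ""
        current += paragraph + "\n\n"
    if current:
        sections.append(current)
    return sections


def summarize_agent(section: str) -> str:
    return call_llm("Summarize the following text concisely.", section)


def combine_summaries(summaries: list) -> str:
    joined = "\n\n".join(summaries)
    # One extra LLM pass smooths the stitched-together summaries
    return call_llm("Merge these section summaries into one coherent summary.", joined)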

2. Aspect-Based Decomposition

Different agents analyze different facets of the same input:

flowchart LR
    P[Product] --> T[Technical Specs]
    P --> S[Sentiment Analysis]
    P --> C[Competitive Pricing]
    T --> R[Combined Report]
    S --> R
    C --> R

    style T fill:#e3f2fd
    style S fill:#fff3e0
    style C fill:#e8f5e9

Use case: Product analysis where technical specs, customer sentiment, and pricing each require different expertise.
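
With the parallel_execute helper from above, aspect-based decomposition is just different prompts over the same input. The three specialist prompts here are illustrative:

def analyze_product(product_description: str) -> dict:
    specs, sentiment, pricing = parallel_execute([
        lambda: call_llm("You are a hardware engineer. Summarize the technical specs.", product_description),
        lambda: call_llm("You analyze customer feedback. Assess likely reception.", product_description),
        lambda: call_llm("You are a pricing analyst. Compare against typical competitors.", product_description),
    ])
    return {"specs": specs, "sentiment": sentiment, "pricing": pricing}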

3. Diversity/Voting

Run the same task multiple times for reliability or creative variety:

def diverse_generation(prompt: str, num_variations: int = 3) -> list:
    """Generate multiple creative variations"""
    # creative_agent already samples at temperature 0.9 internally
    return parallel_execute([
        lambda: creative_agent(prompt)
        for _ in range(num_variations)
    ])


def majority_vote(question: str, num_voters: int = 5) -> str:
    """Get consensus answer through voting"""
    # classifier_agent stands for any agent that returns a short label
    answers = parallel_execute([
        lambda: classifier_agent(question)
        for _ in range(num_voters)
    ])

    # Return most common answer
    return max(set(answers), key=answers.count)

Aggregation Strategies

After parallel execution, combine results:

| Strategy      | Description                 | Use Case                      |
|---------------|-----------------------------|-------------------------------|
| Concatenation | Join outputs together       | Chapter summaries → document  |
| Selection     | Pick best output            | Choose top creative option    |
| Voting        | Majority wins               | Classification consensus      |
| Synthesis     | LLM combines intelligently  | Multi-perspective report      |
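
Voting was shown in majority_vote above, and synthesis is exactly what the threading example below does. Selection deserves a sketch too: a common approach is an LLM-as-judge pass over the candidates (the judging prompt here is illustrative):

def select_best(task: str, candidates: list) -> str:
    """Ask an LLM judge to pick the strongest candidate."""
    numbered = "\n\n".join(
        f"Option {i + 1}:\n{c}" for i, c in enumerate(candidates)
    )
    judge_prompt = "You are a judge. Reply with only the number of the best option."
    choice = call_llm(judge_prompt, f"Task: {task}\n\n{numbered}", temperature=0)
    try:
        return candidates[int(choice.strip()) - 1]
    except (ValueError, IndexError):
        return candidates[0]  # fall back if the judge's reply isn't parseable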

Implementation with Threading

import threading
from typing import Dict, Any

# Shared storage for results
agent_outputs: Dict[str, Any] = {}


class PolicyAgent:
    def run(self, query: str):
        system_prompt = "You are a policy expert. Analyze regulatory implications."
        agent_outputs["policy"] = call_llm(system_prompt, query)


class TechnologyAgent:
    def run(self, query: str):
        system_prompt = "You are a technology expert. Analyze technical feasibility."
        agent_outputs["technology"] = call_llm(system_prompt, query)


class MarketAgent:
    def run(self, query: str):
        system_prompt = "You are a market analyst. Analyze market dynamics."
        agent_outputs["market"] = call_llm(system_prompt, query)


class SynthesizerAgent:
    def run(self, query: str, inputs: dict) -> str:
        system_prompt = """
        You are a senior analyst. Synthesize multiple perspectives
        into a comprehensive, coherent report.
        """

        combined_input = f"""
        Original Query: {query}

        Policy Analysis:
        {inputs['policy']}

        Technology Analysis:
        {inputs['technology']}

        Market Analysis:
        {inputs['market']}

        Provide an integrated analysis addressing all perspectives.
        """

        return call_llm(system_prompt, combined_input)


def analyze_parallel(query: str) -> str:
    """Run parallel analysis and synthesize"""

    # Create agents
    policy = PolicyAgent()
    tech = TechnologyAgent()
    market = MarketAgent()
    synthesizer = SynthesizerAgent()

    # Create threads
    threads = [
        threading.Thread(target=policy.run, args=(query,)),
        threading.Thread(target=tech.run, args=(query,)),
        threading.Thread(target=market.run, args=(query,)),
    ]

    # Start all threads
    for t in threads:
        t.start()

    # Wait for completion
    for t in threads:
        t.join()

    # Synthesize results
    return synthesizer.run(query, agent_outputs)


# Usage
result = analyze_parallel("What are the implications of AI regulation in healthcare?")

Real-World Example: Contract Analysis

A 50-page enterprise contract needs review. Sequential review by one expert takes too long.

Parallel approach:

flowchart TD
    C[Contract] --> L[Legal Terms Checker]
    C --> CO[Compliance Validator]
    C --> F[Financial Risk Assessor]
    L --> S[Summary Agent]
    CO --> S
    F --> S
    S --> R[Executive Report]

    style L fill:#e3f2fd
    style CO fill:#fff3e0
    style F fill:#fce4ec
    style S fill:#e8f5e9

Three specialists work simultaneously:

  • Legal Terms Checker: Identifies problematic clauses
  • Compliance Validator: Checks regulatory requirements
  • Financial Risk Assessor: Evaluates financial exposure

A synthesizer combines their findings into a comprehensive executive summary.


Combining Routing and Parallelization

These patterns work well together. A router can trigger parallel workflows for complex queries:

def smart_router(query: str) -> str:
    # classify_query stands for either classifier from the routing
    # section (rule-based or LLM-based)
    category = classify_query(query)

    if category == "simple":
        return general_agent(query)

    elif category == "research":
        return research_agent(query)

    elif category == "complex_analysis":
        # Route to parallel analysis workflow
        return analyze_parallel(query)

    # Fallback for anything unclassified
    return general_agent(query)

Key Takeaways

Routing

  1. Classify then dispatch: Two-stage process for intelligent task direction
  2. LLM classification for nuance: Handles fuzzy boundaries better than rules
  3. Specialist agents perform better: Focused prompts outperform generalists
  4. Can orchestrate sub-workflows: Router may gather context before final dispatch

Parallelization

  1. Independence is required: Subtasks must not depend on each other
  2. Three decomposition strategies: Sectioning, aspect-based, diversity/voting
  3. Four aggregation strategies: Concatenation, selection, voting, synthesis
  4. Threading suits I/O-bound concurrency: the GIL prevents CPU parallelism, but threads waiting on API responses run simultaneously

These patterns handle task variety (routing) and task complexity (parallelization). But what about tasks that require iterative improvement or dynamic planning? In the next post, I’ll explore evaluator-optimizer loops and orchestrator-worker patterns.


This is Part 6 of my series on building intelligent AI systems. Next: evaluator-optimizer and orchestrator-worker patterns.
