When a single prompt isn’t enough, we chain them together. Prompt chaining is one of the most practical patterns for building AI workflows - breaking complex tasks into focused steps where each agent’s output feeds into the next. In this post, I’ll explore how to design, validate, and implement effective prompt chains.
The Assembly Line Analogy
Think of prompt chaining like a manufacturing assembly line. Instead of one worker trying to build an entire car, specialized stations handle specific tasks in sequence - welding, painting, assembly, inspection. Each station does one thing well, and the product flows from one to the next.
```mermaid
flowchart LR
    I[Input] --> A1["Agent 1<br/>Research"]
    A1 --> A2["Agent 2<br/>Analyze"]
    A2 --> A3["Agent 3<br/>Draft"]
    A3 --> A4["Agent 4<br/>Review"]
    A4 --> O[Output]
    style A1 fill:#e3f2fd
    style A2 fill:#fff3e0
    style A3 fill:#e8f5e9
    style A4 fill:#fce4ec
```
This approach offers several advantages:
- Specialization: Each agent focuses on one task and can be optimized for it
- Debuggability: When something fails, you know exactly which step caused it
- Maintainability: Individual prompts can be improved without rewriting everything
- Quality: Focused tasks produce more reliable outputs
The Challenge: Error Propagation
Here’s the catch with sequential chains - errors compound. If Agent 1 produces flawed output, that flaw propagates through Agents 2, 3, and 4, potentially getting worse at each step.
```mermaid
flowchart LR
    A1[Agent 1] -->|Error| A2[Agent 2]
    A2 -->|Error + More Error| A3[Agent 3]
    A3 -->|Compounded Errors| O[Bad Output]
    style A1 fill:#ffcdd2
    style A2 fill:#ef9a9a
    style A3 fill:#e57373
    style O fill:#c62828,color:#fff
```
This means we need validation between steps - quality gates that catch problems before they cascade.
Validation Strategies
I use four main approaches to validate outputs between chain steps:
1. Programmatic Checks
Simple code-based validation for structural requirements - does the output parse, and does it contain the fields the next agent expects? A minimal sketch (the required fields are illustrative):
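```python
import json

def validate_json_output(output: str) -> bool:
    """Return True if the output parses as JSON and has the fields the next step expects."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    # The required fields are illustrative - match them to your own schema
    return isinstance(data, dict) and "summary" in data and "key_points" in data
```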
2. LLM-Based Validation
Use a second LLM call to assess quality against explicit criteria. A sketch using the OpenAI client (the model name and the pass/feedback JSON contract are assumptions):
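```python
import json
from openai import OpenAI

client = OpenAI()

def validate_with_llm(output: str, criteria: str) -> dict:
    """Ask a second model to judge the output; returns {"passed": bool, "feedback": str}."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model - use whatever you validate with
        response_format={"type": "json_object"},
        temperature=0,
        messages=[
            {"role": "system", "content": (
                "You are a strict reviewer. Respond only with JSON of the form "
                '{"passed": true or false, "feedback": "what to fix"}'
            )},
            {"role": "user", "content": f"Criteria:\n{criteria}\n\nOutput to review:\n{output}"},
        ],
    )
    return json.loads(response.choices[0].message.content)
```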
3. Rule-Based Validation
Check against specific business rules. The rules below - required report sections and banned placeholder text - are examples; swap in whatever your domain demands:
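```python
def validate_financial_report(report: str) -> bool:
    """Apply domain rules: required sections present, no placeholder or refusal text."""
    required_sections = ["Revenue", "Expenses", "Net Income"]  # assumed report structure
    if not all(section in report for section in required_sections):
        return False
    banned_phrases = ["[todo]", "as an ai language model", "i cannot"]
    return not any(phrase in report.lower() for phrase in banned_phrases)
```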
4. Confidence Scoring
Have the model rate its own confidence as part of its output, then gate on the score. The prompt wording, score format, and threshold below are illustrative:
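```python
PROMPT_WITH_CONFIDENCE = """
Complete the task below. On the final line, rate your confidence in the
result as 'CONFIDENCE: <0-100>'.

Task:
{task}
"""

def confidence_gate(output: str, threshold: int = 70) -> bool:
    """Parse the trailing confidence score and compare it to an (assumed) threshold."""
    for line in reversed(output.strip().splitlines()):
        if line.upper().startswith("CONFIDENCE:"):
            try:
                return int(line.split(":", 1)[1].strip()) >= threshold
            except ValueError:
                return False
    return False  # no score found - treat as a failure
```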
Error Handling Approaches
When validation fails, we have options:
| Strategy | When to Use | Implementation |
|---|---|---|
| Retry | Transient failures | Run the same step again |
| Re-prompt with feedback | Fixable errors | Include failure reason in new prompt |
| Fallback | Non-critical steps | Use default or skip |
| Critique & refine | Quality issues | Ask LLM to improve its output |
| Escalate | Critical failures | Alert human or halt workflow |
Re-prompt with Feedback Example
If a step fails validation, feed the failure reason back into the prompt and try again. A sketch, assuming a simple call_llm helper and a validator that returns a (passed, feedback) tuple:
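```python
from openai import OpenAI

client = OpenAI()

def call_llm(prompt: str) -> str:
    """Minimal completion helper (model name is an assumption)."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def run_with_retry(prompt: str, validator, max_attempts: int = 3) -> str:
    """Re-run a step, appending validator feedback to the prompt after each failure."""
    current_prompt = prompt
    for _ in range(max_attempts):
        output = call_llm(current_prompt)
        passed, feedback = validator(output)  # validator returns (bool, str) by assumption
        if passed:
            return output
        current_prompt = (
            f"{prompt}\n\nYour previous attempt failed validation:\n{feedback}\n"
            "Fix these issues and respond again."
        )
    raise RuntimeError(f"Step failed validation after {max_attempts} attempts")
```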
Context Management
A subtle but critical aspect of chaining is managing what context passes between steps. Too little context and agents lack necessary information; too much and you waste tokens and risk confusing the model.
Selective Context Passing
Pass only what the next agent needs - here the research step returns a small structured dict, and the writing step sees the distilled key points rather than the full research transcript. A sketch reusing the call_llm helper from the retry example (field names are assumptions):
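```python
def research_agent(topic: str) -> dict:
    """Research step: return only the fields the next agent actually needs."""
    raw_research = call_llm(f"Research the topic '{topic}' and write detailed notes.")
    key_points = call_llm(f"Condense these notes into 5 bullet points:\n{raw_research}")
    # The raw notes are deliberately NOT passed downstream - only the distilled context is
    return {"topic": topic, "key_points": key_points}

def writing_agent(research: dict) -> str:
    """Writing step: receives only the distilled research dict."""
    return call_llm(
        f"Write a short article about {research['topic']} "
        f"based on these key points:\n{research['key_points']}"
    )
```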
Contextual Reiteration
Sometimes repeating key constraints at each step prevents drift. Prepend the same shared block to every prompt in the chain - the constraints below are illustrative:
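```python
SHARED_CONTEXT = """
You are one step in a workflow that produces a technical blog post.
Constraints that apply to every step:
- Audience: experienced software engineers
- Tone: practical, no marketing language
- Target length for the final article: ~1500 words
"""

research_prompt = SHARED_CONTEXT + "\nStep 1 - Research the topic: {topic}"
draft_prompt = SHARED_CONTEXT + "\nStep 2 - Draft the article from these notes:\n{notes}"
review_prompt = SHARED_CONTEXT + "\nStep 3 - Review the draft against the constraints above:\n{draft}"
```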
Implementation: Article Generation Chain
Let’s implement a practical two-agent chain that researches a topic and writes an article. A condensed sketch - the model name, prompts, and validation threshold are assumptions, and error handling is trimmed for brevity:
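```python
from openai import OpenAI

client = OpenAI()

def run_step(system_prompt: str, user_prompt: str, temperature: float = 0.5) -> str:
    """One chain step: a focused system prompt, one input, one output."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        temperature=temperature,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content

def generate_article(topic: str) -> str:
    # Agent 1: research (factual, so lower temperature)
    research = run_step(
        "You are a researcher. Produce factual, well-organized notes with "
        "5-7 key points and relevant context.",
        f"Research this topic: {topic}",
        temperature=0.3,
    )
    # Quality gate before the handoff (the length threshold is an arbitrary example)
    if len(research.strip()) < 200:
        raise ValueError("Research output too thin to hand to the writer")
    # Agent 2: write (creative, so higher temperature)
    return run_step(
        "You are a technical writer. Turn research notes into a clear, engaging "
        "article with headings and a short conclusion.",
        f"Topic: {topic}\n\nResearch notes:\n{research}",
        temperature=0.7,
    )

if __name__ == "__main__":
    print(generate_article("prompt chaining for AI workflows"))
```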
Advanced Pattern: Multi-Source Convergence
Sometimes chains aren’t strictly linear - multiple paths converge into one:
```mermaid
flowchart TD
    I[Input] --> A1[Market Research]
    I --> A2[Technical Analysis]
    A1 --> C[Synthesizer]
    A2 --> C
    C --> O[Final Report]
    style A1 fill:#e3f2fd
    style A2 fill:#fff3e0
    style C fill:#e8f5e9
```
Implementation - a sketch that runs both research branches and feeds their outputs into a single synthesizer step, reusing the run_step helper from the article chain (prompts are illustrative):
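```python
def multi_source_analysis(topic: str) -> str:
    """Fan out to two independent analyses, then converge on a single synthesizer."""
    market = run_step(
        "You are a market analyst. Summarize demand, competitors, and trends.",
        f"Analyze the market for: {topic}",
        temperature=0.4,
    )
    technical = run_step(
        "You are a technical analyst. Summarize feasibility, risks, and architecture options.",
        f"Analyze the technical side of: {topic}",
        temperature=0.4,
    )
    # Convergence point: the synthesizer sees both branches
    return run_step(
        "You are a senior analyst. Merge the two analyses into one coherent report, "
        "flagging any points where they disagree.",
        f"Market analysis:\n{market}\n\nTechnical analysis:\n{technical}",
        temperature=0.5,
    )
```
Because the two branches are independent, they can also run concurrently - a preview of the parallelization patterns coming later in this series.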
Best Practices for Prompt Chains
From building various chains, I’ve developed these guidelines:
1. Use Structured Outputs
Have agents output consistent formats (JSON, specific headings) so downstream agents know what to expect. For example (the exact schema is up to you):
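```python
system_prompt = """
You are a research agent. Respond ONLY with JSON in exactly this format:
{
  "summary": "one-paragraph overview",
  "key_points": ["point 1", "point 2", "point 3"],
  "open_questions": ["anything the next step should investigate"]
}
Do not include any text outside the JSON object.
"""
```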
2. Make Handoffs Explicit
Clearly define what each agent receives and produces - type hints and docstrings make the handoff contract visible. A sketch, again reusing run_step:
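```python
def agent_a(input_data: str) -> dict:
    """Receives: raw topic text. Produces: {"summary": str, "key_points": list[str]}."""
    notes = run_step("You are a researcher. Respond with concise notes.", input_data)
    return {"summary": notes, "key_points": notes.splitlines()[:5]}

def agent_b(research: dict) -> str:
    """Receives: agent_a's dict. Produces: the final article text."""
    return run_step(
        "You are a technical writer.",
        f"Summary:\n{research['summary']}\n\nKey points:\n{research['key_points']}",
    )
```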
3. Log Everything
Capture inputs and outputs at each step for debugging. A simple wrapper sketch using the standard logging module:
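```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("prompt_chain")

def logged_chain_step(step_name: str, agent_fn, input_data):
    """Wrap any chain step to record its input, output, and duration."""
    logger.info("step=%s input=%s", step_name, str(input_data)[:500])
    start = time.perf_counter()
    output = agent_fn(input_data)
    logger.info("step=%s duration=%.2fs output=%s",
                step_name, time.perf_counter() - start, str(output)[:500])
    return output

# Usage: research = logged_chain_step("research", research_agent, "prompt chaining")
```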
4. Set Appropriate Temperatures
- Factual/analytical steps: Lower temperature (0.3-0.5)
- Creative steps: Higher temperature (0.7-0.9)
- Deterministic extraction: Temperature 0
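One lightweight way to keep these choices explicit is a per-step temperature map (the step names and values here are illustrative):
```python
STEP_TEMPERATURES = {
    "extract_entities": 0.0,  # deterministic extraction
    "research": 0.3,          # factual/analytical
    "draft_article": 0.8,     # creative
    "review": 0.4,
}
```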
Key Takeaways
- Chain for complexity: Break multi-step tasks into focused, sequential agents
- Validate between steps: Catch errors before they propagate
- Manage context carefully: Pass what’s needed, not everything
- Handle failures gracefully: Retry with feedback when possible
- Structure outputs consistently: Makes downstream processing reliable
Prompt chaining handles sequential dependencies well, but what about tasks that don’t have a clear linear path? In the next post, I’ll explore routing patterns - dynamically directing tasks to specialized agents based on their content.
This is Part 5 of my series on building intelligent AI systems. Next: routing patterns and parallelization.