Prompt Chaining Workflows - Sequential Task Decomposition

When a single prompt isn’t enough, we chain them together. Prompt chaining is one of the most practical patterns for building AI workflows - breaking complex tasks into focused steps where each agent’s output feeds into the next. In this post, I’ll explore how to design, validate, and implement effective prompt chains.

The Assembly Line Analogy

Think of prompt chaining like a manufacturing assembly line. Instead of one worker trying to build an entire car, specialized stations handle specific tasks in sequence - welding, painting, assembly, inspection. Each station does one thing well, and the product flows from one to the next.

flowchart LR
    I[Input] --> A1[Agent 1<br>Research]
    A1 --> A2[Agent 2<br>Analyze]
    A2 --> A3[Agent 3<br>Draft]
    A3 --> A4[Agent 4<br>Review]
    A4 --> O[Output]

    style A1 fill:#e3f2fd
    style A2 fill:#fff3e0
    style A3 fill:#e8f5e9
    style A4 fill:#fce4ec

This approach offers several advantages:

  • Specialization: Each agent focuses on one task and can be optimized for it
  • Debuggability: When something fails, you know exactly which step caused it
  • Maintainability: Individual prompts can be improved without rewriting everything
  • Quality: Focused tasks produce more reliable outputs

The Challenge: Error Propagation

Here’s the catch with sequential chains - errors compound. If Agent 1 produces flawed output, that flaw propagates through Agents 2, 3, and 4, potentially getting worse at each step.

flowchart LR
    A1[Agent 1] -->|Error| A2[Agent 2]
    A2 -->|Error + More Error| A3[Agent 3]
    A3 -->|Compounded Errors| O[Bad Output]

    style A1 fill:#ffcdd2
    style A2 fill:#ef9a9a
    style A3 fill:#e57373
    style O fill:#c62828,color:#fff

This means we need validation between steps - quality gates that catch problems before they cascade.
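One way to make those gates concrete is a small chain runner that validates after each step before handing off. A minimal sketch, assuming each step pairs an agent function with a validator that returns True/False:

def run_chain(steps: list, initial_input: str) -> str:
    """Run (agent, validator) pairs in sequence, stopping at the first bad handoff."""
    data = initial_input
    for agent, validator in steps:
        data = agent(data)
        if not validator(data):
            raise ValueError(f"Quality gate failed after {agent.__name__}")
    return data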

Validation Strategies

I use four main approaches to validate outputs between chain steps:

1. Programmatic Checks

Simple code-based validation for structural requirements:

import json

def validate_json_output(output: str) -> bool:
    try:
        data = json.loads(output)
        required_fields = ["title", "summary", "key_points"]
        return all(field in data for field in required_fields)
    except json.JSONDecodeError:
        return False

2. LLM-Based Validation

Use another LLM call to assess quality:

def validate_with_llm(output: str, criteria: str) -> dict:
    prompt = f"""
Evaluate this output against the criteria.

Output: {output}
Criteria: {criteria}

Respond with JSON: {{"valid": true/false, "issues": [...]}}
"""
    # Parse the model's JSON reply so callers get a dict, not a raw string
    return json.loads(call_llm(prompt))

3. Rule-Based Validation

Check against specific business rules:

def validate_financial_report(report: str) -> bool:
    # No speculative language
    speculative_terms = ["might", "could potentially", "possibly"]
    if any(term in report.lower() for term in speculative_terms):
        return False

    # Must include required sections
    required_sections = ["Executive Summary", "Risk Assessment"]
    return all(section in report for section in required_sections)

4. Confidence Scoring

Have the model rate its own confidence:

PROMPT_WITH_CONFIDENCE = """
Complete this task: {task}

At the end, rate your confidence (1-10) and explain any uncertainties.
Format:
RESPONSE: [your response]
CONFIDENCE: [1-10]
UNCERTAINTIES: [list any concerns]
"""

Error Handling Approaches

When validation fails, we have options:

| Strategy | When to Use | Implementation |
|---|---|---|
| Retry | Transient failures | Run the same step again |
| Re-prompt with feedback | Fixable errors | Include failure reason in new prompt |
| Fallback | Non-critical steps | Use default or skip |
| Critique & refine | Quality issues | Ask LLM to improve its output |
| Escalate | Critical failures | Alert human or halt workflow |

Re-prompt with Feedback Example

def run_with_retry(prompt: str, validator, max_attempts: int = 3) -> str:
    feedback = None

    for attempt in range(max_attempts):
        if feedback:
            full_prompt = f"{prompt}\n\nPrevious attempt failed:\n{feedback}\nPlease fix these issues."
        else:
            full_prompt = prompt

        output = call_llm(full_prompt)
        validation_result = validator(output)

        if validation_result["valid"]:
            return output

        feedback = validation_result["issues"]

    raise ValueError(f"Failed after {max_attempts} attempts")
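The fallback strategy from the table composes naturally with this retry loop. A short sketch, assuming the calling step can live with a default value instead of halting the chain:

def run_with_fallback(prompt: str, validator, default: str) -> str:
    """Try the validated retry loop; degrade to a safe default for non-critical steps."""
    try:
        return run_with_retry(prompt, validator)
    except ValueError:
        return default  # skip or substitute rather than fail the whole chain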

Context Management

A subtle but critical aspect of chaining is managing what context passes between steps. Too little context and agents lack necessary information; too much and you waste tokens and risk confusing the model.

Selective Context Passing

Pass only what the next agent needs:

def research_agent(topic: str) -> dict:
    result = call_llm(f"Research: {topic}")
    # Return structured data, not raw text
    return {
        "key_facts": extract_facts(result),
        "sources": extract_sources(result),
        "summary": summarize(result)
    }

def analysis_agent(research: dict) -> str:
    # Only pass the summary and key facts, not the raw research
    prompt = f"""
Analyze these research findings:

Key Facts: {research['key_facts']}
Summary: {research['summary']}

Provide strategic recommendations.
"""
    return call_llm(prompt)

Contextual Reiteration

Sometimes repeating key constraints at each step prevents drift:

SHARED_CONTEXT = """
Target audience: Technical professionals
Tone: Professional but accessible
Length constraint: Maximum 500 words per section
"""

def agent_step(task: str, previous_output: str) -> str:
    prompt = f"""
{SHARED_CONTEXT}

Previous step output:
{previous_output}

Your task:
{task}
"""
    return call_llm(prompt)

Implementation: Article Generation Chain

Let’s implement a practical two-agent chain that researches a topic and writes an article:

from openai import OpenAI

client = OpenAI()

def call_llm(system_prompt: str, user_prompt: str, temperature: float = 0.7) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        temperature=temperature
    )
    return response.choices[0].message.content


def research_agent(topic: str) -> str:
    """Agent 1: Research and gather information"""

    system_prompt = """
You are a research specialist. Your job is to gather comprehensive
information about topics and present findings in a structured format.

Always include:
- Key concepts and definitions
- Current trends and developments
- Notable examples or case studies
- Potential challenges or controversies
"""

    user_prompt = f"""
Research the following topic thoroughly:

Topic: {topic}

Provide your findings with clear headings for each section.
"""

    return call_llm(system_prompt, user_prompt, temperature=0.7)


def writer_agent(research: str, topic: str) -> str:
    """Agent 2: Transform research into a polished article"""

    system_prompt = """
You are a skilled technical writer. Transform research findings into
engaging, well-structured articles.

Your writing should be:
- Clear and accessible to technical professionals
- Well-organized with logical flow
- Supported by the research provided
- Engaging without being sensationalist
"""

    user_prompt = f"""
Write a comprehensive article about "{topic}" using this research:

{research}

Structure the article with:
- Compelling introduction
- Clear sections with headers
- Practical insights
- Strong conclusion
"""

    return call_llm(system_prompt, user_prompt, temperature=0.7)


def generate_article(topic: str) -> str:
    """Orchestrate the two-agent chain"""

    print(f"Step 1: Researching '{topic}'...")
    research = research_agent(topic)

    print("Step 2: Writing article...")
    article = writer_agent(research, topic)

    return article


# Usage
article = generate_article("The impact of AI agents on software development")
print(article)
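As written, this chain runs open-loop: the writer trusts whatever the researcher produces. Here's a hedged sketch of how the earlier validation ideas could slot in between the two agents; validate_research is a hypothetical programmatic check I'm adding for illustration, not part of the original chain:

def validate_research(research: str) -> dict:
    """Hypothetical quality gate: require the headings the research prompt asked for."""
    required = ["key concepts", "current trends"]
    missing = [h for h in required if h not in research.lower()]
    return {"valid": not missing, "issues": missing}

def generate_article_validated(topic: str) -> str:
    research = research_agent(topic)
    gate = validate_research(research)
    if not gate["valid"]:
        # Re-prompt once with feedback before handing off (see run_with_retry above)
        research = research_agent(f"{topic}\n\nPrevious attempt was missing: {gate['issues']}")
    return writer_agent(research, topic)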

Advanced Pattern: Multi-Source Convergence

Sometimes chains aren’t strictly linear - multiple paths converge into one:

flowchart TD
    I[Input] --> A1[Market Research]
    I --> A2[Technical Analysis]
    A1 --> C[Synthesizer]
    A2 --> C
    C --> O[Final Report]

    style A1 fill:#e3f2fd
    style A2 fill:#fff3e0
    style C fill:#e8f5e9

Implementation:

def multi_source_analysis(topic: str) -> str:
    # Parallel research (could use threading for actual parallelism)
    market_research = market_analyst(topic)
    technical_analysis = tech_analyst(topic)

    # Convergence step
    synthesis_prompt = f"""
Synthesize these two analyses into a comprehensive report:

Market Analysis:
{market_research}

Technical Analysis:
{technical_analysis}

Create a unified report that integrates both perspectives.
"""

    # Single-prompt call_llm variant, as in the validation examples above
    return call_llm(synthesis_prompt)
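The comment above hints at real parallelism. A minimal sketch with concurrent.futures, assuming market_analyst and tech_analyst are independent, thread-safe LLM calls:

from concurrent.futures import ThreadPoolExecutor

def parallel_research(topic: str) -> tuple[str, str]:
    """Run both analysts concurrently; each is an independent LLM call."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        market_future = pool.submit(market_analyst, topic)
        tech_future = pool.submit(tech_analyst, topic)
        return market_future.result(), tech_future.result()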

Best Practices for Prompt Chains

From building various chains, I’ve developed these guidelines:

1. Use Structured Outputs
Have agents output consistent formats (JSON, specific headings) so downstream agents know what to expect.

system_prompt = """
Always structure your output as:
## Analysis
[your analysis]

## Key Points
- Point 1
- Point 2

## Recommendations
[your recommendations]
"""

2. Make Handoffs Explicit
Clearly define what each agent receives and produces:

def agent_a(input_data: str) -> dict:
    """
    Input: Raw customer feedback text
    Output: {
        "sentiment": "positive/negative/neutral",
        "topics": ["topic1", "topic2"],
        "summary": "brief summary"
    }
    """
    # Implementation

3. Log Everything
Capture inputs and outputs at each step for debugging:

def logged_chain_step(step_name: str, agent_fn, input_data):
    print(f"[{step_name}] Input: {input_data[:100]}...")
    output = agent_fn(input_data)
    print(f"[{step_name}] Output: {output[:100]}...")
    return output

4. Set Appropriate Temperatures
Match each step's temperature to the kind of work it does (see the sketch after this list):

  • Factual/analytical steps: Lower temperature (0.3-0.5)
  • Creative steps: Higher temperature (0.7-0.9)
  • Deterministic extraction: Temperature 0
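As a concrete sketch, these defaults could live in one place so every step pulls a consistent setting; the step names and exact values here are illustrative assumptions, using the call_llm signature from the article-generation example:

# Illustrative defaults - tune per task
STEP_TEMPERATURES = {
    "extract": 0.0,   # deterministic extraction
    "research": 0.4,  # factual/analytical
    "write": 0.8,     # creative drafting
}

def run_step(step: str, system_prompt: str, user_prompt: str) -> str:
    return call_llm(system_prompt, user_prompt,
                    temperature=STEP_TEMPERATURES.get(step, 0.7))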

Key Takeaways

  1. Chain for complexity: Break multi-step tasks into focused, sequential agents
  2. Validate between steps: Catch errors before they propagate
  3. Manage context carefully: Pass what’s needed, not everything
  4. Handle failures gracefully: Retry with feedback when possible
  5. Structure outputs consistently: Makes downstream processing reliable

Prompt chaining handles sequential dependencies well, but what about tasks that don’t have a clear linear path? In the next post, I’ll explore routing patterns - dynamically directing tasks to specialized agents based on their content.


This is Part 5 of my series on building intelligent AI systems. Next: routing patterns and parallelization.
