Some tasks require iteration - generating, evaluating, and refining until quality standards are met. Others need dynamic orchestration - a central coordinator breaking down novel problems and delegating to specialists. In this post, I’ll cover two sophisticated patterns that enable these capabilities: the evaluator-optimizer loop and the orchestrator-worker architecture.
The Evaluator-Optimizer Pattern
This pattern creates an iterative refinement loop where two agents collaborate: one generates content, the other critiques it, and the feedback drives improvement.
flowchart TD
I[Input] --> O[Optimizer Generate]
O --> E{Evaluator Assess}
E -->|Passes| Done[Output]
E -->|Fails| F[Feedback]
F --> O
style O fill:#e3f2fd
style E fill:#fff3e0
style Done fill:#c8e6c9
Think of it like a writer-editor relationship: the writer drafts, the editor critiques, the writer revises based on feedback, and this continues until publication standards are met.
Key Components
Optimizer (Generator) Agent: Creates initial output and refines based on feedback. Uses moderate-to-high temperature for creativity.
Evaluator (Critic) Agent: Assesses output against criteria and provides actionable feedback. Uses low temperature for consistent evaluation.
Three Critical Elements
1. Clear Evaluation Criteria
Vague criteria produce vague feedback. Be specific:
# Bad: Vague criteria
criteria = "Make sure it's good and professional"

# Good: Specific, measurable criteria
criteria = """
Evaluate the report against these criteria:
1. Contains executive summary (max 200 words)
2. All claims supported by data
3. No speculative language ("might", "could potentially")
4. Includes risk assessment section
5. Professional tone throughout
"""
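Criteria this specific can sometimes be checked in plain code before an LLM evaluator is involved at all. Here is a minimal sketch of that idea for the word limit and the speculative-language rule. The helper names (`check_word_limit`, `find_speculative_phrases`) and the phrase list are illustrative assumptions, not part of any library.

```python
import re
from typing import List

# Illustrative phrase list (an assumption); extend it for your own style guide.
SPECULATIVE_PHRASES = ["might", "could potentially"]


def check_word_limit(text: str, max_words: int = 200) -> bool:
    """Return True when the text has at most max_words whitespace-separated words."""
    return len(text.split()) <= max_words


def find_speculative_phrases(text: str) -> List[str]:
    """Return the listed phrases that occur in the text as whole words (case-insensitive)."""
    found = []
    for phrase in SPECULATIVE_PHRASES:
        pattern = rf"\b{re.escape(phrase)}\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(phrase)
    return found
```

Checks like these are cheap and deterministic, so they can run first and leave the judgment-based criteria (tone, whether claims are supported) to the evaluator agent.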
2. Actionable Feedback
Feedback should tell the optimizer what to fix and how:
# Bad: Non-actionable
feedback = "The writing needs improvement"

# Good: Actionable
feedback = """
Issues found:
1. Executive summary is 347 words (exceeds 200 limit) - condense key points
2. Paragraph 3 claims "significant growth" without supporting data - add specific metrics
3. Line 45 uses "might increase" - rephrase with certainty or remove
"""
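One way to keep feedback actionable is to force every issue to carry a location, a problem, and a fix before it is turned into text. The sketch below assumes a hypothetical `Issue` dataclass and `format_feedback` helper. Neither comes from a library, and the layout simply mirrors the "Issues found" style shown above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Issue:
    location: str  # where the problem is, e.g. "Paragraph 3"
    problem: str   # what is wrong
    fix: str       # what the optimizer should do about it


def format_feedback(issues: List[Issue]) -> str:
    """Render issues as a numbered list, or a pass message when there are none."""
    if not issues:
        return "All criteria met"
    lines = ["Issues found:"]
    for number, issue in enumerate(issues, start=1):
        lines.append(f"{number}. {issue.location}: {issue.problem} - {issue.fix}")
    return "\n".join(lines)
```

Because each field is required, an issue without a suggested fix cannot be constructed, which rules out feedback like "the writing needs improvement".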
3. Stopping Conditions
Without clear stopping conditions, the loop can run indefinitely. Stop when the evaluator passes the content or when a maximum number of iterations is reached:
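from typing import Optional, Tuple

# call_llm(system_prompt, user_prompt, temperature=...) is the LLM helper used
# throughout this post; it is assumed to be defined elsewhere.


def optimizer_agent(task: str, feedback: Optional[str] = None) -> str:
    """Generate or refine content based on feedback"""
    system_prompt = """
    You are a skilled content creator. Generate high-quality content
    that meets professional standards. If feedback is provided,
    carefully address each point in your revision.
    """

    if feedback:
        user_prompt = f"""
        Task: {task}

        Previous feedback to address:
        {feedback}

        Create an improved version addressing all feedback points.
        """
    else:
        user_prompt = f"Task: {task}"

    # Moderate-to-high temperature for creativity (0.7 is an assumed value)
    return call_llm(system_prompt, user_prompt, temperature=0.7)


def evaluator_agent(content: str, criteria: str) -> Tuple[bool, str]:
    """Evaluate content and provide feedback"""
    system_prompt = """
    You are a strict quality evaluator. Assess content against the
    provided criteria. Be thorough and specific in your feedback.

    Respond in this format:
    PASSED: true/false
    FEEDBACK: [specific issues and how to fix them, or "All criteria met"]
    """

    user_prompt = f"""
    Criteria:
    {criteria}

    Content to evaluate:
    {content}
    """

    # Low temperature for consistent evaluation
    response = call_llm(system_prompt, user_prompt, temperature=0)

    # Parse the PASSED / FEEDBACK format requested in the system prompt
    passed = "passed: true" in response.lower()
    feedback = response.split("FEEDBACK:", 1)[-1].strip()
    return passed, feedback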
def evaluator_optimizer_loop(
    task: str,
    criteria: str,
    max_iterations: int = 5
) -> str:
    """Run the evaluator-optimizer loop until success or max iterations"""
    content = None
    feedback = None

    for iteration in range(max_iterations):
        print(f"Iteration {iteration + 1}/{max_iterations}")

        # Generate or refine
        content = optimizer_agent(task, feedback)

        # Evaluate the draft and collect feedback for the next round
        passed, feedback = evaluator_agent(content, criteria)

        if passed:
            print(f"Success on iteration {iteration + 1}")
            return content

        print(f"Feedback: {feedback[:100]}...")

    print("Max iterations reached, returning best effort")
    return content
# Usage
task = "Write a professional email announcing a product delay to customers"
criteria = """
1. Acknowledges the delay with specific new timeline
2. Apologizes sincerely without making excuses
3. Explains what steps are being taken
4. Offers compensation or goodwill gesture
5. Under 200 words
6. Professional but empathetic tone
"""

result = evaluator_optimizer_loop(task, criteria)
print(result)
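The stopping behavior is easier to reason about when the loop is exercised without a model. The sketch below is a simplified variant, not the implementation above: `refine_until` is a hypothetical function that takes the generator and evaluator as plain callables and reports why it stopped. It also adds a third stop condition, my own assumption, that ends the loop when the generator returns the same draft twice in a row.

```python
from typing import Callable, Optional, Tuple


def refine_until(
    generate: Callable[[Optional[str]], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    max_iterations: int = 5,
) -> Tuple[str, str]:
    """Return (content, reason); reason is 'passed', 'stalled' or 'max_iterations'."""
    content = ""
    feedback: Optional[str] = None
    previous: Optional[str] = None

    for _ in range(max_iterations):
        content = generate(feedback)
        if content == previous:
            # The generator is no longer changing its output; more rounds will not help.
            return content, "stalled"
        passed, feedback = evaluate(content)
        if passed:
            return content, "passed"
        previous = content

    return content, "max_iterations"


# Stub agents standing in for LLM calls (hypothetical, for illustration only)
def stub_generate(feedback: Optional[str]) -> str:
    return "short draft" if feedback else "a very long first draft"


def stub_evaluate(content: str) -> Tuple[bool, str]:
    ok = len(content.split()) <= 2
    return ok, ("All criteria met" if ok else "Too long - cut to two words")


content, reason = refine_until(stub_generate, stub_evaluate)
```

With these stubs the first draft fails the two-word limit and the revision passes, so `reason` is `"passed"`. Returning the reason alongside the content lets the caller distinguish a validated result from a best-effort one.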
The Orchestrator-Worker Pattern
While evaluator-optimizer handles iterative refinement, orchestrator-worker handles dynamic task decomposition. A central orchestrator analyzes complex problems, creates plans, delegates to specialists, and synthesizes results.
flowchart TD
I[Complex Task] --> O[Orchestrator]
O -->|Plan| P[Create Subtasks]
P --> W1[Worker 1]
P --> W2[Worker 2]
P --> W3[Worker 3]
W1 --> S[Synthesize]
W2 --> S
W3 --> S
S --> O
O --> R[Final Result]
style O fill:#fff3e0
style W1 fill:#e3f2fd
style W2 fill:#e8f5e9
style W3 fill:#fce4ec
Think of it like a project manager who receives a complex brief, breaks it into tasks, assigns specialists, collects their work, and assembles the final deliverable.
import json

# Worker Agents
def research_worker(subtask: str) -> str:
    system_prompt = "You are a research specialist. Gather and summarize relevant information."
    return call_llm(system_prompt, subtask)


def analysis_worker(subtask: str) -> str:
    system_prompt = "You are a data analyst. Analyze information and identify patterns."
    return call_llm(system_prompt, subtask)


def writing_worker(subtask: str) -> str:
    system_prompt = "You are a professional writer. Create clear, engaging content."
    return call_llm(system_prompt, subtask)


# Registry the orchestrator uses to look up a worker by name
WORKERS = {
    "research": research_worker,
    "analysis": analysis_worker,
    "writing": writing_worker,
}
def orchestrator_agent(task: str) -> str:
    """Central orchestrator that plans, delegates, and synthesizes"""

    # Step 1: Create plan
    planning_prompt = f"""
    Analyze this task and create a plan with subtasks.

    Task: {task}

    Available workers: research, analysis, writing

    Respond with JSON:
    {{
        "subtasks": [
            {{"worker": "research", "task": "specific subtask description"}},
            {{"worker": "analysis", "task": "specific subtask description"}}
        ]
    }}
    """

    plan_response = call_llm(
        "You are a project planner. Create efficient task breakdowns.",
        planning_prompt,
        temperature=0
    )

    # Parse plan
    try:
        # Extract JSON from response
        json_start = plan_response.find('{')
        json_end = plan_response.rfind('}') + 1
        plan = json.loads(plan_response[json_start:json_end])
    except json.JSONDecodeError:
        # Fallback to simple research task
        plan = {"subtasks": [{"worker": "research", "task": task}]}

    # Step 2: Execute subtasks
    results = []
    for subtask in plan["subtasks"]:
        worker_name = subtask["worker"]
        worker_task = subtask["task"]

        print(f"Delegating to {worker_name}: {worker_task[:50]}...")

        # Unknown worker names fall back to the research worker
        worker = WORKERS.get(worker_name, research_worker)
        result = worker(worker_task)
        results.append({
            "worker": worker_name,
            "task": worker_task,
            "result": result
        })

    # Step 3: Synthesize results
    synthesis_prompt = f"""
    Original task: {task}

    Worker results:
    {json.dumps(results, indent=2)}

    Synthesize these results into a comprehensive, coherent response
    that fully addresses the original task.
    """

    final_result = call_llm(
        "You are an expert synthesizer. Combine diverse inputs into coherent outputs.",
        synthesis_prompt,
        temperature=0.5
    )

    return final_result
# Usage
complex_task = """
Analyze the impact of remote work on software development teams.
Include research on productivity studies, analysis of collaboration patterns,
and recommendations for team leads.
"""

result = orchestrator_agent(complex_task)
print(result)
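The orchestrator's fallback covers responses that are not valid JSON, but a response that parses and still lacks a `subtasks` list or a `task` field would raise a `KeyError` during execution. A stricter parser is one way to close that gap. In this sketch, `parse_plan` and `KNOWN_WORKERS` are hypothetical names, and silently dropping malformed subtasks is a design assumption rather than the only reasonable choice.

```python
import json
from typing import Dict, List

# Worker names the orchestrator advertises in its planning prompt
KNOWN_WORKERS = {"research", "analysis", "writing"}


def parse_plan(plan_response: str, task: str) -> List[Dict[str, str]]:
    """Extract a validated subtask list, falling back to a single research subtask."""
    fallback = [{"worker": "research", "task": task}]

    # Take the outermost {...} span, since models often wrap JSON in prose
    start = plan_response.find("{")
    end = plan_response.rfind("}") + 1
    try:
        plan = json.loads(plan_response[start:end])
    except json.JSONDecodeError:
        return fallback

    subtasks = plan.get("subtasks") if isinstance(plan, dict) else None
    if not isinstance(subtasks, list):
        return fallback

    # Keep only subtasks with a known worker and a non-empty task string
    valid = [
        {"worker": s["worker"], "task": s["task"]}
        for s in subtasks
        if isinstance(s, dict)
        and isinstance(s.get("worker"), str)
        and s["worker"] in KNOWN_WORKERS
        and isinstance(s.get("task"), str)
        and s["task"].strip()
    ]
    return valid or fallback
```

Dropping an invalid subtask keeps the run going at the cost of possibly losing part of the plan. Re-prompting the planner with the validation error is the stricter alternative.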
Dynamic Planning in Action
For the task “Analyze remote work impact on dev teams”:
Orchestrator creates plan:
1. Research worker → "Find recent studies on remote work productivity"
2. Research worker → "Gather data on collaboration tool usage"
3. Analysis worker → "Compare productivity metrics pre/post remote"
4. Writing worker → "Draft recommendations for team leads"

Workers execute independently...

Orchestrator synthesizes into final report
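When subtasks really are independent, they do not have to run one after another. The sketch below uses a thread pool to run them concurrently, which suits I/O-bound LLM calls. `run_subtasks` is a hypothetical helper, and the approach assumes no subtask needs another subtask's output. If the analysis step depends on the research results, it has to wait for them.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_subtasks(
    subtasks: List[Dict[str, str]],
    workers: Dict[str, Callable[[str], str]],
    max_workers: int = 4,
) -> List[Dict[str, str]]:
    """Run independent subtasks concurrently and return results in plan order."""

    def run_one(subtask: Dict[str, str]) -> Dict[str, str]:
        worker = workers[subtask["worker"]]
        return {
            "worker": subtask["worker"],
            "task": subtask["task"],
            "result": worker(subtask["task"]),
        }

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which call finishes first.
        return list(pool.map(run_one, subtasks))


# Stub workers standing in for LLM-backed workers (illustration only)
stub_workers = {
    "research": lambda t: f"research notes for: {t}",
    "analysis": lambda t: f"analysis of: {t}",
}

results = run_subtasks(
    [
        {"worker": "research", "task": "remote work productivity"},
        {"worker": "analysis", "task": "pre/post remote metrics"},
    ],
    stub_workers,
)
```

Because the results come back in plan order, the synthesis step can consume them exactly as it does the output of the sequential loop.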
Real-World Use Cases
| Domain | Orchestrator Role | Workers |
| --- | --- | --- |
| Market Analysis | Break down research request | News, competitors, trends |
| Medical Diagnosis | Coordinate test interpretation | Hematology, cardiology, radiology |
| Legal Review | Manage contract analysis | Compliance, liability, terms |
| Content Creation | Plan article structure | Research, writing, editing |
Combining Patterns
These patterns can be composed for sophisticated workflows:
flowchart TD
T[Complex Task] --> O[Orchestrator]
O --> W1[Worker 1]
O --> W2[Worker 2]
W1 --> E1{Evaluate}
E1 -->|Fail| W1
E1 -->|Pass| S[Synthesize]
W2 --> E2{Evaluate}
E2 -->|Fail| W2
E2 -->|Pass| S
S --> O
O --> R[Result]
style O fill:#fff3e0
style E1 fill:#fce4ec
style E2 fill:#fce4ec
The orchestrator delegates to workers, each worker’s output goes through an evaluator-optimizer loop, and the orchestrator synthesizes validated results.
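def orchestrator_with_validation(task: str) -> str:
    # create_plan and get_criteria_for_worker are assumed helpers;
    # synthesize_results is a hypothetical name for the orchestrator's synthesis step.

    # Get plan from orchestrator
    plan = create_plan(task)

    validated_results = []
    for subtask in plan["subtasks"]:
        # Each worker output goes through evaluation loop
        result = evaluator_optimizer_loop(
            task=subtask["task"],
            criteria=get_criteria_for_worker(subtask["worker"]),
            max_iterations=3
        )
        validated_results.append(result)

    # Synthesize the validated results into the final response
    return synthesize_results(task, validated_results)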
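The combined workflow relies on `get_criteria_for_worker`, which this post does not define. One plausible shape is a lookup table keyed by worker name, sketched below. The criteria strings and the `WORKER_CRITERIA` and `DEFAULT_CRITERIA` names are illustrative assumptions. Real criteria should be as specific as the examples in the evaluation-criteria section.

```python
from typing import Dict

# Illustrative criteria per worker type (assumed, not from the original post)
WORKER_CRITERIA: Dict[str, str] = {
    "research": (
        "1. Every finding names its source\n"
        "2. Covers at least three distinct sources\n"
        "3. No unsupported claims"
    ),
    "analysis": (
        "1. States the metric being compared\n"
        "2. Conclusions follow from the data presented\n"
        "3. Notes limitations of the data"
    ),
    "writing": (
        "1. Clear structure with a summary first\n"
        "2. Professional tone throughout\n"
        "3. Under 500 words"
    ),
}

# Used when the plan names a worker that has no dedicated criteria
DEFAULT_CRITERIA = "1. Fully addresses the subtask\n2. No unsupported claims"


def get_criteria_for_worker(worker_name: str) -> str:
    """Return the evaluation criteria for a worker, or a generic default."""
    return WORKER_CRITERIA.get(worker_name, DEFAULT_CRITERIA)
```

Keeping criteria in a table next to the worker registry makes it obvious when a new worker has been added without a matching quality bar.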
Key Takeaways
Orchestrator-Worker
Dynamic decomposition: Break tasks at runtime, not design time
Specialized workers: Each expert in their domain
Central synthesis: Orchestrator combines results coherently
Handles novelty: Adapts to unfamiliar requests
Composition
Patterns can be combined for complex workflows
Orchestrator can use evaluator-optimizer for quality control
Workers can be simple agents or full sub-workflows
These workflow patterns - chaining, routing, parallelization, evaluation loops, and orchestration - form the foundation for building sophisticated AI systems. In the next part of this series, I’ll move from workflow patterns to building full-fledged agents with tools, state, and memory.
This wraps up the agentic workflow patterns. Next: extending agents with tools and function calling.