Traditional RAG (Retrieval-Augmented Generation) follows a fixed pattern: query in, documents out, response generated. But what if the agent could decide when and how to retrieve? Agentic RAG gives agents control over their own knowledge acquisition. In this post, I’ll explore this dynamic approach to retrieval, then tackle the equally important question: how do we know if our agents actually work?
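To make the idea concrete, here is a minimal sketch of that decision step, assuming you supply your own model client and retriever as the `llm` and `retrieve` callables (both are placeholders, not a specific library's API):

```python
from typing import Callable

def agentic_rag(question: str,
                llm: Callable[[str], str],
                retrieve: Callable[[str], list[str]]) -> str:
    """The agent decides whether and what to retrieve before answering."""
    # Step 1: let the model decide whether it needs external documents.
    decision = llm(
        "Answer YES or NO: do you need to look up documents to answer "
        f"this question?\n\n{question}"
    ).strip().upper()

    context = ""
    if decision.startswith("YES"):
        # Step 2: the model also chooses what to search for.
        search_query = llm(f"Write a short search query for: {question}")
        context = "\n\n".join(retrieve(search_query))

    # Step 3: answer, with or without retrieved context.
    return llm(f"Context:\n{context}\n\nQuestion: {question}")
```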
Connecting Agents to the World - External APIs and Data
An agent that can only process text is fundamentally limited. Real usefulness comes from connecting to external systems - fetching live data, querying databases, calling APIs, and triggering actions in the real world. In this post, I’ll explore how to build these connections, turning isolated language models into integrated systems that can actually get things done.
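As a rough illustration of what such a connection looks like, here is a sketch of wrapping an external HTTP API as a tool the agent can call. The endpoint URL and response fields are made up for the example; only standard-library calls are used:

```python
import json
import urllib.parse
import urllib.request

# A tool is just a plain function with a clear signature and docstring
# that the agent can be told about. The endpoint below is hypothetical;
# swap in an API you actually have access to.
WEATHER_API = "https://api.example.com/v1/weather"  # hypothetical endpoint

def get_current_weather(city: str) -> dict:
    """Fetch current weather for a city and return a small, clean dict."""
    url = f"{WEATHER_API}?{urllib.parse.urlencode({'city': city})}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    # Return only the fields the agent needs, so the model
    # isn't flooded with a huge raw response.
    return {"city": city,
            "temperature_c": payload.get("temp_c"),
            "conditions": payload.get("conditions")}
```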
Agent State and Memory - Beyond Single Interactions
A stateless agent treats every interaction as its first - no memory of previous conversations, no awareness of ongoing tasks, no accumulated context. While this works for simple Q&A, real-world applications demand more. In this post, I’ll explore how to give agents memory through state management, enabling them to maintain context across interactions and handle complex multi-step workflows.
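A bare-bones version of that state management might look like the sketch below, with `llm` standing in for whatever model client you use (an assumption, not a specific SDK):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ConversationState:
    """Accumulated context the agent carries between turns."""
    history: list[dict] = field(default_factory=list)    # prior messages
    facts: dict[str, str] = field(default_factory=dict)  # long-lived notes

    def add_turn(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})

def chat_turn(state: ConversationState, user_msg: str,
              llm: Callable[[str], str]) -> str:
    """One stateful turn: prior history is folded into the prompt."""
    state.add_turn("user", user_msg)
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in state.history)
    reply = llm(f"Known facts: {state.facts}\n\n"
                f"Conversation so far:\n{transcript}\n\nassistant:")
    state.add_turn("assistant", reply)
    return reply
```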
Extending Agents with Tools and Structured Outputs
Language models are impressive reasoners, but without tools they can only generate text. They can’t check real-time data, perform precise calculations, or interact with external systems. In this post, I’ll explore how to extend agents with tools through function calling, and ensure reliable outputs using Pydantic for structured data validation.
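For a taste of the structured-output side, here is a minimal sketch assuming Pydantic v2: the schema is shown to the model, the reply is validated, and validation errors are fed back for a retry. The `llm` callable and the `WeatherReport` schema are illustrative placeholders:

```python
from typing import Callable
from pydantic import BaseModel, Field, ValidationError

class WeatherReport(BaseModel):
    """Schema the model's JSON output must satisfy."""
    city: str
    temperature_c: float
    conditions: str = Field(description="e.g. 'clear', 'rain'")

def structured_call(prompt: str, llm: Callable[[str], str],
                    retries: int = 2) -> WeatherReport:
    """Ask for JSON matching the schema; re-prompt on validation failure."""
    schema = WeatherReport.model_json_schema()
    full_prompt = f"{prompt}\n\nRespond with JSON matching this schema:\n{schema}"
    for _ in range(retries + 1):
        raw = llm(full_prompt)
        try:
            return WeatherReport.model_validate_json(raw)
        except ValidationError as err:
            # Feed the validation errors back so the model can correct itself.
            full_prompt = f"{full_prompt}\n\nYour last reply was invalid:\n{err}\nTry again."
    raise ValueError("model never produced valid structured output")
```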
Evaluator-Optimizer and Orchestrator-Worker Patterns
Some tasks require iteration - generating, evaluating, and refining until quality standards are met. Others need dynamic orchestration - a central coordinator breaking down novel problems and delegating to specialists. In this post, I’ll cover two sophisticated patterns that enable these capabilities: the evaluator-optimizer loop and the orchestrator-worker architecture.
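The evaluator-optimizer loop reduces to a small control structure. The sketch below assumes you supply two callables, a generator and an evaluator (typically two differently prompted model calls); both are placeholders:

```python
from typing import Callable

def evaluator_optimizer(task: str,
                        generate: Callable[[str, str], str],
                        evaluate: Callable[[str, str], tuple[bool, str]],
                        max_rounds: int = 3) -> str:
    """Generate, critique, and refine until the evaluator accepts the draft.

    generate(task, feedback) -> draft
    evaluate(task, draft)    -> (accepted, feedback)
    """
    draft, feedback = "", ""
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        accepted, feedback = evaluate(task, draft)
        if accepted:
            break
    return draft
```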
Routing and Parallelization Patterns for AI Agents
Sequential chains work well when tasks have a clear order, but real-world problems often require more flexibility. Sometimes you need to route tasks to different specialists based on their content. Other times, multiple agents should work simultaneously on different aspects of a problem. In this post, I’ll cover two powerful patterns: routing for intelligent task dispatch and parallelization for concurrent processing.
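In skeleton form, the two patterns look like this. The classifier, specialists, and async workers are stand-ins for your own model calls; only the control flow is the point:

```python
import asyncio
from typing import Awaitable, Callable

# Routing: a classifier picks which specialist handles the request.
def route(query: str,
          classify: Callable[[str], str],
          specialists: dict[str, Callable[[str], str]]) -> str:
    label = classify(query)  # e.g. "billing", "tech", "other"
    handler = specialists.get(label, specialists["other"])
    return handler(query)

# Parallelization: independent subtasks run concurrently.
async def run_parallel(query: str,
                       workers: list[Callable[[str], Awaitable[str]]]) -> list[str]:
    return await asyncio.gather(*(w(query) for w in workers))
```

Routing keeps each specialist's prompt narrow; parallelization pays off when the subtasks don't depend on one another's results.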
Prompt Chaining Workflows - Sequential Task Decomposition
When a single prompt isn’t enough, we chain several together. Prompt chaining is one of the most practical patterns for building AI workflows - breaking complex tasks into focused steps where each agent’s output feeds into the next. In this post, I’ll explore how to design, validate, and implement effective prompt chains.
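At its core, a chain is just a loop over steps with a gate between them. The sketch below assumes each step is a callable wrapping one focused prompt, and uses a trivial non-empty check as the validation gate (both are placeholder assumptions):

```python
from typing import Callable

def run_chain(initial_input: str,
              steps: list[Callable[[str], str]],
              validate: Callable[[str], bool] = lambda s: bool(s.strip())) -> str:
    """Run a sequence of prompt steps, each consuming the previous output."""
    text = initial_input
    for i, step in enumerate(steps):
        text = step(text)
        if not validate(text):
            # Stop early rather than feeding bad output into later steps.
            raise ValueError(f"step {i} produced invalid output, aborting chain")
    return text
```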
Anatomy of an AI Agent - Building Blocks and Workflows
Moving beyond simple prompting techniques, it’s time to examine what actually makes an AI agent tick. In this post, I’ll break down the core components that transform a language model from a sophisticated autocomplete into an autonomous problem-solver, and explore how to model and implement agent workflows.
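As a preview of those building blocks, here is a deliberately small agent loop: reason, pick a tool, observe the result, repeat. The text protocol (`FINISH:` / `tool: input`), the `llm` callable, and the tool registry are all illustrative assumptions, not a particular framework's API:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """Bare-bones agent loop: the model reasons, picks a tool, observes, repeats."""
    llm: Callable[[str], str]               # reasoning engine (placeholder)
    tools: dict[str, Callable[[str], str]]  # named actions the agent may take
    memory: list[str] = field(default_factory=list)

    def run(self, goal: str, max_steps: int = 5) -> str:
        for _ in range(max_steps):
            scratchpad = "\n".join(self.memory)
            decision = self.llm(
                f"Goal: {goal}\nSo far:\n{scratchpad}\n"
                f"Reply 'FINISH: <answer>' or '<tool>: <input>'. Tools: {list(self.tools)}"
            )
            if decision.startswith("FINISH:"):
                return decision.removeprefix("FINISH:").strip()
            name, _, arg = decision.partition(":")
            observation = self.tools.get(name.strip(), lambda a: "unknown tool")(arg.strip())
            self.memory.append(f"{decision} -> {observation}")
        return "stopped: step limit reached"
```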