Mastering LangChain and LangGraph - A Practitioner's Guide

If you’ve been building with LLMs, you’ve likely encountered the gap between simple API calls and production-ready agent systems. LangChain and LangGraph bridge that gap, providing the abstractions and patterns needed to build reliable, maintainable AI applications. This series takes you from LangChain fundamentals to production multi-agent systems, focusing on practical implementation over theory.

What This Series Covers

This seven-part series progresses through the complete stack of building agentic AI systems with LangChain and LangGraph:

flowchart LR
    subgraph Foundation["Posts 1-2"]
        A[LangChain Core] --> B[LCEL & Agents]
    end

    subgraph Graphs["Post 3"]
        C[LangGraph Fundamentals]
    end

    subgraph Integration["Posts 4-5"]
        D[APIs & Databases] --> E[RAG & HITL]
    end

    subgraph Advanced["Posts 6-7"]
        F[Multi-Agent] --> G[Production]
    end

    Foundation --> Graphs --> Integration --> Advanced

    classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff
    classDef orangeClass fill:#F39C12,stroke:#333,stroke-width:2px,color:#fff
    classDef greenClass fill:#27AE60,stroke:#333,stroke-width:2px,color:#fff

    class A,B blueClass
    class C,D,E orangeClass
    class F,G greenClass

Post Topics

  1. This Post: LangChain architecture, prompts, messages, structured outputs
  2. LCEL & Agents: Expression language, chains, tools, ReAct agents
  3. LangGraph Fundamentals: StateGraph, nodes, edges, conditional routing
  4. APIs & Databases: External tools, SQL agents, security
  5. RAG & Human-in-the-Loop: Agentic RAG, checkpointing, observability
  6. Multi-Agent Systems: Orchestrator patterns, agent communication
  7. Production Systems: Deployment, monitoring, complete architectures

The Evolution from API Calls to Frameworks

When OpenAI released GPT-3 in 2020, developers began building LLM applications with direct API calls. A simple chatbot might look like making HTTP requests, parsing JSON responses, and manually managing conversation state. This approach works for prototypes but quickly becomes unwieldy.

Consider what you need beyond the API call itself:

  • State management: LLMs are stateless. Every API call is independent. You must explicitly pass conversation history, user context, and any accumulated knowledge (see the sketch after this list).
  • Prompt engineering: Effective prompts require careful structure—system instructions, examples, output format specifications. Managing this as string concatenation leads to bugs.
  • Output parsing: LLMs return text. Extracting structured data (JSON, dates, numbers) from that text requires parsing logic that handles edge cases.
  • Error handling: APIs fail, rate limits kick in, responses get malformed. Production systems need retry logic, fallbacks, and graceful degradation.
  • Orchestration: Complex tasks require multiple LLM calls, possibly to different models, with logic determining what to do next based on previous outputs.
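
To make the contrast concrete, here is a minimal sketch of the manual approach, assuming the official openai Python client (the function and variable names are illustrative). Every concern in the list above is left to your own code:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
history = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(user_message: str) -> str:
    # State management by hand: the full history must be resent on every call
    history.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(model="gpt-4o", messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    # No retries, no output parsing, no fallbacks: all of that is still on you
    return reply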

LangChain emerged to address these challenges by providing composable abstractions. Rather than reinventing the wheel for each application, you build with standardized components that handle common patterns.

Why LangChain?

LangChain provides a framework for building LLM-powered applications with three core principles:

1. Provider Abstraction: The same code works across OpenAI, Anthropic, Google, and dozens of other providers. Switching models requires changing one line, not rewriting your application.

2. Composability: Components connect together like LEGO blocks. A prompt template feeds into a model, which feeds into a parser. These chains can be combined, nested, and reused.

3. Convention Over Configuration: LangChain establishes patterns for common tasks—chat history, tool calling, output parsing—so you don’t have to re-solve problems that already have well-established answers.

The ecosystem has evolved significantly. LangChain started as a monolithic library but has been modularized:

  • langchain-core: The essential abstractions (messages, prompts, runnables)
  • langchain-openai, langchain-anthropic, etc.: Provider-specific integrations
  • langchain-community: Third-party integrations
  • langgraph: Graph-based agent orchestration (covered in Post 3)

This modularity means you install only what you need, reducing dependencies and conflicts.

Core Architecture

LangChain’s architecture centers on three concepts that mirror how humans structure communication with LLMs:

graph TD
    subgraph Models["Models"]
        M1[ChatOpenAI]
        M2[ChatAnthropic]
        M3[Other Providers]
    end

    subgraph Messages["Messages"]
        MSG1[SystemMessage]
        MSG2[HumanMessage]
        MSG3[AIMessage]
        MSG4[ToolMessage]
    end

    subgraph Chains["Chains"]
        C1[PromptTemplate]
        C2[Model]
        C3[OutputParser]
    end

    Messages --> Models
    C1 --> C2 --> C3

    classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff
    classDef orangeClass fill:#F39C12,stroke:#333,stroke-width:2px,color:#fff
    classDef greenClass fill:#27AE60,stroke:#333,stroke-width:2px,color:#fff

    class M1,M2,M3 blueClass
    class MSG1,MSG2,MSG3,MSG4 orangeClass
    class C1,C2,C3 greenClass

Models: The Unified Interface

Every LLM provider has its own API format, authentication mechanism, and response structure. LangChain’s model wrappers normalize these differences behind a consistent interface.

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

gpt4 = ChatOpenAI(model="gpt-4o")
claude = ChatAnthropic(model="claude-sonnet-4-20250514")

# Both use identical invoke() method
response = gpt4.invoke("What is the capital of France?")

This abstraction enables powerful patterns. You can:

  • A/B test models by swapping the model object
  • Fall back to cheaper models when the primary is overloaded (sketched below)
  • Run evaluations across multiple providers with identical prompts
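
The fallback pattern in particular falls directly out of the shared interface. A minimal sketch (the specific model choices are illustrative):

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

primary = ChatAnthropic(model="claude-sonnet-4-20250514")
backup = ChatOpenAI(model="gpt-4o-mini")

# with_fallbacks() is part of the Runnable interface: if the primary call
# raises (rate limit, outage), the backup model is tried with the same input
llm_with_fallback = primary.with_fallbacks([backup])
response = llm_with_fallback.invoke("What is the capital of France?")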

The invoke() method is part of a broader Runnable interface that includes batch() for parallel processing, stream() for token-by-token output, and async variants (ainvoke(), abatch(), astream()).
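
For example, continuing with the gpt4 object defined above (a minimal sketch):

# Streaming: the same model object, yielding output as it is generated
for chunk in gpt4.stream("Write a haiku about Paris"):
    print(chunk.content, end="", flush=True)

# Batch: several independent inputs processed in parallel
answers = gpt4.batch(["Capital of France?", "Capital of Japan?"])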

Messages: Structured Communication

Raw text prompts don’t capture the full context of a conversation. Chat models distinguish between different types of input, and LangChain makes this explicit through message objects.

  • SystemMessage: Sets model behavior and context. Example: “You are a helpful assistant specializing in Python.”
  • HumanMessage: User input. Example: “How do I read a CSV file?”
  • AIMessage: Model responses. Example: “You can use pandas: pd.read_csv('file.csv')”
  • ToolMessage: Results from tool execution. Example: “Weather API returned: 72°F, sunny”

This structure matters because chat models are trained to understand conversational roles. A system message has different weight than a human message. The model knows that AIMessage content came from itself in a previous turn.

from langchain_core.messages import SystemMessage, HumanMessage

messages = [
    SystemMessage(content="You are a geography tutor. Give concise answers."),
    HumanMessage(content="What's the capital of Brazil?"),
]

response = llm.invoke(messages)
# AIMessage with content="Brasília."

The message structure also enables few-shot prompting—you can include example Human/AI exchanges to demonstrate the desired behavior before the actual question.
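
A minimal sketch of that pattern (the example exchanges are illustrative):

from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

messages = [
    SystemMessage(content="Answer with the country only."),
    HumanMessage(content="Eiffel Tower"),        # example exchange 1
    AIMessage(content="France"),
    HumanMessage(content="Machu Picchu"),        # example exchange 2
    AIMessage(content="Peru"),
    HumanMessage(content="Great Barrier Reef"),  # the actual question
]

response = llm.invoke(messages)  # expect something like "Australia"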

Prompt Templates: Beyond String Formatting

String concatenation for prompts is deceptively dangerous. Consider:

# Fragile approach
prompt = f"Summarize this article about {topic}:\n\n{article}"

This breaks when topic contains newlines, when article is empty, or when you need to modify the format. Prompt templates solve this with structured, reusable definitions.

The Template Hierarchy

LangChain offers templates at different levels of complexity:

PromptTemplate: Simple variable substitution for non-chat models or single-turn interactions.

from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    "Translate the following to {language}: {text}"
)
prompt = template.format(language="Spanish", text="Hello, world")

ChatPromptTemplate: Multi-turn conversations with role-specific messages.

from langchain_core.prompts import ChatPromptTemplate

template = ChatPromptTemplate.from_messages([
    ("system", "You are an expert in {domain}."),
    ("human", "{question}"),
])
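
Filling the template produces a concrete message list ready to send to a model; a brief usage sketch (the variable values are illustrative):

messages = template.format_messages(
    domain="astronomy",
    question="Why is the sky dark at night?",
)
# roughly: [SystemMessage("You are an expert in astronomy."),
#           HumanMessage("Why is the sky dark at night?")]
response = llm.invoke(messages)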

FewShotPromptTemplate: Includes examples to guide model behavior.

The power of templates becomes apparent in production. You can:

  • Version control your prompts separately from code
  • Validate that required variables are provided
  • Compose templates into larger structures
  • Inspect the full prompt for debugging

Few-Shot Learning: Teaching by Example

Few-shot prompting dramatically improves output quality for specific formats or domains. Instead of describing what you want, you show examples.

from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "fast", "output": "slow"},
]

example_template = PromptTemplate.from_template(
    "Input: {input}\nOutput: {output}"
)

few_shot = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_template,
    prefix="Give the antonym of each word:",
    suffix="Input: {word}\nOutput:",
    input_variables=["word"],
)
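
Formatting the template renders the prefix, every example, and the suffix into a single prompt; the output looks roughly like this:

print(few_shot.format(word="bright"))
# Give the antonym of each word:
# Input: happy
# Output: sad
# Input: tall
# Output: short
# Input: fast
# Output: slow
# Input: bright
# Output: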

The tradeoff is token usage. Each example consumes context window space. Start with 2-3 examples and add more only if quality improves. For highly specialized tasks, consider fine-tuning instead of ever-larger few-shot prompts.

Managing Conversation State

LLMs have no memory. Every invocation is independent—the model doesn’t know what you discussed five seconds ago unless you tell it. This is a fundamental constraint with important implications.

Consider a customer service chatbot:

  1. User: “I’d like to return my order”
  2. Bot: “I can help with that. What’s your order number?”
  3. User: “12345”
  4. Bot: ??? (Needs to know we’re discussing a return, not a new question)

Without explicit state management, step 4 loses all context. The model sees only “12345” with no knowledge of the return discussion.

Explicit History Management

The simplest approach maintains a list of messages:

from langchain_core.messages import HumanMessage, AIMessage

history = []

def chat(user_message: str) -> str:
    history.append(HumanMessage(content=user_message))
    response = llm.invoke(history)
    history.append(response)
    return response.content

Every message—human and AI—gets appended. The full history is passed to each invocation. The model sees the entire conversation and can reference earlier context.

This approach has limitations:

  • Context window limits: Models have maximum token counts. Long conversations get truncated.
  • Cost: More tokens = higher API costs.
  • Latency: Longer prompts take longer to process.

Production systems implement context window management:

  • Summarize older messages
  • Keep only the last N turns (see the sketch below)
  • Store full history in a database but send condensed versions to the model
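
A minimal sketch of the “keep only the last N turns” strategy, building on the history list above (purely illustrative; production systems usually combine trimming with summarization):

from langchain_core.messages import SystemMessage

MAX_MESSAGES = 20  # arbitrary budget; tune against your model's context window

def trimmed(history: list) -> list:
    # Keep any system messages, plus only the most recent human/AI messages
    system = [m for m in history if isinstance(m, SystemMessage)]
    recent = [m for m in history if not isinstance(m, SystemMessage)][-MAX_MESSAGES:]
    return system + recent

response = llm.invoke(trimmed(history))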

LangChain provides MessageHistory classes that integrate with various storage backends (PostgreSQL, Redis, local files) for persistent, managed conversation state.

Structured Outputs: From Text to Data

Free-form text is problematic for applications. When your code needs to make decisions based on LLM output, parsing natural language is fragile.

# The model might respond:
"The sentiment is positive"
"I'd say positive"
"Positive sentiment detected"
"It's definitely positive!"

All mean the same thing, but extracting “positive” reliably requires fuzzy matching or another LLM call.

The Structured Output Solution

Modern LLMs support constrained generation—forcing output to conform to a schema. LangChain exposes this through Pydantic models.

from pydantic import BaseModel, Field
from typing import Literal

class SentimentResult(BaseModel):
    """Analysis result for text sentiment."""
    sentiment: Literal["positive", "negative", "neutral"]
    confidence: float = Field(ge=0, le=1)
    reasoning: str

structured_llm = llm.with_structured_output(SentimentResult)
result = structured_llm.invoke("Analyze: 'This product is amazing!'")

# result.sentiment == "positive"
# result.confidence == 0.95
# result.reasoning == "Strong positive language..."

The model now returns a Python object with typed fields. No parsing required. Invalid outputs raise exceptions rather than silently corrupting your application.

When Structured Output Isn’t Available

Not all models or configurations support constrained generation. LangChain provides output parsers as a fallback:

  • StrOutputParser: Extract plain text from an AIMessage
  • JsonOutputParser: Parse JSON from model output
  • PydanticOutputParser: Validate against a Pydantic schema
  • OutputFixingParser: Auto-retry with the LLM to fix malformed output

The OutputFixingParser is particularly useful—it catches parse errors, sends the malformed output back to the LLM with correction instructions, and retries. This adds latency but significantly improves reliability.
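
A sketch of that retry path, reusing the SentimentResult model from above (import locations can differ between LangChain versions):

from langchain_core.output_parsers import PydanticOutputParser
from langchain.output_parsers import OutputFixingParser  # import path may vary by version

base_parser = PydanticOutputParser(pydantic_object=SentimentResult)
fixing_parser = OutputFixingParser.from_llm(parser=base_parser, llm=llm)

# A malformed response (here, missing the closing brace) would normally raise
# a parse error; the fixing parser asks the LLM to repair it and re-parses
result = fixing_parser.parse('{"sentiment": "positive", "confidence": 0.9, "reasoning": "..."')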

The Runnable Protocol

Everything in LangChain implements the Runnable interface. This consistency enables powerful composition patterns we’ll explore in the next post.

# All runnables share these methods:
runnable.invoke(input) # Single input → single output
runnable.batch([inputs]) # Multiple inputs → multiple outputs (parallel)
runnable.stream(input) # Single input → streaming output
runnable.ainvoke(input) # Async single input

Prompts are runnables. Models are runnables. Parsers are runnables. This means you can chain them together, run them in parallel, or swap components without changing the surrounding code.
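
As a small preview of what that composition looks like, here is a sketch reusing the ChatPromptTemplate and llm objects from earlier (the | chaining syntax is covered in the next post):

from langchain_core.output_parsers import StrOutputParser

# prompt -> model -> parser, each step a Runnable
chain = template | llm | StrOutputParser()
answer = chain.invoke({"domain": "geography", "question": "What's the capital of Brazil?"})
# "Brasília."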

Key Takeaways

  1. LangChain solves the orchestration problem: Beyond API calls, you need state management, prompt engineering, output parsing, and error handling. LangChain provides standardized solutions.

  2. Provider abstraction enables flexibility: The same code works across OpenAI, Anthropic, and other providers. Switching models is a one-line change.

  3. Messages structure conversations: SystemMessage, HumanMessage, AIMessage, and ToolMessage capture the full context of multi-turn interactions.

  4. Templates prevent prompt chaos: Structured, reusable prompt definitions with variable substitution and composition.

  5. Structured outputs bridge LLM and code: Pydantic models with with_structured_output() guarantee typed, validated responses that your application can trust.

  6. The Runnable interface enables composition: Every component shares invoke(), batch(), and stream(), making them interchangeable building blocks.


Next: Building Agents with LCEL and Tool Integration - We’ll explore the LangChain Expression Language for composing chains and building tool-equipped agents that can take actions in the world.
