Agent State and Memory - Beyond Single Interactions

A stateless agent treats every interaction as its first - no memory of previous conversations, no awareness of ongoing tasks, no accumulated context. While this works for simple Q&A, real-world applications demand more. In this post, I’ll explore how to give agents memory through state management, enabling them to maintain context across interactions and handle complex multi-step workflows.

The Stateless Problem

Consider a simple travel agent that helps book trips. In a stateless world:

User: I want to fly to Tokyo next month
Agent: I can help! When exactly would you like to travel?

User: The 15th
Agent: I'd be happy to help with flights on the 15th. Where are you traveling to?

User: I just told you - Tokyo!

The agent has forgotten everything between messages. Each turn is isolated, leading to frustrating user experiences and broken workflows.

What is Agent State?

State is the information an agent retains between steps in a workflow. It encompasses:

  • Conversation history: What was said before
  • Task progress: Current step in a multi-step process
  • Gathered data: Information collected along the way
  • User preferences: Learned details about the user
  • Pending actions: What still needs to be done
flowchart TD
    subgraph State["Agent State"]
        H[History]
        T[Task Progress]
        D[Collected Data]
        P[Preferences]
    end

    I[User Input] --> A[Agent]
    State --> A
    A --> O[Response]
    A --> State

    style State fill:#e3f2fd
    style A fill:#fff3e0

The agent reads from state to understand context, then writes back to state to remember what happened.
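
As a minimal sketch of that read-then-write cycle, the toy function below stands in for an agent turn. The `handle_turn` name, the dictionary keys, and the rule-based replies are assumptions made for illustration; a real agent would call an LLM where the `if` branch sits.

```python
from typing import Any, Dict


def handle_turn(state: Dict[str, Any], user_input: str) -> str:
    """Run one turn: read context from state, reply, write results back."""
    # Read: what do we already know?
    history = state.setdefault("history", [])
    destination = state.get("destination")

    # Toy decision logic standing in for an LLM call.
    if destination is None:
        state["destination"] = user_input  # Write: store gathered data
        reply = f"Got it, {user_input}. When would you like to travel?"
    else:
        reply = f"Looking for flights to {destination} on {user_input}."

    # Write: record both sides of the exchange.
    history.append(f"user: {user_input}")
    history.append(f"agent: {reply}")
    return reply


state: Dict[str, Any] = {}
first_reply = handle_turn(state, "Tokyo")
second_reply = handle_turn(state, "the 15th")
```

Because the second call reads `destination` from `state`, it does not need to ask where the user is going again.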

State as a Graph

One powerful mental model is treating agent state as a graph structure: nodes represent states, and edges represent transitions triggered by actions or events.

stateDiagram-v2
    [*] --> Greeting
    Greeting --> CollectingInfo: User provides details
    CollectingInfo --> Searching: All info gathered
    Searching --> Presenting: Results found
    Presenting --> Booking: User selects option
    Booking --> Confirmed: Payment processed
    Confirmed --> [*]

    CollectingInfo --> CollectingInfo: More info needed
    Presenting --> Searching: User wants different options

This state machine approach provides:

  1. Predictability: Clear transitions between states
  2. Debuggability: Easy to see where the agent is and how it got there
  3. Recoverability: Can resume from any state after interruption
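
One way to sketch the diagram in code is a transition table keyed by the current state and an event. The event names below (`details_provided`, `results_found`, and so on) are hypothetical labels I chose for the edges; only the state names come from the diagram.

```python
from typing import Dict, List, Tuple

# (current state, event) -> next state, mirroring the diagram's edges.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("Greeting", "details_provided"): "CollectingInfo",
    ("CollectingInfo", "more_info_needed"): "CollectingInfo",
    ("CollectingInfo", "all_info_gathered"): "Searching",
    ("Searching", "results_found"): "Presenting",
    ("Presenting", "different_options"): "Searching",
    ("Presenting", "option_selected"): "Booking",
    ("Booking", "payment_processed"): "Confirmed",
}


def next_state(current: str, event: str) -> str:
    """Look up the next state; reject transitions the diagram does not allow."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(f"No transition from {current!r} on {event!r}") from None


def replay(events: List[str], start: str = "Greeting") -> List[str]:
    """Return the full path of states visited for a sequence of events."""
    path = [start]
    for event in events:
        path.append(next_state(path[-1], event))
    return path
```

The returned path is what makes the approach debuggable: it records where the agent is and how it got there, and an invalid event fails loudly instead of silently moving to the wrong state.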

Implementing Basic State Management

Here’s a simple state container for a conversational agent:

from dataclasses import dataclass, field
from typing import List, Dict, Any, Optional
from datetime import datetime

@dataclass
class Message:
    role: str  # "user" or "assistant"
    content: str
    timestamp: datetime = field(default_factory=datetime.now)

@dataclass
class AgentState:
    """Container for agent state across interactions"""

    # Conversation memory
    messages: List[Message] = field(default_factory=list)

    # Task tracking
    current_step: str = "initial"
    collected_data: Dict[str, Any] = field(default_factory=dict)

    # Session info
    session_id: str = ""
    created_at: datetime = field(default_factory=datetime.now)

    def add_message(self, role: str, content: str):
        self.messages.append(Message(role=role, content=content))

    def get_conversation_context(self, max_messages: int = 10) -> str:
        """Get recent conversation as context string"""
        recent = self.messages[-max_messages:]
        return "\n".join(
            f"{m.role}: {m.content}" for m in recent
        )

    def update_data(self, key: str, value: Any):
        self.collected_data[key] = value

    def transition_to(self, new_step: str):
        self.current_step = new_step

Stateful Agent Implementation

from openai import OpenAI

client = OpenAI()  # Reads OPENAI_API_KEY from the environment

class StatefulAgent:
    def __init__(self, persona: str):
        self.persona = persona
        self.state = AgentState()

    def process(self, user_input: str) -> str:
        # Build context from earlier turns first, so the current input
        # is not sent twice (once in the history, once as the user message)
        context = self.state.get_conversation_context()

        # Add user message to state
        self.state.add_message("user", user_input)

        system_prompt = f"""
        {self.persona}

        Current conversation state: {self.state.current_step}
        Collected information: {self.state.collected_data}

        Conversation history:
        {context}
        """

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_input}
            ],
            temperature=0.7
        )

        assistant_response = response.choices[0].message.content

        # Add response to state
        self.state.add_message("assistant", assistant_response)

        return assistant_response


# Usage (the replies shown are illustrative; actual model output will vary)
agent = StatefulAgent(
    persona="You are a helpful travel booking assistant."
)

print(agent.process("I want to fly to Tokyo"))
# "Great! When would you like to travel to Tokyo?"

print(agent.process("Next month, around the 15th"))
# "Perfect, I'll look for flights to Tokyo around the 15th of next month..."
# The agent remembers the destination from the previous turn
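
Chat-style APIs accept the conversation as a list of role-tagged messages, so an alternative to flattening history into the system prompt is to pass earlier turns as separate messages. The sketch below shows one way to assemble that list; `build_messages` and its `max_history` parameter are hypothetical names, not part of any library.

```python
from typing import Dict, List


def build_messages(
    persona: str,
    history: List[Dict[str, str]],
    user_input: str,
    max_history: int = 10,
) -> List[Dict[str, str]]:
    """Assemble system prompt, recent history, and the new user turn."""
    # Guard against max_history == 0, where history[-0:] would keep everything.
    recent = history[-max_history:] if max_history > 0 else []
    messages = [{"role": "system", "content": persona}]
    messages.extend(recent)
    messages.append({"role": "user", "content": user_input})
    return messages
```

The resulting list can be passed as the `messages` argument of a chat completion call, which keeps each turn attributed to its original role.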

Short-Term vs Long-Term Memory

Agent memory operates at different timescales:

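| Type       | Scope           | Examples                            | Persistence      |
|------------|-----------------|-------------------------------------|------------------|
| Short-term | Current session | Conversation history, task progress | Session duration |
| Long-term  | Across sessions | User preferences, past interactions | Database/file    |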

Short-Term Memory

Short-term memory lives within the current conversation. It’s typically implemented as an in-memory data structure:

from typing import Dict, List

class ConversationMemory:
    def __init__(self, max_tokens: int = 4000):
        self.messages: List[Dict] = []
        self.max_tokens = max_tokens

    def add(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})
        self._trim_if_needed()

    def _trim_if_needed(self):
        """Remove oldest messages if context is too long"""
        # Assumes messages[0] is the system message, which is never removed
        while self._estimate_tokens() > self.max_tokens:
            if len(self.messages) > 2:  # Keep at least system + newest message
                self.messages.pop(1)  # Remove oldest non-system message
            else:
                break

    def _estimate_tokens(self) -> int:
        # Rough estimate: 4 chars per token
        total_chars = sum(len(m["content"]) for m in self.messages)
        return total_chars // 4

    def get_messages(self) -> List[Dict]:
        return self.messages.copy()

Long-Term Memory

Long-term memory persists across sessions. A simple approach is one JSON file per user:

import json
from pathlib import Path
from typing import Any, Dict

class PersistentMemory:
    def __init__(self, user_id: str, storage_path: str = "./memory"):
        self.user_id = user_id
        self.path = Path(storage_path) / f"{user_id}.json"
        self.data = self._load()

    def _load(self) -> Dict:
        if self.path.exists():
            return json.loads(self.path.read_text())
        return {"preferences": {}, "facts": [], "history_summary": ""}

    def save(self):
        # parents=True so nested storage paths are created as well
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.data, indent=2))

    def add_preference(self, key: str, value: Any):
        self.data["preferences"][key] = value
        self.save()

    def add_fact(self, fact: str):
        if fact not in self.data["facts"]:
            self.data["facts"].append(fact)
            self.save()

    def get_context(self) -> str:
        """Get long-term memory as context for prompts"""
        prefs = self.data["preferences"]
        facts = self.data["facts"]

        context_parts = []
        if prefs:
            context_parts.append(f"User preferences: {prefs}")
        if facts:
            context_parts.append(f"Known facts: {facts}")

        return "\n".join(context_parts)
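
To make "persists across sessions" concrete, here is a small round trip using the same JSON-file idea. The helper names (`load_memory`, `save_memory`, `demo_round_trip`) and the `user-42` file name are assumptions for this sketch, and it writes to a temporary directory so it leaves nothing behind.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def load_memory(path: Path) -> Dict[str, Any]:
    """Load stored memory, or return an empty structure for a new user."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"preferences": {}, "facts": []}


def save_memory(path: Path, data: Dict[str, Any]) -> None:
    """Write memory to disk, creating the storage directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")


def demo_round_trip() -> Dict[str, Any]:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "memory" / "user-42.json"

        # "Session 1": learn a preference and persist it.
        session_one = load_memory(path)
        session_one["preferences"]["seat"] = "aisle"
        save_memory(path, session_one)

        # "Session 2": a fresh load sees what session 1 stored.
        return load_memory(path)
```

Nothing is shared in memory between the two "sessions"; the second load only knows about the seat preference because it was written to the file.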

State Machines with LangGraph

LangGraph provides a powerful framework for building stateful agent workflows as graphs:

from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
from operator import add

# Define state schema
class TravelState(TypedDict):
    messages: Annotated[list, add]  # Accumulates messages
    destination: str
    dates: str
    passengers: int
    current_step: str

# Define node functions
def greeting_node(state: TravelState) -> dict:
    return {
        "messages": ["Welcome! Where would you like to travel?"],
        "current_step": "collecting_destination"
    }

def collect_destination(state: TravelState) -> dict:
    # Extract destination from last message
    last_message = state["messages"][-1]
    # In real implementation, use LLM to extract
    return {
        "destination": "Tokyo",  # Placeholder for the extracted value
        "messages": ["Great choice! When would you like to travel?"],
        "current_step": "collecting_dates"
    }

def collect_dates(state: TravelState) -> dict:
    return {
        "dates": "2025-03-15",  # Placeholder for the extracted value
        "messages": ["How many passengers?"],
        "current_step": "collecting_passengers"
    }

def search_flights(state: TravelState) -> dict:
    # Search logic here
    return {
        "messages": [f"Found 5 flights to {state['destination']}"],
        "current_step": "presenting_results"
    }

# Build the graph
def build_travel_agent():
    workflow = StateGraph(TravelState)

    # Add nodes
    workflow.add_node("greeting", greeting_node)
    workflow.add_node("collect_destination", collect_destination)
    workflow.add_node("collect_dates", collect_dates)
    workflow.add_node("search_flights", search_flights)

    # Add edges
    workflow.set_entry_point("greeting")
    workflow.add_edge("greeting", "collect_destination")
    workflow.add_edge("collect_destination", "collect_dates")
    workflow.add_edge("collect_dates", "search_flights")
    workflow.add_edge("search_flights", END)

    return workflow.compile()
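
The `Annotated[list, add]` annotation is what makes `messages` accumulate instead of being overwritten by each node's return value. The sketch below is a simplified illustration of that merge rule in plain Python, not LangGraph's actual implementation; `apply_update` is a hypothetical helper.

```python
from operator import add
from typing import Annotated, Any, Dict, TypedDict, get_type_hints


class TravelState(TypedDict):
    messages: Annotated[list, add]  # Reducer: old + new
    destination: str                # No reducer: new value replaces old


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a node's partial update into state, honouring reducers."""
    hints = get_type_hints(TravelState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated types carry their extra arguments in __metadata__.
        metadata = getattr(hints.get(key), "__metadata__", ())
        if metadata and key in merged:
            reducer = metadata[0]
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged
```

With this rule, a node that returns `{"messages": ["Welcome!"]}` appends one message to the history, while a node that returns `{"destination": "Tokyo"}` simply replaces the previous value.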

Conditional Routing

Real workflows often need conditional transitions:

# Assumes `workflow` is the StateGraph from the previous section (before
# compile()) and that "process_input" and "collect_passengers" nodes
# have also been registered with workflow.add_node(...)

def should_continue(state: TravelState) -> str:
    """Determine next step based on state"""
    if not state.get("destination"):
        return "collect_destination"
    if not state.get("dates"):
        return "collect_dates"
    if not state.get("passengers"):
        return "collect_passengers"
    return "search_flights"

# Add conditional edge
workflow.add_conditional_edges(
    "process_input",
    should_continue,
    {
        "collect_destination": "collect_destination",
        "collect_dates": "collect_dates",
        "collect_passengers": "collect_passengers",
        "search_flights": "search_flights"
    }
)
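
The routing function is plain Python, so its behaviour can be checked without building a graph. The sketch below reimplements the same rule over a dictionary and traces which steps would run for a partially filled state; `next_step` and `trace_route` are hypothetical helpers, and the supplied answers are assumed to be non-empty.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("destination", "dates", "passengers")


def next_step(state: Dict[str, Any]) -> str:
    """Route to the first missing field, or to the search once all are set."""
    for field_name in REQUIRED_FIELDS:
        if not state.get(field_name):
            return f"collect_{field_name}"
    return "search_flights"


def trace_route(state: Dict[str, Any], answers: Dict[str, Any]) -> List[str]:
    """List the steps visited while filling in missing fields from answers."""
    state = dict(state)  # Work on a copy so the caller's state is untouched
    visited: List[str] = []
    step = next_step(state)
    while step != "search_flights":
        visited.append(step)
        field_name = step[len("collect_"):]
        if not answers.get(field_name):
            raise ValueError(f"No answer available for {field_name!r}")
        state[field_name] = answers[field_name]
        step = next_step(state)
    visited.append(step)
    return visited
```

A state that already has a destination skips straight to collecting dates, which is the point of routing on state rather than following a fixed sequence.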

Checkpointing: Saving and Restoring State

Checkpointing allows agents to save state and resume later - essential for long-running tasks or handling interruptions:

from langgraph.checkpoint.memory import MemorySaver

# Create checkpointer
checkpointer = MemorySaver()

# Compile graph with checkpointing
# (`workflow` is the uncompiled StateGraph from the earlier section)
app = workflow.compile(checkpointer=checkpointer)

# Run with thread ID for state persistence
config = {"configurable": {"thread_id": "user-123-session-456"}}

# First interaction
result = app.invoke(
    {"messages": ["I want to go to Paris"]},
    config
)

# Later... resume from checkpoint
result = app.invoke(
    {"messages": ["March 20th"]},
    config  # Same thread_id - continues from saved state
)
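
Conceptually, a checkpointer is a store keyed by thread ID: load the saved state, apply the new input, save the result. The sketch below illustrates that idea in plain Python; it is not how LangGraph implements `MemorySaver`, and `InMemoryCheckpoints` and `run_turn` are hypothetical names.

```python
import copy
from typing import Any, Dict, List


class InMemoryCheckpoints:
    """Toy checkpoint store: one saved state per thread ID."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, Any]] = {}

    def load(self, thread_id: str) -> Dict[str, Any]:
        # Deep copy so callers cannot mutate the saved checkpoint by accident.
        return copy.deepcopy(self._store.get(thread_id, {"messages": []}))

    def save(self, thread_id: str, state: Dict[str, Any]) -> None:
        self._store[thread_id] = copy.deepcopy(state)


def run_turn(checkpoints: InMemoryCheckpoints, thread_id: str, user_message: str) -> List[str]:
    """Resume the thread's state, add the new message, and checkpoint again."""
    state = checkpoints.load(thread_id)
    state["messages"].append(user_message)
    checkpoints.save(thread_id, state)
    return state["messages"]
```

Two calls with the same thread ID see each other's messages, while a different thread ID starts from an empty state.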

Persistent Checkpointing

For production, use database-backed checkpointing:

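import sqlite3

from langgraph.checkpoint.sqlite import SqliteSaver

# SQLite for development (needs the langgraph-checkpoint-sqlite package).
# Passing a connection to the constructor avoids relying on
# from_conn_string, which is a context manager in recent releases
conn = sqlite3.connect("checkpoints.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)

# For production, use PostgreSQL or Redis
# import os
# from langgraph.checkpoint.postgres import PostgresSaver
# with PostgresSaver.from_conn_string(os.environ["DATABASE_URL"]) as checkpointer:
#     checkpointer.setup()  # Create the checkpoint tables on first use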

Managing Context Windows

LLMs have limited context windows. As conversations grow, you need strategies to manage what fits:

Sliding Window

Keep only the most recent N messages:

def sliding_window(messages: List[Dict], window_size: int = 10) -> List[Dict]:
    if len(messages) <= window_size:
        return messages
    if window_size <= 1:
        # Guard: messages[-0:] would return the whole list
        return messages[:1]
    # Always keep system message + recent messages
    return [messages[0]] + messages[-(window_size - 1):]
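
To see which messages survive, here is a small worked example. It repeats the function so the snippet runs on its own, assumes the first message is the system message and that `window_size` is at least 2, and uses made-up message contents.

```python
from typing import Dict, List


def sliding_window(messages: List[Dict], window_size: int = 10) -> List[Dict]:
    if len(messages) <= window_size:
        return messages
    # Keep the system message plus the most recent (window_size - 1) messages.
    return [messages[0]] + messages[-(window_size - 1):]


# One system message followed by seven user messages.
history = [{"role": "system", "content": "You are a travel assistant."}]
history += [{"role": "user", "content": f"message {i}"} for i in range(1, 8)]

trimmed = sliding_window(history, window_size=4)
contents = [m["content"] for m in trimmed]
# contents == ["You are a travel assistant.", "message 5", "message 6", "message 7"]
```

Messages 1 through 4 are dropped entirely, which is the main trade-off of this strategy: it is cheap, but anything said early in a long conversation is lost.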

Summarization

Periodically summarize older messages:

def summarize_and_compress(messages: List[Dict], threshold: int = 20) -> List[Dict]:
    if len(messages) <= threshold:
        return messages

    # Summarize older messages
    old_messages = messages[1:-5]  # Everything between system message and last 5
    summary_prompt = f"Summarize this conversation concisely:\n{old_messages}"

    # call_llm is a placeholder for your own helper that sends a prompt
    # to the model and returns the text of its reply
    summary = call_llm(summary_prompt)

    # Return compressed history
    return [
        messages[0],  # System message
        {"role": "system", "content": f"Previous conversation summary: {summary}"},
        *messages[-5:]  # Recent messages
    ]
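
The shape of the compressed history is easier to see with the model call stubbed out. In this sketch the summarizer is passed in as a function, and `fake_summarize` is a stand-in that only counts lines; `compress_history`, `keep_recent`, and the stub are hypothetical names, and `keep_recent` is assumed to be at least 1.

```python
from typing import Callable, Dict, List


def compress_history(
    messages: List[Dict[str, str]],
    summarize: Callable[[str], str],
    threshold: int = 20,
    keep_recent: int = 5,
) -> List[Dict[str, str]]:
    """Replace older messages with one summary message once past threshold."""
    if len(messages) <= threshold:
        return messages

    # Everything between the system message and the most recent messages.
    old = messages[1:-keep_recent]
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in old)
    summary = summarize(transcript)

    return [
        messages[0],
        {"role": "system", "content": f"Previous conversation summary: {summary}"},
        *messages[-keep_recent:],
    ]


def fake_summarize(transcript: str) -> str:
    """Stand-in for an LLM call: reports how many messages were folded away."""
    return f"{len(transcript.splitlines())} earlier messages omitted"
```

Injecting the summarizer also makes the function easy to test, since no network call is needed to check which messages are kept.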

Selective Retention

Keep only important information:

def extract_key_facts(messages: List[Dict]) -> Dict:
    """Use LLM to extract key facts worth remembering"""
    prompt = """
    Extract key facts from this conversation that should be remembered:
    - User preferences
    - Important decisions made
    - Pending actions
    - Critical information

    Return as JSON.
    """
    # call_llm_json is a placeholder for your own helper that sends the
    # prompt and messages to the model and parses the JSON reply
    return call_llm_json(prompt, messages)

Practical Pattern: Session Manager

Here’s a minimal in-memory session management pattern built on the AgentState class from earlier:

import uuid
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple

class SessionManager:
    def __init__(self, timeout_minutes: int = 30):
        self.sessions: Dict[str, AgentState] = {}
        self.timeout = timedelta(minutes=timeout_minutes)

    def get_or_create_session(self, session_id: Optional[str] = None) -> Tuple[str, AgentState]:
        """Get existing session or create new one"""
        if session_id and session_id in self.sessions:
            state = self.sessions[session_id]
            # Check if session expired
            # (age is measured from creation time, not last activity)
            if datetime.now() - state.created_at < self.timeout:
                return session_id, state

        # Create new session
        new_id = str(uuid.uuid4())
        self.sessions[new_id] = AgentState(session_id=new_id)
        return new_id, self.sessions[new_id]

    def save_session(self, session_id: str, state: AgentState):
        self.sessions[session_id] = state

    def cleanup_expired(self):
        """Remove expired sessions"""
        now = datetime.now()
        expired = [
            sid for sid, state in self.sessions.items()
            if now - state.created_at > self.timeout
        ]
        for sid in expired:
            del self.sessions[sid]
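
The manager above measures expiry from `created_at`, so an active conversation still ends once the timeout has passed since it started. If you would rather expire sessions after a period of inactivity, one option is to track the last activity time instead. The sketch below is a variation under that assumption; `Session`, `IdleTimeoutSessions`, and `touch` are hypothetical names, and the current time is passed in to keep the example deterministic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict


@dataclass
class Session:
    session_id: str
    last_active: datetime


class IdleTimeoutSessions:
    """Sessions expire after `timeout` of inactivity, not since creation."""

    def __init__(self, timeout: timedelta) -> None:
        self.timeout = timeout
        self.sessions: Dict[str, Session] = {}

    def touch(self, session_id: str, now: datetime) -> bool:
        """Record activity; return True if the existing session was still alive."""
        existing = self.sessions.get(session_id)
        alive = existing is not None and now - existing.last_active < self.timeout
        # Either refresh the live session or start a fresh one under the same ID.
        self.sessions[session_id] = Session(session_id, last_active=now)
        return alive
```

With a 30-minute timeout, activity at minute 20 and again at minute 45 keeps the session alive, because each gap is under 30 minutes even though the session is older than that.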

Key Takeaways

  1. State enables continuity: Without state, every interaction is isolated and context is lost
  2. Model state as a graph: State machines provide clear structure for complex workflows
  3. Separate memory timescales: Short-term for current session, long-term for persistent knowledge
  4. Checkpoint for resilience: Save state to recover from interruptions
  5. Manage context actively: Use sliding windows or summarization to stay within token limits

State management transforms agents from forgetful responders into coherent collaborators that remember, learn, and adapt. In the next post, I’ll explore how agents connect to external systems - databases, APIs, and the wider world.


This is Part 9 of my series on building intelligent AI systems. Next: connecting agents to external APIs and data sources.
