LLMs are stateless by nature - each interaction is isolated, with no memory of prior prompts. But financial agents often need context to manage complex, multi-step tasks like loan approvals, insurance claims, or trading workflows. This requires two complementary mechanisms: state for tracking execution progress, and memory for maintaining conversational context across interactions.
## The Stateless Problem
Consider a loan approval agent that needs to:
1. Verify applicant documents
2. Get an AI risk assessment
3. Make a final approval decision
In a stateless system, each step is isolated. The risk assessment step doesn’t know what documents were verified. The decision step can’t access the risk assessment. Information is lost between stages, making the system brittle and unreliable.
```mermaid
flowchart TB
    subgraph Stateless["Stateless: Information Lost"]
        direction LR
        S1[Step 1 Verify Docs] -.->|Lost| S2[Step 2 Risk Check]
        S2 -.->|Lost| S3[Step 3 Decision]
    end
    subgraph Stateful["Stateful: Context Preserved"]
        direction LR
        C[Shared Context] --> A1[Step 1]
        C --> A2[Step 2]
        C --> A3[Step 3]
        A1 --> C
        A2 --> C
        A3 --> C
    end
    classDef pinkClass fill:#E74C3C,stroke:#333,stroke-width:2px,color:#fff
    classDef greenClass fill:#27AE60,stroke:#333,stroke-width:2px,color:#fff
    class Stateless pinkClass
    class Stateful greenClass
```
## Agent State: The Working Memory
Agent state includes everything the agent knows during one execution:
- The original user input
- System instructions
- Message history
- Tool calls (pending or completed)
- Intermediate results from prior steps
This state is ephemeral - it exists only while the task is running. Think of it as working memory: it tracks progress and context throughout execution, then is cleared when the task ends.
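As a rough illustration, agent state can be collected into a single object. This is a minimal sketch; the field names are mine rather than any particular framework's:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class AgentState:
    """Ephemeral working memory for a single agent execution (illustrative)."""
    execution_id: str                     # identifies this run
    user_input: str                       # the original request
    system_instructions: str = ""
    messages: List[Dict[str, str]] = field(default_factory=list)         # message history
    pending_tool_calls: List[Dict[str, Any]] = field(default_factory=list)
    intermediate_results: Dict[str, Any] = field(default_factory=dict)   # prior step outputs
```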
## State Machines for Agent Execution
Agents can be modeled as state machines that transition through defined steps. The excerpts below come from an insurance-claims agent.
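The step methods reference a `ClaimState` enum and a shared context object with a `transition()` method, neither of which appears in the excerpts. Here is a minimal sketch under assumed names (states other than `DECISION` are guesses inferred from the step numbering); note how `transition()` keeps a timestamped audit trail, which the takeaways at the end call out as essential for financial workflows:

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List, Tuple

class ClaimState(Enum):
    VERIFICATION = "verification"   # assumed: Step 1
    ASSESSMENT = "assessment"       # Step 2, shown below
    FRAUD_CHECK = "fraud_check"     # assumed: Step 3
    DECISION = "decision"           # Step 4, shown below
    COMPLETE = "complete"

@dataclass
class ClaimContext:
    customer: str
    claim_type: str
    amount: float
    description: str
    fraud_score: float = 0.0
    final_decision: str = ""
    state: ClaimState = ClaimState.VERIFICATION
    history: List[Tuple[datetime, ClaimState, ClaimState]] = field(default_factory=list)

    def transition(self, new_state: ClaimState) -> None:
        """Move to a new state, recording a timestamped audit trail."""
        self.history.append((datetime.now(), self.state, new_state))
        self.state = new_state
```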
```python
def assess(self):
    """Step 2: AI risk assessment"""
    print("2️⃣ ASSESSMENT: AI analyzing claim...")

    prompt = f"""Assess the risk of this insurance claim:
Customer: {self.context.customer}
Claim Type: {self.context.claim_type}
Amount: ${self.context.amount:,.0f}
Description: {self.context.description}

Provide risk level (LOW/MEDIUM/HIGH) and key red flags."""
    # (The LLM call that consumes this prompt is elided in this excerpt.)

    # Use data from previous steps
    is_new_customer = "new customer" in self.context.description.lower()
    is_high_amount = self.context.amount > 25000
    is_high_fraud = self.context.fraud_score > 0.25

    print(f"   • New customer: {is_new_customer}")
    print(f"   • High amount: {is_high_amount}")
    print(f"   • High fraud risk: {is_high_fraud}")

    self.context.transition(ClaimState.DECISION)

def decide(self):
    """Step 4: Final decision using all context"""
    print("4️⃣ DECISION: Making approval decision...")

    # Combine insights from all previous steps
    if self.context.fraud_score > 0.25:
        self.context.final_decision = "DENY"
    elif self.context.amount > 25000 and "new" in self.context.description.lower():
        self.context.final_decision = "REQUEST MORE DOCUMENTATION"
    else:
        self.context.final_decision = "APPROVE"
```
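To make the excerpt runnable end to end, a thin wrapper and driver might look like this. The `ClaimAgent` name and the sample claim are my assumptions, as is the expected decision in the comment:

```python
class ClaimAgent:
    """Hypothetical wrapper holding the shared context for the step methods."""
    def __init__(self, context: ClaimContext):
        self.context = context

# Attach the step methods shown above
ClaimAgent.assess = assess
ClaimAgent.decide = decide

agent = ClaimAgent(ClaimContext(
    customer="Jane Doe",
    claim_type="auto",
    amount=32_000,
    description="new customer, rear-end collision",
))
agent.context.transition(ClaimState.ASSESSMENT)
agent.assess()   # reads shared context, transitions to DECISION
agent.decide()   # combines fraud score, amount, and description

print(agent.context.final_decision)          # -> REQUEST MORE DOCUMENTATION
for ts, old, new in agent.context.history:   # timestamped audit trail
    print(ts.isoformat(), old.value, "->", new.value)
```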
## State vs. Memory

While both provide context, state and memory serve different purposes:
| Aspect | State | Short-Term Memory | Long-Term Memory |
|--------|-------|-------------------|------------------|
| Scope | Single execution | Single session | Across sessions |
| Duration | Task lifetime | Conversation lifetime | Persistent |
| Tracked By | Execution ID | Session ID | User ID |
| Contains | Tool calls, transitions | Conversation turns | User preferences |
| Use Case | Workflow steps | Dialogue continuity | Personalization |
```mermaid
flowchart TD
    subgraph State["State (Execution)"]
        E1[Step 1] --> E2[Step 2]
        E2 --> E3[Step 3]
    end
    subgraph STM["Short-Term Memory (Session)"]
        T1[Turn 1] --> T2[Turn 2]
        T2 --> T3[Turn 3]
    end
    subgraph LTM["Long-Term Memory (User)"]
        S1[Session 1] --> S2[Session 2]
        S2 --> S3[Session 3]
    end
    classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff
    classDef orangeClass fill:#F39C12,stroke:#333,stroke-width:2px,color:#fff
    classDef greenClass fill:#27AE60,stroke:#333,stroke-width:2px,color:#fff
    class State blueClass
    class STM orangeClass
    class LTM greenClass
```
## Short-Term Memory: Simulating Continuity
LLMs don’t have real memory - what looks like memory is constructed by including previous interactions in each prompt. This creates the illusion of continuity.
### Memory Strategies
**1. Full Conversation History.** Send all previous messages with each new input:
```python
messages = [
    {"role": "user", "content": "I'm 35 years old..."},
    {"role": "assistant", "content": "Great! Let me help..."},
    {"role": "user", "content": "I have $50K saved..."},
    {"role": "assistant", "content": "With your savings..."},
    {"role": "user", "content": "What about bonds?"},  # Current turn
]
```
**Pros:** Full context preserved.

**Cons:** Token-heavy, may exceed the context window.
**2. Sliding Window.** Keep only the most recent N turns.
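The original code for this strategy is not shown above, so here is a minimal sketch of the idea; the function name and `window_size` parameter are my assumptions:

```python
def sliding_window(messages: list, window_size: int = 6) -> list:
    """Keep the system message (if any) plus the most recent N turns."""
    system = [m for m in messages if m["role"] == "system"]
    recent = [m for m in messages if m["role"] != "system"]
    return system + recent[-window_size:]
```

Anything outside the window is silently dropped, which is the gap the next strategy fills.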
**3. Summarization.** Compress older turns into a running summary of key facts (the `update_session_summary` method below).

In practice these strategies are combined. The financial advisor assistant in the following excerpts keeps three memory components: a structured client profile, a running session summary, and the raw conversation history. First, a helper on the profile dataclass that returns only the fields captured so far:

```python
from dataclasses import asdict
from typing import Dict

def get_captured_info(self) -> Dict:
    """Return non-empty profile fields"""
    return {
        field: value
        for field, value in asdict(self).items()
        if value and (not isinstance(value, list) or len(value) > 0)
    }
```
The running summary is updated by asking the LLM to fold each new client input into at most five key facts (`llm.complete` here is the post's LLM client wrapper, whose definition isn't shown):

```python
def update_session_summary(self, client_input: str):
    """Update running summary of client's situation"""
    current = "\n".join(f"- {fact}" for fact in self.session_summary)

    prompt = f"""Current session summary:
{current if current else 'None yet'}

Latest client input: "{client_input}"

Update summary to include new information. Keep 3-5 key facts about:
- Client's financial goals and timeline
- Risk preferences and concerns
- Important personal circumstances

Return only bullet points, one per line."""

    response = llm.complete(prompt, temperature=0.3)

    self.session_summary = [
        line.strip()
        for line in response.strip().split('\n')
        if line.strip()
    ][-5:]  # Keep max 5 facts
```
Structured profile fields are pulled out of free-form input the same way:

```python
import json

def extract_profile(self, client_input: str):
    """Extract structured profile from client input"""
    prompt = f"""Extract client financial profile from: "{client_input}"

Current profile: {json.dumps(self.profile.get_captured_info(), indent=2)}

Extract and return ONLY new information as valid JSON:
{{
    "age": number or null,
    "risk_tolerance": "conservative|moderate|aggressive" or null,
    "annual_income": string or null,
    "investment_goals": list of strings or null,
    "time_horizon": string or null
}}

Return valid JSON only."""

    try:
        # The LLM call and JSON parse were elided in the original excerpt;
        # reconstructed here so the try/except reads as intended
        updates = json.loads(llm.complete(prompt, temperature=0.0))

        # Update profile with extracted values
        for field, value in updates.items():
            if value is not None and hasattr(self.profile, field):
                setattr(self.profile, field, value)
    except Exception as e:
        print(f"Profile extraction error: {e}")
```
Response generation then draws on all three memory components:

```python
def generate_response(self, client_input: str) -> str:
    """Generate contextual response using all memory"""
    # memory_context, conversation_context, and captured are assembled
    # from the summary, recent turns, and profile (assembly elided here)
    prompt = f"""You are a professional financial advisor assistant.

{memory_context}

{conversation_context}

Client profile captured:
{json.dumps(captured, indent=2) if captured else 'None yet'}

Client just said: "{client_input}"

Guidelines:
1. Provide helpful, personalized financial guidance
2. Reference previous context naturally
3. NEVER ask for information you already have
4. Be concise and professional

Respond naturally as a financial advisor:"""

    return llm.complete(prompt, temperature=0.7)
```
Each turn runs through the full pipeline:

```python
def process_input(self, client_input: str) -> str:
    """Process input through full memory pipeline"""
    # Body reconstructed from the surrounding excerpts: update the
    # summary and profile, generate a reply, then record the turn
    self.update_session_summary(client_input)
    self.extract_profile(client_input)
    response = self.generate_response(client_input)
    self.conversation_history.append((client_input, response))
    return response
```
A short demo conversation shows the memory accumulating:

```python
def run_advisor_demo():
    # Construction of the assistant object is elided in the original excerpt
    conversation = [
        "Hi, I'm 35 years old and want to start investing for retirement. I have about $50K saved up.",
        "I'm pretty nervous about the stock market. I'd say I'm conservative with risk.",
        "My main goal is retirement in 30 years, but I also want to buy a house in 5 years.",
        "What do you recommend for someone like me?",
    ]

    print("💼 FINANCIAL ADVISOR CONVERSATION")

    for i, client_input in enumerate(conversation, 1):
        print(f"\n👤 CLIENT (Turn {i}): {client_input}")
        # ... each turn goes through assistant.process_input(client_input) ...

        # Show memory state every 2 turns
        if i % 2 == 0:
            print("\n🧠 MEMORY STATE:")
            print(f"   Summary: {assistant.session_summary}")
            print(f"   Profile: {list(assistant.profile.get_captured_info().keys())}")
            print(f"   History: {len(assistant.conversation_history)} turns")

run_advisor_demo()
```
The advisor remembers everything: age (35), savings ($50K), risk tolerance (conservative), goals (retirement + house), and timeline (30 years, 5 years). No repeated questions, just personalized advice building on accumulated context.
Financial applications also need retention limits and the ability to purge sensitive data from memory on demand:

```python
from datetime import datetime
from typing import Any, Optional

def retrieve(self, key: str) -> Optional[Any]:
    if key not in self.data:
        return None

    value, stored_at = self.data[key]

    # Check retention
    age = (datetime.now() - stored_at).total_seconds()
    if age > self.retention_seconds:
        del self.data[key]
        return None

    return value

def clear_sensitive(self):
    """Clear all sensitive data"""
    sensitive_keys = [k for k in self.data.keys()
                      if "ssn" in k.lower() or "account" in k.lower()]
    for key in sensitive_keys:
        del self.data[key]
```
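For completeness, here is a minimal enclosing class these methods could live on; the class name, `store` method, and default retention are my assumptions:

```python
from datetime import datetime
from typing import Any, Dict, Tuple

class RetentionMemory:
    """Hypothetical key-value memory with time-based retention."""

    def __init__(self, retention_seconds: float = 3600):
        self.retention_seconds = retention_seconds
        self.data: Dict[str, Tuple[Any, datetime]] = {}

    def store(self, key: str, value: Any) -> None:
        """Record the value with a timestamp for retention checks."""
        self.data[key] = (value, datetime.now())

# Attach the methods shown above
RetentionMemory.retrieve = retrieve
RetentionMemory.clear_sensitive = clear_sensitive

memory = RetentionMemory(retention_seconds=60)
memory.store("customer_ssn", "123-45-6789")
memory.clear_sensitive()   # purges any key containing "ssn" or "account"
assert memory.retrieve("customer_ssn") is None
```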
## Takeaways
- **LLMs are stateless** - state and memory must be explicitly managed to maintain context across interactions
- **State tracks execution progress** - use state machines with context objects for multi-step workflows like loan approval or claims processing
- **Short-term memory simulates continuity** - through full history, sliding windows, or summarization strategies
- **Multi-component memory works best** - combine structured profiles, running summaries, and conversation history
- **Financial workflows need audit trails** - every state transition should be logged with timestamps and reasons
- **Memory enables personalization** - agents can provide context-aware, non-repetitive responses by remembering what clients have shared
This is the ninth post in my Applied Agentic AI for Finance series. Next: *Connecting Agents to Financial Data Sources*, where we'll explore integrating external APIs, web search, and databases.