Agents become truly useful when they stop just reasoning and start acting - interacting with real systems like market data feeds, financial news sources, and internal databases. This shift happens through three key integration patterns: external APIs for real-time data, web search for unstructured information, and database connections for structured internal data. Mastering these patterns transforms agents from conversational tools into operational financial systems.
Why External Data Matters
An LLM alone can’t answer “What’s Apple’s current stock price?” or “What are today’s best credit card offers?” - this information isn’t in its training data and changes constantly. Financial agents need access to:
Real-time market data: Stock prices, FX rates, bond yields
Current news and research: Earnings reports, regulatory filings, market analysis
Internal structured data: Account records, transactions, support tickets
flowchart LR
USER[User Query] --> AGENT[Financial Agent]
AGENT --> API[Market APIs]
AGENT --> SEARCH[Web Search]
AGENT --> DB[Databases]
API --> AGENT
SEARCH --> AGENT
DB --> AGENT
AGENT --> RESPONSE[Informed Response]
classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff
class AGENT blueClass
External APIs: Real-Time Market Data
APIs (Application Programming Interfaces) are structured ways for software systems to communicate. For financial agents, APIs provide access to live market data, pricing information, and execution capabilities.
The API Pattern
flowchart LR
A[Agent] --> |HTTP Request| B[API Endpoint]
B --> |JSON Response| A
A --> C[Process & Respond]
Think of it like placing an order at a restaurant: the agent formats a request, sends it to the API (the kitchen), and receives a response (the meal).
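In code, that round trip is just an HTTP request and a JSON payload. A minimal sketch, assuming a hypothetical market-data endpoint (swap in your provider's URL, auth, and response shape):

import requests

# Place the "order": an HTTP GET with the symbol as a query parameter
# (placeholder endpoint - substitute your market-data provider)
response = requests.get(
    "https://api.example-marketdata.com/v1/quote",
    params={"symbol": "AAPL"},
    timeout=10,
)
response.raise_for_status()

# Receive the "meal": a JSON payload the agent can reason over,
# e.g. {"symbol": "AAPL", "price": ...}  (assumed shape)
quote = response.json()
print(quote["price"])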
Building a Stock Price Agent
Let’s build an agent that fetches live stock prices and calculates investment costs:
def extract_investment_info(self, user_input: str) -> InvestmentRequest:
    """Extract investment details from natural language using LLM"""

    prompt = f"""Extract investment information from this input: "{user_input}"

    Return JSON with these fields:
    - ticker: stock symbol (string or null)
    - shares: number of shares (integer or null)
    - broker_fee_pct: broker fee percentage (number or null)
    - commission_per_trade: flat commission (number or null)

    Return only valid JSON."""
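Once the fields are extracted, pricing the trade is plain arithmetic on top of the live quote. A minimal sketch, assuming an InvestmentRequest dataclass that mirrors the prompt's fields and the quote payload fetched above:

from dataclasses import dataclass
from typing import Optional

@dataclass
class InvestmentRequest:
    ticker: Optional[str] = None
    shares: Optional[int] = None
    broker_fee_pct: Optional[float] = None
    commission_per_trade: Optional[float] = None

def calculate_investment_cost(request: InvestmentRequest, price_per_share: float) -> dict:
    """Price the trade: shares * live price, plus percentage fee and flat commission."""
    gross = price_per_share * (request.shares or 0)
    fee = gross * (request.broker_fee_pct or 0) / 100
    commission = request.commission_per_trade or 0
    return {
        "price_per_share": price_per_share,
        "gross_cost": round(gross, 2),
        "total_cost": round(gross + fee + commission, 2),
    }

# e.g. "Buy 10 AAPL shares with a 0.5% broker fee and $4.95 commission"
request = InvestmentRequest(ticker="AAPL", shares=10, broker_fee_pct=0.5, commission_per_trade=4.95)
costs = calculate_investment_cost(request, price_per_share=quote["price"])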
Web Search Agents: Unstructured Information
Not all information lives in structured APIs. Credit card offers, analyst opinions, regulatory news - these exist in unstructured web content. Web search agents retrieve and synthesize this information.
Why Web Search?
When asked “What are the best no-fee credit cards right now?”, the answer is scattered across news sites, comparison sites, and financial blogs. Web search lets the agent pull in those pages and synthesize an up-to-date answer.
def search_web(self, query: str, num_results: int = 5) -> List[SearchResult]:
    """Search web using Tavily API"""
    response = tavily_client.search(
        query=query,
        search_depth="advanced",
        max_results=num_results
    )

    results = []
    for item in response.get("results", []):
        results.append(SearchResult(
            title=item["title"],
            url=item["url"],
            snippet=item["content"],
            domain=self._extract_domain(item["url"])
        ))
    return results
def filter_reputable_sources(
    self, results: List[SearchResult]
) -> List[SearchResult]:
    """Keep only results from trusted financial sources"""
    return [
        r for r in results
        if r.domain in self.reputable_domains
    ]
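Used together, these two steps narrow a broad query down to trusted sources. An illustrative call, assuming the methods above live on a search agent object and reputable_domains is a configured set of comparison and bank sites:

# Illustrative usage - method names match the snippets above
results = agent.search_web("best no-fee credit cards right now", num_results=8)
trusted = agent.filter_reputable_sources(results)
print(f"{len(trusted)} of {len(results)} results came from trusted domains")

From the trusted snippets, the agent then extracts structured card data: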
def extract_card_data(
    self, results: List[SearchResult]
) -> List[CreditCardRecord]:
    """Use LLM to extract structured card data from snippets"""

    combined_text = "\n\n".join([
        f"Source: {r.title}\nContent: {r.snippet}"
        for r in results
    ])

    prompt = f"""Extract credit card information from these sources:

    {combined_text}

    For each card mentioned, extract:
    - card_name: Full name of the card
    - issuer: Bank or company
    - rewards_rate: Main rewards rate
    - annual_fee: Annual fee amount
    - bonus_offer: Sign-up bonus if mentioned
    - best_for: What spending category it's best for

    Return as JSON array."""
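The snippet above only builds the prompt; parsing the LLM's JSON array back into typed records might look like this sketch, assuming a CreditCardRecord dataclass with the fields named in the prompt:

import json
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CreditCardRecord:
    card_name: str
    issuer: str
    rewards_rate: Optional[str] = None
    annual_fee: Optional[str] = None
    bonus_offer: Optional[str] = None
    best_for: Optional[str] = None
    source_url: Optional[str] = None

def parse_card_json(raw: str) -> List[CreditCardRecord]:
    """Turn the LLM's JSON array into typed records, skipping malformed entries."""
    records = []
    for item in json.loads(raw):
        try:
            records.append(CreditCardRecord(**item))
        except TypeError:
            continue  # unexpected or missing fields - drop rather than crash
    return records

The parsed records (cards in the next snippet) then feed the comparison step: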
cards_json = json.dumps([c.__dict__ for c in cards[:5]], indent=2)

prompt = f"""Create a credit card comparison based on:

User query: {user_query}

Cards found:
{cards_json}

Provide:
1. Brief intro (2-3 sentences)
2. Key takeaway about best choice
3. Standard disclaimer about rates changing

Return as JSON with fields: intro, takeaway, disclaimer"""

# `summary` is the parsed JSON returned by the LLM for the prompt above
return CardComparison(
    intro=summary["intro"],
    top_cards=cards[:5],
    takeaway=summary["takeaway"],
    sources=[{"title": c.source_url} for c in cards[:5]],
    disclaimer=summary["disclaimer"]
)
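End to end, the web search agent can be driven like this (compare_cards is an assumed name for the orchestration method that chains search, filter, extract, and summarize):

comparison = agent.compare_cards("What are the best no-fee credit cards right now?")

print(comparison.intro)
for card in comparison.top_cards:
    print(f"- {card.card_name} ({card.issuer}): {card.rewards_rate}, annual fee {card.annual_fee}")
print(comparison.takeaway)
print(comparison.disclaimer)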
Source Attribution: Cite where each claim came from

response = f"""Based on {source.title}:

{summary}

Source: {source.url}"""
Recency Filtering: Prefer recent content
from datetime import datetime, timedelta

def filter_recent(results: List[SearchResult], days: int = 30):
    """Keep only results published within the last `days` days."""
    cutoff = datetime.now() - timedelta(days=days)
    return [r for r in results if r.published_date > cutoff]
Database Agents: Structured Internal Data
For questions about internal systems (“What’s the status of ticket #3042?” or “How many users signed up this week?”), agents need database access. Text-to-SQL enables natural language queries against structured data.
The Text-to-SQL Pattern
flowchart LR
Q[Natural Language Question] --> GEN[Generate SQL]
GEN --> VAL[Validate & Sanitize]
VAL --> EXEC[Execute Query]
EXEC --> SUM[Summarize Results]
SUM --> R[User Response]
SUM --> R[User Response]
classDef orangeClass fill:#F39C12,stroke:#333,stroke-width:2px,color:#fff
class VAL orangeClass
error_context = ""
if previous_error:
    error_context = f"\nPrevious attempt failed: {previous_error}\nPlease correct the error."

prompt = f"""Convert this question to SQL:

Question: {question}

Database Schema:
{self.get_schema_description()}

Rules:
1. Use only SELECT statements
2. Always include LIMIT unless counting
3. Use proper JOINs based on relationships
4. Handle date/time with appropriate functions
{error_context}

Return ONLY the SQL query, no explanation."""
# 1. Only allow SELECT
if not sql_upper.startswith('SELECT'):
    return (
        "SELECT 'Error: Only SELECT queries allowed' as error;",
        ["Query rejected - only SELECT allowed"]
    )

# 2. Block dangerous keywords
forbidden = ['INSERT', 'UPDATE', 'DELETE', 'DROP', 'CREATE', 'ALTER']
for keyword in forbidden:
    if keyword in sql_upper:
        return (
            f"SELECT 'Error: {keyword} not allowed' as error;",
            [f"Query rejected - {keyword} not allowed"]
        )

# 3. Ensure LIMIT is present
if 'LIMIT' not in sql_upper:
    sql = sql.rstrip(';') + ' LIMIT 100;'
    assumptions.append("Added LIMIT 100 for performance")

return sql, assumptions
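A few illustrative inputs show how these checks behave (validate_and_sanitize is an assumed name for the method the snippet above belongs to):

# Plain read: passes, but gets a LIMIT appended
sql, notes = agent.validate_and_sanitize("SELECT * FROM transactions")
# -> "SELECT * FROM transactions LIMIT 100;", ["Added LIMIT 100 for performance"]

# Anything that isn't a SELECT is rejected outright
sql, notes = agent.validate_and_sanitize("DELETE FROM transactions WHERE status = 'failed'")
# -> "SELECT 'Error: Only SELECT queries allowed' as error;", ["Query rejected - only SELECT allowed"]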
def execute_query(self, sql: str) -> Tuple[pd.DataFrame, int]:
    """Execute query and return results"""
    with self.engine.connect() as conn:
        result = conn.execute(text(sql))
        df = pd.DataFrame(result.fetchall(), columns=result.keys())
        return df, len(df)
def generate_summary(
    self, question: str, sql: str, data: pd.DataFrame, assumptions: List[str]
) -> str:
    """Generate natural language summary of results"""

    data_preview = data.head(10).to_string()

    prompt = f"""Summarize these database query results:

    Original Question: {question}
    SQL Executed: {sql}
    Assumptions Made: {assumptions}

    Results ({len(data)} rows):
    {data_preview}

    Provide a 2-3 sentence summary answering the original question."""
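Stitched together, the four steps form a single process_question loop that can feed a database error back to the SQL generator for a corrected attempt. A sketch, assuming a QueryResult dataclass and the generate_sql / validate_and_sanitize method names used above:

from dataclasses import dataclass, field
from typing import List, Optional
import pandas as pd

@dataclass
class QueryResult:
    sql: str
    summary: str
    data: Optional[pd.DataFrame] = None
    assumptions: List[str] = field(default_factory=list)

def process_question(self, question: str, max_retries: int = 2) -> QueryResult:
    """Generate SQL, validate it, execute it, and summarize - retrying on errors."""
    previous_error = None
    for _ in range(max_retries + 1):
        sql = self.generate_sql(question, previous_error)
        sql, assumptions = self.validate_and_sanitize(sql)
        try:
            data, _row_count = self.execute_query(sql)
        except Exception as exc:
            previous_error = str(exc)  # feed the error back for a corrected attempt
            continue
        summary = self.generate_summary(question, sql, data, assumptions)
        return QueryResult(sql=sql, summary=summary, data=data, assumptions=assumptions)
    return QueryResult(sql="", summary=f"Query failed after retries: {previous_error}")

With the pipeline in place, the agent handles a range of questions: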
# Query 1: Aggregation
result = agent.process_question(
    "How many transactions were processed this week?"
)
print(result.summary)
# "This week saw 1,247 transactions processed, with a total
#  value of $2.3M. Most were standard debits and credits."

# Query 2: Joining tables
result = agent.process_question(
    "Show me the top 5 customers by transaction volume"
)
print(result.data)

# Safety test: Blocked operation
result = agent.process_question(
    "Delete all failed transactions"
)
print(result.summary)
# "Error: DELETE operations not allowed"
Database Security Essentials
Read-Only Access: Use restricted database credentials
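In practice that means connecting with a role that has SELECT and nothing else, so even a query that slips past the sanitizer cannot modify data. A sketch (the DSN, host, and role name are illustrative):

from sqlalchemy import create_engine

# Dedicated read-only role: grant SELECT only, at the database level.
# Connection string and credentials here are placeholders.
readonly_engine = create_engine(
    "postgresql+psycopg2://agent_readonly:change-me@db-host:5432/finance",
    pool_pre_ping=True,  # drop stale connections before the agent uses them
)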
This is the tenth post in my Applied Agentic AI for Finance series. Next: RAG and Evaluation for Financial Agents, where we’ll explore retrieval-augmented generation and performance metrics.