Agents become truly useful when they can interact with the real world—fetching live data from APIs, querying databases, and writing results back. This post covers building production-grade integrations: API tools with proper error handling, SQL agents that translate natural language to queries, and security patterns that prevent disasters.
The Grounding Problem
Large language models suffer from a fundamental limitation: they’re trained on static snapshots of text. GPT-4’s training data has a cutoff date. Claude doesn’t know your company’s order database exists. No model can tell you tomorrow’s weather—they can only generate plausible-sounding text about weather.
This creates what we might call the grounding problem: LLMs produce confident answers based on pattern matching, not factual lookup. When a user asks “What’s the status of my order?”, a pure LLM can only hallucinate an answer.
```mermaid
graph LR
    A[User Query] --> B[Agent]
    B --> C{Needs Data?}
    C -->|Yes| D[External System]
    D --> E[API / Database]
    E --> F[Real-time Data]
    F --> B
    C -->|No| G[Response]
    B --> G
    classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff
    classDef orangeClass fill:#F39C12,stroke:#333,stroke-width:2px,color:#fff
    classDef greenClass fill:#27AE60,stroke:#333,stroke-width:2px,color:#fff
    class A,G greenClass
    class B,C orangeClass
    class D,E,F blueClass
```
External integrations solve this by giving agents access to live data. The agent recognizes it needs current information, calls the appropriate tool, and grounds its response in actual facts.
| Integration Type | Purpose | Examples |
|---|---|---|
| REST APIs | Live external data | Weather, stocks, news, third-party services |
| SQL Databases | Structured business data | Orders, customers, inventory, analytics |
| Vector Databases | Semantic search | Document retrieval, knowledge bases, RAG |
| Internal Services | Organization-specific systems | CRM, ticketing, authentication |
API Integration: Beyond Simple HTTP Calls
Building a production API tool requires more than requests.get(). Consider what happens when:
- The API is temporarily down
- Rate limits kick in mid-conversation
- Network latency causes timeouts
- The response format changes unexpectedly
Each failure mode needs handling, and the agent needs to understand what went wrong.
The Anatomy of a Robust API Tool
A production-grade API tool has three layers:
- The HTTP client handles low-level concerns: retries, timeouts, connection pooling
- The tool wrapper exposes a clean interface with typed parameters and documentation
- Error translation converts HTTP failures into messages the LLM can understand
```python
from langchain_core.tools import tool
import requests
from tenacity import retry, stop_after_attempt, wait_exponential

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return _fetch_weather(city)

# Retry transient failures with exponential backoff: 2s, 4s, 8s between attempts.
@retry(stop=stop_after_attempt(4), wait=wait_exponential(multiplier=1, min=2, max=8))
def _fetch_weather(city: str) -> str:
    # Endpoint and response shape are illustrative; substitute your provider's API.
    response = requests.get(
        "https://api.example.com/weather", params={"q": city}, timeout=10
    )
    response.raise_for_status()
    return response.json()["summary"]
```
The tenacity library’s @retry decorator implements exponential backoff—waiting 2, 4, then 8 seconds between retries. This pattern respects rate limits without failing immediately on transient errors.
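The same pattern can be sketched without a dependency. This stdlib version mirrors the stop-after-N-attempts, 2/4/8-second schedule described above; the function and parameter names are my own, and the sleep function is injectable so the delays can be observed without actually waiting:

```python
import time

def retry_with_backoff(func, attempts=4, base_delay=2.0, max_delay=8.0, sleep=time.sleep):
    """Call func, retrying with exponential backoff (2s, 4s, 8s between attempts)."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the final error
            sleep(min(base_delay * 2 ** attempt, max_delay))
```

The injectable `sleep` is a small design choice that pays off in tests: you can record the computed delays instead of sleeping through them.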
Error Messages for LLMs
When an API call fails, the error message matters. The LLM will see this message and decide how to proceed. Compare:
Unhelpful: "HTTPError: 429"
Helpful: "Rate limit exceeded. The weather service allows 60 requests per minute. Please wait a moment before trying again."
The second version gives the LLM context to either wait or inform the user appropriately. Design error messages as if explaining to a colleague who needs to handle the problem.
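One way to implement this translation is a small mapping from status codes to actionable messages. The codes and wording below are illustrative assumptions, not a fixed spec:

```python
# Map HTTP status codes to messages an LLM can act on.
# The wording and retry hints are illustrative; tailor them to your APIs.
ERROR_MESSAGES = {
    401: "Authentication failed. The API key may be missing or expired; inform the user.",
    404: "The requested resource was not found. Check the identifier and try again.",
    429: "Rate limit exceeded. Wait a moment before retrying this tool.",
    503: "The service is temporarily unavailable. Retry shortly or tell the user data is delayed.",
}

def translate_http_error(status_code: int) -> str:
    """Turn a bare HTTP status into a message the LLM can reason about."""
    default = f"The API returned HTTP {status_code}. Treat the data as unavailable for now."
    return ERROR_MESSAGES.get(status_code, default)
```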
Text-to-SQL: Power and Peril
SQL agents represent both the highest potential and highest risk of LLM integrations. A well-designed SQL agent can answer complex analytical questions in natural language. A poorly designed one can DROP TABLE users based on a misunderstood prompt.
Why Text-to-SQL is Valuable
Consider the alternative: every business question requires a developer to write a query. “What were our top-selling products last quarter?” becomes a ticket, waits in a backlog, gets implemented, and returns results days later.
A SQL agent answers immediately. The user asks in natural language; the agent generates SQL, executes it, and explains the results. This democratizes data access without requiring everyone to learn SQL.
The Security Imperative
But LLMs are eager to please. If prompted cleverly—or accidentally—they might generate destructive queries. The agent doesn’t inherently know that DELETE FROM orders is catastrophic while SELECT * FROM orders is safe.
Defense in depth applies here:
- Query validation: Parse and reject non-SELECT statements
- Parameterized templates: Use allowed query patterns, not arbitrary generation
- Row-level security: Filter results by user permissions
- Read-only connections: Database credentials should only allow SELECT
- Audit logging: Record every query for review
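As a concrete instance of the read-only rule, SQLite can open a file in read-only mode through a URI, so writes fail at the driver level no matter what SQL the model generates. The path and schema here are illustrative:

```python
import os
import sqlite3
import tempfile

# Create a throwaway database with one table (illustrative schema).
path = os.path.join(tempfile.mkdtemp(), "orders.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.commit()
conn.close()

# Reopen read-only: mode=ro blocks every write at the driver level.
ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
try:
    ro.execute("DELETE FROM orders")
    blocked = False
except sqlite3.OperationalError:
    blocked = True  # "attempt to write a readonly database"
```

Other databases offer the same guarantee through dedicated read-only roles; the point is that the credential, not the prompt, enforces the policy.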
The Text-to-SQL Pipeline
A safe SQL agent follows a structured pipeline:
```mermaid
graph TD
    A[START] --> B[Generate SQL]
    B --> C[Validate Query]
    C --> D{Safe?}
    D -->|Yes| E[Execute Query]
    D -->|No| F[Explain Rejection]
    E --> G[Format Results]
    F --> G
    G --> H[END]
    classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff
    classDef orangeClass fill:#F39C12,stroke:#333,stroke-width:2px,color:#fff
    classDef greenClass fill:#27AE60,stroke:#333,stroke-width:2px,color:#fff
    classDef redClass fill:#E74C3C,stroke:#333,stroke-width:2px,color:#fff
    class A,H greenClass
    class B,E,G blueClass
    class C,D orangeClass
    class F redClass
```
The validation step is critical. Using a SQL parser like sqlparse, you can inspect the query structure:
```python
import sqlparse

def validate_query(sql: str) -> tuple[bool, str]:
    """Allow only a single SELECT statement; reject everything else."""
    statements = sqlparse.parse(sql)
    if len(statements) != 1:
        return False, "Multiple statements are not allowed."
    statement_type = statements[0].get_type()
    if statement_type != "SELECT":
        return False, f"Only SELECT queries are permitted, got {statement_type}."
    return True, "Query is valid."
```
This validation isn’t foolproof—determined attackers find creative bypasses. But it catches the common cases: accidental destructive queries and basic injection attempts.
Schema as Context
LLMs can’t generate valid SQL without knowing the schema. The query “How many orders last month?” could target orders, purchase_orders, sales_transactions, or something else entirely.
LangChain’s SQLDatabase utility provides schema extraction:
```python
from langchain_community.utilities import SQLDatabase

# Connection string is illustrative; use your database's URI.
db = SQLDatabase.from_uri("sqlite:///orders.db")
print(db.get_table_info())  # CREATE TABLE statements plus sample rows
```
Include this schema in the LLM prompt. The model needs to know what tables exist, their columns, and relationships. For large schemas, consider selective inclusion—only show tables relevant to the current query.
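The same idea works without LangChain by querying the database catalog directly. This sqlite3 sketch (table and column names are illustrative) collects CREATE TABLE statements to paste into the prompt:

```python
import sqlite3

def get_schema_context(conn: sqlite3.Connection) -> str:
    """Collect CREATE TABLE statements to include in the LLM prompt."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n\n".join(row[0] for row in rows)

# Illustrative schema in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
schema = get_schema_context(conn)
```

For Postgres or MySQL, the equivalent is a query against `information_schema.columns`; the selective-inclusion idea then becomes a filter on table names before joining the statements.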
Combining Data Sources
Real-world agents often need multiple sources. A customer service agent might check:
- The database for order history
- An API for shipping carrier status
- Another API for weather affecting delivery
The Routing Pattern
Rather than always querying everything, intelligent routing examines the question and selects appropriate sources:
```python
from typing import TypedDict

class HybridAgentState(TypedDict):
    # Minimal state shape for this example; real state carries more fields.
    question: str
    sources: list[str]

def route_query(state: HybridAgentState) -> dict:
    """Select data sources by keyword matching on the user's question."""
    question = state["question"].lower()
    sources = []
    if any(word in question for word in ("order", "purchase", "refund")):
        sources.append("database")
    if any(word in question for word in ("shipping", "delivery", "tracking")):
        sources.append("carrier_api")
    if "weather" in question:
        sources.append("weather_api")
    return {"sources": sources or ["database"]}
```
This heuristic approach works for clear cases. For ambiguous queries, you might use the LLM itself to classify the query type before routing.
Result Synthesis
When data comes from multiple sources, the agent must synthesize coherently. A shipping status query might combine:
- Database: Order details, items, customer info
- Carrier API: Current location, estimated delivery
- Weather API: Conditions at delivery address
The synthesis step presents these as a unified narrative, not disconnected data dumps.
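A minimal synthesis step might look like the following sketch; the field names and phrasing are assumptions, not a fixed schema:

```python
def synthesize(order: dict, shipment: dict, weather: dict) -> str:
    """Merge per-source results into one narrative (field names are illustrative)."""
    parts = [
        f"Order {order['id']} ({order['status']}) contains {order['item_count']} item(s).",
        f"It is currently in {shipment['location']}, estimated delivery {shipment['eta']}.",
    ]
    if weather.get("severe"):
        parts.append(
            f"Note: {weather['summary']} near the delivery address may cause delays."
        )
    return " ".join(parts)
```

In practice the final wording usually comes from the LLM itself; a deterministic assembly step like this is still useful as a fallback and as a structured input to that final generation.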
Audit and Observability
Every external call should be logged. This serves multiple purposes:
- Debugging: When something fails, you can trace exactly what happened
- Compliance: Regulated industries require audit trails
- Cost tracking: API calls cost money; know where it goes
- Security: Detect unusual patterns that might indicate misuse
```python
import logging

logger = logging.getLogger("agent.audit")
logging.basicConfig(level=logging.INFO)

def log_tool_call(tool_name: str, args: dict, duration_ms: float, success: bool) -> None:
    """Record one external call for debugging, compliance, and cost analysis."""
    logger.info(
        "tool_call tool=%s args=%s duration_ms=%.1f success=%s",
        tool_name, args, duration_ms, success,
    )
```
In production, these logs flow to centralized systems (Datadog, Splunk, CloudWatch) where you can search, alert, and analyze patterns.
Key Takeaways
External integration solves the grounding problem: LLMs can’t know live data—tools bridge this gap with real-time information.
API tools need resilience: Implement retries with exponential backoff, meaningful error messages, and timeout handling. APIs fail; your agent shouldn’t.
Text-to-SQL requires defense in depth: Query validation, read-only connections, row-level security, and audit logging. Never trust LLM-generated SQL without verification.
Schema context enables valid queries: LLMs need to see database structure to generate correct SQL. Include relevant schema in prompts.
Route queries intelligently: Not every question needs every source. Classify queries and fetch only relevant data.
Audit everything: Log all external operations for debugging, compliance, and security analysis.
Next: Agentic RAG and Human-in-the-Loop with LangGraph - We’ll explore dynamic retrieval strategies, self-correcting RAG, and adding human approval to sensitive agent workflows.