Connecting Agents to the World - External APIs and Data

Nov 30 2025 AI agentic-ai

An agent that can only process text is fundamentally limited. Real usefulness comes from connecting to external systems - fetching live data, querying databases, calling APIs, and triggering actions in the real world. In this post, I’ll explore how to build these connections, turning isolated language models into integrated systems that can actually get things done.

The Integration Challenge

LLMs excel at understanding and generating text, but they operate in a vacuum. They don’t know:

Today’s weather or stock prices
Your company’s current inventory
The user’s calendar events
What happened after their training cutoff

Tools bridge this gap, but designing reliable integrations requires careful thought about authentication, error handling, rate limiting, and data transformation.

flowchart LR
    A[Agent] --> T{Tool Layer}
    T --> W[Web Search]
    T --> D[Databases]
    T --> API[External APIs]
    T --> S[Services]

    W --> R[Results]
    D --> R
    API --> R
    S --> R

    R --> A

    classDef orangeClass fill:#F39C12,stroke:#333,stroke-width:2px,color:#fff
    classDef blueClass fill:#4A90E2,stroke:#333,stroke-width:2px,color:#fff

    class A orangeClass
    class T blueClass

Web Search Integration

Web search gives agents access to current information beyond their training data. Here’s how to integrate search capabilities:

from langchain_community.tools import TavilySearchResults
from langchain_core.tools import tool

# Using Tavily (optimized for LLM use)
search_tool = TavilySearchResults(max_results=5)

@tool
def web_search(query: str) -> str:
    """
    Search the web for current information.

    Args:
        query: The search query

    Returns:
        Search results as formatted text
    """
    results = search_tool.invoke({"query": query})

    # Format results for the LLM
    formatted = []
    for result in results:
        formatted.append(f"Title: {result['title']}")
        formatted.append(f"URL: {result['url']}")
        formatted.append(f"Content: {result['content'][:500]}")
        formatted.append("---")

    return "\n".join(formatted)
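
The formatting loop above assumes every result is a dict with `title`, `url`, and `content` keys. Depending on the search wrapper and its version, a field may be missing or the call may return an error string instead of a list. A minimal defensive sketch, assuming a hypothetical helper named `format_results` (not part of LangChain or Tavily):

```python
from typing import Any, List


def format_results(results: Any, max_chars: int = 500) -> str:
    """Format search results for the LLM without assuming every key exists."""
    if not isinstance(results, list):
        # Some wrappers hand back an error string instead of a list of hits.
        return f"Search returned no structured results: {results}"

    blocks: List[str] = []
    for item in results:
        if not isinstance(item, dict):
            continue  # skip anything that is not a result record
        blocks.append(
            f"Title: {item.get('title', '(untitled)')}\n"
            f"URL: {item.get('url', '(no url)')}\n"
            f"Content: {str(item.get('content', ''))[:max_chars]}"
        )
    return "\n---\n".join(blocks) if blocks else "No results found."
```

Returning a readable message for the empty and error cases gives the model something it can reason about, rather than an exception that ends the tool call.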

Building a Research Agent

from openai import OpenAI
import json

client = OpenAI()

class ResearchAgent:
    def __init__(self):
        self.tools = [web_search]
        self.tool_map = {t.name: t for t in self.tools}

    def research(self, topic: str) -> str:
        messages = [
            {
                "role": "system",
                "content": """You are a research assistant. Use web search
                to find current, accurate information. Always cite sources."""
            },
            {"role": "user", "content": f"Research this topic: {topic}"}
        ]

        # Tool execution loop
        while True:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
                tools=self._get_tool_schemas(),
                temperature=0
            )

            message = response.choices[0].message

            if not message.tool_calls:
                return message.content

            messages.append(message)

            for tool_call in message.tool_calls:
                func_name = tool_call.function.name
                func_args = json.loads(tool_call.function.arguments)

                result = self.tool_map[func_name].invoke(func_args)

                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": str(result)
                })

    def _get_tool_schemas(self) -> list:
        return [
            {
                "type": "function",
                "function": {
                    "name": tool.name,
                    "description": tool.description,
                    "parameters": tool.args_schema.schema()
                }
            }
            for tool in self.tools
        ]

Database Connections

Agents often need to query structured data. Here’s a pattern for SQL database access:

import sqlite3
from typing import List, Dict, Any

@tool
def query_database(sql: str) -> str:
    """
    Execute a read-only SQL query against the database.

    Args:
        sql: SELECT query to execute (no modifications allowed)

    Returns:
        Query results as formatted text
    """
    # Safety check - only allow SELECT
    if not sql.strip().upper().startswith("SELECT"):
        return "Error: Only SELECT queries are allowed"

    try:
        conn = sqlite3.connect("company.db")
        try:
            cursor = conn.cursor()
            cursor.execute(sql)
            columns = [desc[0] for desc in cursor.description]
            rows = cursor.fetchall()
        finally:
            # Close the connection even if the query fails
            conn.close()

        # Format as readable table
        result = " | ".join(columns) + "\n"
        result += "-" * 40 + "\n"
        for row in rows:
            result += " | ".join(str(v) for v in row) + "\n"

        return result

    except Exception as e:
        return f"Query error: {str(e)}"


@tool
def get_schema() -> str:
    """
    Get the database schema to understand available tables and columns.

    Returns:
        Schema description
    """
    conn = sqlite3.connect("company.db")
    cursor = conn.cursor()

    cursor.execute(
        "SELECT name FROM sqlite_master WHERE type='table'"
    )
    tables = cursor.fetchall()

    schema = []
    for (table_name,) in tables:
        cursor.execute(f"PRAGMA table_info({table_name})")
        columns = cursor.fetchall()
        col_info = [f"{col[1]} ({col[2]})" for col in columns]
        schema.append(f"{table_name}: {', '.join(col_info)}")

    conn.close()
    return "\n".join(schema)
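
The `startswith("SELECT")` check in `query_database` is a string-level guard. A complementary safeguard is to open the connection itself in read-only mode, so the database engine rejects writes no matter what the SQL looks like. A minimal sketch, assuming SQLite and a hypothetical helper named `open_read_only`:

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open an existing SQLite database file in read-only mode."""
    # as_uri() builds a file: URI; the mode=ro parameter makes SQLite refuse writes.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

With a connection opened this way, an `INSERT` or `DROP` raises `sqlite3.OperationalError` instead of modifying the data. Other databases offer similar options, typically a dedicated user with read-only privileges.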

Natural Language to SQL

Enable agents to translate natural language to SQL:

class DatabaseAgent:
    def __init__(self):
        self.tools = [query_database, get_schema]

    def query(self, question: str) -> str:
        messages = [
            {
                "role": "system",
                "content": """You are a database analyst. Convert natural language
                questions into SQL queries. Always check the schema first.

                Steps:
                1. Use get_schema to understand available tables
                2. Write appropriate SQL query
                3. Execute with query_database
                4. Summarize results for the user"""
            },
            {"role": "user", "content": question}
        ]

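        # _execute_with_tools is not shown here; it is assumed to be the same
        # tool execution loop used in ResearchAgent.research above
        return self._execute_with_tools(messages)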
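
`_execute_with_tools` is not defined in the snippet above. One way to fill it in is a bounded version of the tool execution loop from the research agent. The sketch below is a simplified, hypothetical stand-in: `call_model` represents the chat-completion call, and messages and tool calls are plain dicts rather than the OpenAI SDK's objects.

```python
import json
from typing import Callable, Dict, List


def run_tool_loop(
    call_model: Callable[[List[dict]], dict],
    tools: Dict[str, Callable[..., object]],
    messages: List[dict],
    max_iterations: int = 5,
) -> str:
    """Alternate between the model and its tools until the model stops requesting tools."""
    for _ in range(max_iterations):
        reply = call_model(messages)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply.get("content") or ""

        messages.append(reply)
        for call in tool_calls:
            func = tools.get(call["name"])
            if func is None:
                result = f"Unknown tool: {call['name']}"
            else:
                try:
                    # Arguments arrive as a JSON string, as in the agents above.
                    result = str(func(**json.loads(call["arguments"])))
                except Exception as exc:
                    result = f"Tool error: {exc}"
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": result}
            )

    return "Reached maximum iterations without completing task"
```

Passing the model call in as a function keeps the loop testable without network access: a fake `call_model` can script the sequence "ask for the schema, then answer".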

REST API Integration

Many services expose REST APIs. Here’s a robust pattern for API integration:

import os
import time
from functools import wraps
from typing import Optional

import requests

def with_retry(max_attempts: int = 3, backoff_factor: float = 2.0):
    """Decorator for retry logic with exponential backoff"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_exception = None
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except requests.RequestException as e:
                    last_exception = e
                    if attempt < max_attempts - 1:
                        sleep_time = backoff_factor ** attempt
                        time.sleep(sleep_time)
            raise last_exception
        return wrapper
    return decorator


class APIClient:
    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url
        self.session = requests.Session()
        self.session.headers.update({
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        })

    @with_retry(max_attempts=3)
    def get(self, endpoint: str, params: Optional[dict] = None) -> dict:
        response = self.session.get(
            f"{self.base_url}/{endpoint}",
            params=params,
            timeout=30
        )
        response.raise_for_status()
        return response.json()

    @with_retry(max_attempts=3)
    def post(self, endpoint: str, data: dict) -> dict:
        response = self.session.post(
            f"{self.base_url}/{endpoint}",
            json=data,
            timeout=30
        )
        response.raise_for_status()
        return response.json()


# Create tools from API client
weather_client = APIClient(
    base_url="https://api.weather.example.com",
    api_key=os.environ["WEATHER_API_KEY"]
)

@tool
def get_weather(city: str) -> dict:
    """
    Get current weather for a city.

    Args:
        city: City name

    Returns:
        Weather data including temperature and conditions
    """
    try:
        return weather_client.get("current", params={"city": city})
    except requests.RequestException as e:
        return {"error": f"Failed to fetch weather: {str(e)}"}
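
Two details of the `with_retry` decorator are worth making explicit. First, the sleep time is `backoff_factor ** attempt`, so the defaults wait 1 second and then 2 seconds between the three attempts. Second, `raise_for_status()` raises `requests.HTTPError`, which is a subclass of `requests.RequestException`, so client errors such as 401 or 404 are retried too even though retrying rarely helps. The sketch below works through the delay arithmetic and shows one possible retry policy; `backoff_delays` and `is_retryable` are hypothetical helpers, and the choice of retryable status codes is an assumption.

```python
from typing import List, Optional


def backoff_delays(max_attempts: int = 3, backoff_factor: float = 2.0) -> List[float]:
    """Sleep durations between attempts; there is no sleep after the final attempt."""
    return [backoff_factor ** attempt for attempt in range(max_attempts - 1)]


def is_retryable(status_code: Optional[int]) -> bool:
    """Assumed policy: retry when there was no response, on 429, and on 5xx."""
    if status_code is None:
        # No HTTP response at all, e.g. a timeout or connection error.
        return True
    return status_code == 429 or 500 <= status_code < 600


print(backoff_delays())        # [1.0, 2.0]
print(backoff_delays(5, 2.0))  # [1.0, 2.0, 4.0, 8.0]
print(is_retryable(401))       # False
print(is_retryable(503))       # True
```

A check like `is_retryable` could be called inside the `except` branch of the decorator to re-raise immediately for errors that will not succeed on a second try.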

File System Operations

Agents may need to read and write files:

from pathlib import Path

@tool
def read_file(file_path: str) -> str:
    """
    Read contents of a text file.

    Args:
        file_path: Path to the file to read

    Returns:
        File contents
    """
    path = Path(file_path)

    # Security: restrict to allowed directories
    allowed_dirs = [Path("./data"), Path("./documents")]
    if not any(path.resolve().is_relative_to(d.resolve()) for d in allowed_dirs):
        return "Error: Access denied - path outside allowed directories"

    if not path.exists():
        return f"Error: File not found: {file_path}"

    try:
        return path.read_text()
    except Exception as e:
        return f"Error reading file: {str(e)}"


@tool
def write_file(file_path: str, content: str) -> str:
    """
    Write content to a text file.

    Args:
        file_path: Path to the file to write
        content: Content to write

    Returns:
        Success or error message
    """
    path = Path(file_path)

    # Security check
    allowed_dirs = [Path("./output")]
    if not any(path.resolve().is_relative_to(d.resolve()) for d in allowed_dirs):
        return "Error: Access denied - can only write to output directory"

    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)
        return f"Successfully wrote to {file_path}"
    except Exception as e:
        return f"Error writing file: {str(e)}"
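
The directory check in both tools depends on calling `resolve()` before `is_relative_to()` (available from Python 3.9). Resolving collapses `..` segments and follows symlinks, so a path that merely starts with an allowed prefix cannot escape the directory. A small sketch of the same check in isolation, using a hypothetical helper named `is_allowed`:

```python
from pathlib import Path
from typing import Iterable


def is_allowed(file_path: str, allowed_dirs: Iterable[Path]) -> bool:
    """Return True only if the resolved path sits inside an allowed directory."""
    resolved = Path(file_path).resolve()
    return any(resolved.is_relative_to(d.resolve()) for d in allowed_dirs)


allowed = [Path("./data"), Path("./documents")]
print(is_allowed("data/report.txt", allowed))      # True
print(is_allowed("data/../secrets.txt", allowed))  # False
print(is_allowed("/etc/passwd", allowed))          # False
```

Comparing raw strings (for example `file_path.startswith("data/")`) would accept the second path, which is why the tools compare resolved paths instead.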

Building a Multi-Tool Agent

The integrations above can be combined into a single, more capable agent:

class IntegratedAgent:
    def __init__(self):
        self.tools = [
            web_search,
            query_database,
            get_schema,
            get_weather,
            read_file,
            write_file
        ]
        self.tool_map = {t.name: t for t in self.tools}

    def run(self, task: str) -> str:
        messages = [
            {
                "role": "system",
                "content": """You are a capable assistant with access to multiple tools:

                - web_search: Find current information online
                - query_database/get_schema: Query company database
                - get_weather: Get weather information
                - read_file/write_file: Work with files

                Use tools when needed. Chain multiple tools for complex tasks.
                Always explain what you're doing and why."""
            },
            {"role": "user", "content": task}
        ]

        max_iterations = 10
        for _ in range(max_iterations):
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
                tools=self._get_tool_schemas(),
                temperature=0
            )

            message = response.choices[0].message

            if not message.tool_calls:
                return message.content

            messages.append(message)

            for tool_call in message.tool_calls:
                result = self._execute_tool(tool_call)
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": result
                })

        return "Reached maximum iterations without completing task"

    def _execute_tool(self, tool_call) -> str:
        func_name = tool_call.function.name
        func_args = json.loads(tool_call.function.arguments)

        if func_name not in self.tool_map:
            return f"Unknown tool: {func_name}"

        try:
            result = self.tool_map[func_name].invoke(func_args)
            return str(result)
        except Exception as e:
            return f"Tool error: {str(e)}"

Security Considerations

External integrations introduce security risks. Essential safeguards:

Input Validation

from pydantic import BaseModel, validator

class QueryInput(BaseModel):
    sql: str

    @validator('sql')
    def validate_sql(cls, v):
        # Block dangerous keywords
        dangerous = ['DROP', 'DELETE', 'UPDATE', 'INSERT', 'ALTER', 'TRUNCATE']
        upper = v.upper()
        for keyword in dangerous:
            if keyword in upper:
                raise ValueError(f"Dangerous SQL keyword detected: {keyword}")
        return v
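
One caveat with the substring check above: `keyword in upper` also matches inside longer identifiers, so a harmless query that selects an `updated_at` column is rejected because it contains `UPDATE`. A sketch of a whole-word variant using a regular expression follows; `find_dangerous_keyword` is a hypothetical helper, and it is still only a heuristic since it will also flag keywords that appear inside string literals.

```python
import re

DANGEROUS = ("DROP", "DELETE", "UPDATE", "INSERT", "ALTER", "TRUNCATE")
# \b anchors each keyword at word boundaries, so UPDATED_AT does not match UPDATE.
_PATTERN = re.compile(r"\b(" + "|".join(DANGEROUS) + r")\b", re.IGNORECASE)


def find_dangerous_keyword(sql: str):
    """Return the first blocked keyword that appears as a whole word, else None."""
    match = _PATTERN.search(sql)
    return match.group(1).upper() if match else None


print(find_dangerous_keyword("SELECT updated_at FROM orders"))  # None
print(find_dangerous_keyword("SELECT 1; drop table orders"))    # DROP
```

Keyword filtering works best as one layer among several; database-level permissions remain the stronger control.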

Rate Limiting

from functools import wraps
from collections import defaultdict
import time

class RateLimiter:
    def __init__(self, calls_per_minute: int = 60):
        self.calls_per_minute = calls_per_minute
        self.calls = defaultdict(list)

    def is_allowed(self, key: str) -> bool:
        now = time.time()
        minute_ago = now - 60

        # Clean old calls
        self.calls[key] = [t for t in self.calls[key] if t > minute_ago]

        if len(self.calls[key]) >= self.calls_per_minute:
            return False

        self.calls[key].append(now)
        return True

rate_limiter = RateLimiter(calls_per_minute=30)

def rate_limited(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        if not rate_limiter.is_allowed(func.__name__):
            raise Exception("Rate limit exceeded")
        return func(*args, **kwargs)
    return wrapper
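
The limiter above is a sliding window: it keeps the timestamps of recent calls per key and rejects a call once the window is full. To see that behaviour without waiting a real minute, the clock can be made injectable. The following is a hypothetical variant written for illustration (the class name and the `clock` parameter are assumptions, not part of the code above):

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Sliding-window rate limiter with an injectable clock."""

    def __init__(self, max_calls: int, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self.calls: Dict[str, Deque[float]] = defaultdict(deque)

    def is_allowed(self, key: str) -> bool:
        now = self.clock()
        window = self.calls[key]
        # Drop timestamps that have aged out of the window.
        while window and window[0] <= now - self.window_seconds:
            window.popleft()
        if len(window) >= self.max_calls:
            return False
        window.append(now)
        return True


fake_now = [0.0]
limiter = SlidingWindowLimiter(max_calls=2, clock=lambda: fake_now[0])
print(limiter.is_allowed("get_weather"))  # True
print(limiter.is_allowed("get_weather"))  # True
print(limiter.is_allowed("get_weather"))  # False
fake_now[0] = 61.0                        # advance the fake clock past the window
print(limiter.is_allowed("get_weather"))  # True
```

Using a monotonic clock by default avoids surprises when the system time is adjusted, and a deque makes expiring old timestamps cheap.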

Credential Management

import os
from dataclasses import dataclass

@dataclass
class APICredentials:
    """Securely manage API credentials"""

    @staticmethod
    def get(service: str) -> str:
        """Get credential from environment"""
        key = f"{service.upper()}_API_KEY"
        value = os.environ.get(key)
        if not value:
            raise ValueError(f"Missing credential: {key}")
        return value

# Usage in tools
weather_key = APICredentials.get("weather")  # Reads WEATHER_API_KEY

Error Handling Patterns

Robust error handling is critical for production agents:

from dataclasses import dataclass
from enum import Enum
from functools import wraps
from typing import Any, Optional

import requests

class ToolErrorType(Enum):
    NETWORK = "network"
    AUTHENTICATION = "authentication"
    RATE_LIMIT = "rate_limit"
    VALIDATION = "validation"
    NOT_FOUND = "not_found"
    UNKNOWN = "unknown"

@dataclass
class ToolResult:
    success: bool
    data: Any = None
    error_type: Optional[ToolErrorType] = None
    error_message: Optional[str] = None

    def to_string(self) -> str:
        if self.success:
            return str(self.data)
        return f"Error ({self.error_type.value}): {self.error_message}"


def safe_tool_execution(func):
    """Wrapper that catches exceptions and returns structured results"""
    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            result = func(*args, **kwargs)
            return ToolResult(success=True, data=result)
        except requests.Timeout:
            return ToolResult(
                success=False,
                error_type=ToolErrorType.NETWORK,
                error_message="Request timed out"
            )
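        except requests.HTTPError as e:
            status = e.response.status_code if e.response is not None else None
            if status == 429:
                return ToolResult(
                    success=False,
                    error_type=ToolErrorType.RATE_LIMIT,
                    error_message="Rate limit exceeded, try again later"
                )
            elif status == 401:
                return ToolResult(
                    success=False,
                    error_type=ToolErrorType.AUTHENTICATION,
                    error_message="Authentication failed"
                )
            # Any other HTTP status: still return a structured error
            return ToolResult(
                success=False,
                error_type=ToolErrorType.UNKNOWN,
                error_message=f"HTTP error: {status}"
            )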
        except Exception as e:
            return ToolResult(
                success=False,
                error_type=ToolErrorType.UNKNOWN,
                error_message=str(e)
            )
    return wrapper

Key Takeaways

Layer your integrations: Separate tool definitions from business logic and error handling
Always handle failures: Network calls fail, APIs rate limit, databases time out
Security is non-negotiable: Validate inputs, restrict access, protect credentials
Provide context to LLMs: Format external data clearly so the model can use it effectively
Log everything: External calls should be logged for debugging and monitoring
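
The "log everything" takeaway can be sketched as a decorator that records each tool call's name, duration, and outcome. This is a minimal illustration using the standard `logging` module; the decorator name `logged`, the logger name, and the `lookup` function are hypothetical. Arguments are deliberately left out of the log lines, since they may contain credentials or user data.

```python
import functools
import logging
import time

logger = logging.getLogger("agent.tools")


def logged(func):
    """Log the name, duration, and outcome of each call to the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception records the traceback along with the message.
            logger.exception("tool=%s failed after %.3fs",
                             func.__name__, time.perf_counter() - start)
            raise
        logger.info("tool=%s ok in %.3fs",
                    func.__name__, time.perf_counter() - start)
        return result
    return wrapper


@logged
def lookup(city: str) -> str:
    # Stand-in for a real external call.
    return f"weather for {city}"
```

The decorator re-raises after logging, so it can sit underneath whatever error-handling wrapper turns exceptions into structured results.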

With external connections, agents transform from text processors into capable systems that can research, query, and act. In the next post, I’ll explore agentic RAG - dynamically retrieving information to augment agent knowledge - and strategies for evaluating agent performance.

This is Part 10 of my series on building intelligent AI systems. Next: agentic RAG and agent evaluation strategies.

#llm #multi-agent #python #api
