
Building Custom Tools for CrewAI Agents

In the last post, we built a basic crew with simple tool decorators. Real agents need more than that.

The difference between an effective agent and a useless one often comes down to tools. A researcher agent with access to web search, document retrieval, and fact-checking capabilities becomes genuinely useful. Without proper tools, it’s just an expensive chatbot making things up.

This post dives into tool design—the overlooked art of giving agents the right capabilities in the right way.

Tool Design Principles

Before writing code, understand what makes tools work well for agents.

1. Clear Input/Output Contracts

An agent understands your tool through its name, docstring, and parameter types. A vague tool confuses the agent’s decision-making.

Bad:

@tool("process")
def process(data):
    """Process the data."""
    # Agent has no idea what this does or how to use it

Good:

@tool("analyze_sentiment")
def analyze_sentiment(text: str) -> str:
    """Analyze the sentiment of provided text.
    
    Returns: JSON string with 'sentiment' (positive/negative/neutral)
    and 'confidence' (0.0-1.0).
    """

Tools must return strings—the output feeds directly back into the LLM as text. Return json.dumps(data) for structured data, not raw Python dicts.
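
As a minimal sketch of that rule, here is a toy keyword-based scorer. The function name score_sentiment and its word lists are illustrative assumptions, not part of CrewAI. It builds a dict internally and serializes it just before returning, so the caller always receives text:

```python
import json

# Hypothetical word lists, for illustration only.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}

def score_sentiment(text: str) -> str:
    """Return a JSON string with 'sentiment' and 'confidence'."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    if pos > neg:
        result = {"sentiment": "positive", "confidence": pos / total}
    elif neg > pos:
        result = {"sentiment": "negative", "confidence": neg / total}
    else:
        result = {"sentiment": "neutral", "confidence": 0.0}
    # Serialize before returning: the agent receives text, not a dict.
    return json.dumps(result)
```

In a real tool, the same body would sit under the @tool decorator with the docstring shown above.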

2. Single Responsibility

Tools should do one thing. A “do_everything” tool creates decision paralysis.

Bad: A tool that searches, summarizes, translates, and fact-checks in one call.

Good: Separate search_web, summarize_text, check_facts.

An agent becomes effective when it chains focused tools together.

3. Graceful Error Handling

Tools fail. APIs time out. Databases go down. Your tool must return meaningful error information as a string, not crash.

import json

import requests
from crewai.tools import tool  # decorator lives in crewai.tools, not crewai_tools

@tool("fetch_stock_price")
def fetch_stock_price(ticker: str) -> str:
    """Fetch current stock price for a given ticker.
    
    Args:
        ticker: Stock symbol (e.g., 'AAPL')
        
    Returns:
        JSON string with price, currency, and timestamp, or an error message.
    """
    try:
        response = requests.get(f"https://api.stocks.com/price/{ticker}", timeout=5)
        response.raise_for_status()
        return json.dumps(response.json())
    except requests.Timeout:
        return f"Error: Timeout fetching {ticker}. Try again later."
    except requests.HTTPError as e:
        if e.response.status_code == 404:
            return f"Error: Ticker {ticker} not found."
        return f"Error: API returned {e.response.status_code}."
    except Exception as e:
        return f"Error: Unexpected failure: {str(e)}"

When the agent sees an error string, it can retry, pivot, or escalate. That’s infinitely better than a crashed workflow.

4. Performance Awareness

Agents are expensive. A tool that takes 30 seconds stalls the whole run, and a tool that returns a massive dataset blows through the token budget fast.

  • Cache results when possible (exchange rates, weather, static data)
  • Paginate API responses
  • Return structured summaries, not raw data dumps
  • Log performance to catch slow tools early
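
The last two bullets can be sketched as a small decorator. Everything here is an assumption for illustration: the name timed, the 2-second warning threshold, and the character cap are not CrewAI features. It logs how long each call took and truncates oversized string results:

```python
import functools
import logging
import time

logger = logging.getLogger("tool_perf")

def timed(slow_after: float = 2.0, max_chars: int = 2000):
    """Log call duration and cap the size of the returned string."""
    def decorator(fn):
        @functools.wraps(fn)  # keep the name, docstring, and annotations
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            elapsed = time.perf_counter() - start
            # Slow calls are logged at WARNING so they stand out.
            level = logging.WARNING if elapsed > slow_after else logging.INFO
            logger.log(level, "%s took %.2fs", fn.__name__, elapsed)
            if len(result) > max_chars:
                result = result[:max_chars] + "\n[truncated]"
            return result
        return wrapper
    return decorator

@timed(max_chars=50)
def list_products() -> str:
    """Hypothetical tool body that returns far more text than needed."""
    return "\n".join(f"product-{i}" for i in range(100))
```

Blind truncation is a blunt instrument. Where the data allows it, a real summary (counts, top results) is likely to serve the agent better than a cut-off dump.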

Building Custom Tools

Let’s move beyond the basics to real-world patterns.

Pattern 1: API Wrapper

Wrapping external APIs is the most common use case. Here’s a clean approach separating the client from the tool:

import time

import requests
from crewai.tools import tool

class NewsAPIClient:
    """Wraps News API with rate limiting."""
    
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://newsapi.org/v2"
        self.last_request_time = 0
        self.min_interval = 0.1  # Rate limit: 10 requests/sec
    
    def search(self, query: str, limit: int = 5) -> list:
        # Sleep just long enough to keep min_interval between requests
        elapsed = time.time() - self.last_request_time
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        
        params = {
            "q": query,
            "sortBy": "publishedAt",
            "pageSize": limit,
            "apiKey": self.api_key
        }
        
        try:
            response = requests.get(
                f"{self.base_url}/everything",
                params=params,
                timeout=10
            )
            response.raise_for_status()
            articles = response.json().get("articles", [])
            self.last_request_time = time.time()
            
            return [
                {
                    "title": a["title"],
                    "source": a["source"]["name"],
                    "url": a["url"],
                    "published": a["publishedAt"]
                }
                for a in articles
            ]
        except requests.RequestException as e:
            return [{"error": f"News search failed: {str(e)}"}]

news_client = NewsAPIClient(api_key="your_key_here")

@tool("search_news")
def search_news(query: str, limit: int = 5) -> str:
    """Search for recent news articles on a topic.
    
    Args:
        query: Topic to search for (e.g., 'AI regulations')
        limit: Number of articles to return (1-5)
        
    Returns:
        Formatted list of article titles, sources, and URLs.
    """
    # Clamp to the documented 1-5 range; min() alone would let 0 or negatives through
    results = news_client.search(query, max(1, min(limit, 5)))
    
    if results and "error" in results[0]:
        return f"Error: {results[0]['error']}"
    
    formatted = "\n".join([
        f"• {r['title']}\n  Source: {r['source']}\n  {r['url']}"
        for r in results
    ])
    
    return f"Found {len(results)} articles:\n\n{formatted}"

The client handles API details (auth, rate limiting). The tool wraps the client with a clean string interface for the agent.

Pattern 2: Database Query Tool

Agents often need data from internal systems:

import sqlite3
import json
from contextlib import contextmanager
from typing import Optional
from crewai.tools import tool

class DatabaseQuery:
    
    def __init__(self, db_path: str):
        self.db_path = db_path
        self.allowed_tables = {"users", "products", "orders"}
    
    @contextmanager
    def get_connection(self):
        conn = sqlite3.connect(self.db_path)
        try:
            yield conn
        finally:
            conn.close()
    
    def query(self, table: str, filters: dict) -> list:
        if table not in self.allowed_tables:
            raise ValueError(f"Access denied to table '{table}'")
        
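        # Column names are interpolated into the SQL text, so accept plain identifiers only
        for column in filters:
            if not column.isidentifier():
                raise ValueError(f"Invalid column name '{column}'")
        where_clause = " AND ".join([f"{k} = ?" for k in filters.keys()])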
        query_str = f"SELECT * FROM {table}"
        if where_clause:
            query_str += f" WHERE {where_clause}"
        
        with self.get_connection() as conn:
            cursor = conn.cursor()
            cursor.execute(query_str, tuple(filters.values()))
            rows = cursor.fetchall()
            cols = [d[0] for d in cursor.description]
            return [dict(zip(cols, row)) for row in rows]

db = DatabaseQuery("/data/app.db")

@tool("query_users")
def query_users(email: Optional[str] = None, status: Optional[str] = None) -> str:
    """Query the user database.
    
    Args:
        email: Email address to search for (optional)
        status: User status filter, e.g. 'active' or 'suspended' (optional)
        
    Returns:
        Formatted list of matching users, or an error message.
    """
    filters = {}
    if email:
        filters["email"] = email
    if status:
        filters["status"] = status
    
    try:
        results = db.query("users", filters)
    except Exception as e:  # covers the ValueError from the table check and sqlite3 errors
        return f"Error: {str(e)}"
    
    if not results:
        return "No users found matching the criteria."
    
    return "\n".join([
        f"User: {u['name']} ({u['email']}) - Status: {u['status']}"
        for u in results
    ])

Note Optional[str] on the parameters—CrewAI infers the tool’s JSON schema from type annotations, so missing Optional tells the LLM these are required fields.

Pattern 3: Tool Composition

Sometimes you want to expose a higher-level capability that combines multiple data sources. The key is to extract the logic into plain helper functions and have the composed tool call those helpers. If the agent also needs the individual capabilities, wrap the same helpers in their own thin tools.

import json
from crewai.tools import tool

# Plain helper functions: not decorated, just regular Python.
# news_client is the NewsAPIClient instance created in Pattern 1.
def _fetch_news(company: str) -> str:
    """Internal news fetch — call the news client directly."""
    results = news_client.search(f"{company} company news")
    if results and "error" in results[0]:
        return f"News unavailable: {results[0]['error']}"
    return "\n".join([f"• {r['title']} ({r['source']})" for r in results[:3]])

def _fetch_stock(ticker: str) -> str:
    """Internal stock fetch — stub; replace with real API."""
    return json.dumps({"pe_ratio": 15.2, "revenue_growth": 0.23})

# Composed tool calls helpers, not other @tool-decorated functions
@tool("research_company")
def research_company(company_name: str) -> str:
    """Research a company using multiple data sources.
    
    Args:
        company_name: Company name or stock ticker (e.g., 'Apple' or 'AAPL')
        
    Returns:
        Combined news and financial summary.
    """
    news = _fetch_news(company_name)
    stock_info = _fetch_stock(company_name)
    
    return f"Company: {company_name}\n\nRecent News:\n{news}\n\nFinancials:\n{stock_info}"

This gives agents a high-level research capability without needing to chain individual tools.

Integrating LangChain Tools

CrewAI plays nicely with LangChain’s tool ecosystem. If you already have LangChain tools, reuse them directly:

# Modern LangChain imports (post langchain 0.1 split)
from langchain_core.tools import Tool
from langchain_google_community import GoogleSearchAPIWrapper
from crewai import Agent, LLM

search = GoogleSearchAPIWrapper()
search_tool = Tool(
    name="Google Search",
    func=search.run,
    description="Search Google for recent information on any topic"
)

# Pass to CrewAI agent — no conversion needed
researcher = Agent(
    role="Research Analyst",
    goal="Find relevant information",
    tools=[search_tool],
    llm=LLM(model="gpt-4o")
)

Note the updated imports: langchain_core.tools and langchain_google_community replaced the old langchain.tools / langchain.utilities paths after the 0.1 package split.

Common Pitfalls

1. Returning dicts from tools. Tools must return strings. Return json.dumps(data) for structured data—the LLM reads tool output as text.

2. Calling decorated tools internally. A @tool-decorated function is a Tool object, not a plain function. Inside other tools or helpers, call the underlying logic directly (extract it to a helper function first).

3. Over-verbose responses. Don’t return full API responses. Summarize. Agents drown in data and token costs spike.

4. No timeouts. An API can hang indefinitely. Always set timeout= on HTTP requests.

5. Missing Optional types. Parameters annotated as str = None without Optional[str] tell the LLM they’re required fields. Use Optional[str] = None with a from typing import Optional import.

6. Tools that raise exceptions. An unhandled exception in a tool crashes the agent. Catch errors and return them as strings instead.
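
Pitfall 6 can be handled once instead of in every tool. The sketch below is one possible approach, not a CrewAI feature, and the decorator name returns_errors_as_text is made up for this example. It wraps a function so any exception comes back as an error string:

```python
import functools

def returns_errors_as_text(fn):
    """Convert any exception raised by fn into an 'Error: ...' string."""
    @functools.wraps(fn)  # keep the name, docstring, and annotations
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            return f"Error: {fn.__name__} failed: {type(exc).__name__}: {exc}"
    return wrapper

@returns_errors_as_text
def divide(a: float, b: float) -> str:
    """Divide a by b and return the result as text."""
    return str(a / b)

print(divide(6, 3))  # normal result
print(divide(1, 0))  # error message instead of a traceback
```

A catch-all like this is a safety net. Specific handlers, such as the timeout and 404 branches in fetch_stock_price, still give the agent more useful messages.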

Performance Considerations

As agents scale, tool performance matters.

Caching with TTL:

Simple lru_cache has no expiration—it caches forever. For time-sensitive data, use cachetools:

import requests
from cachetools import TTLCache, cached
from crewai.tools import tool

# Cache up to 128 entries, each valid for 5 minutes
_rate_cache = TTLCache(maxsize=128, ttl=300)

@cached(_rate_cache)
def _get_exchange_rate_cached(currency_pair: str) -> float:
    """Fetch exchange rate with automatic 5-minute TTL."""
    response = requests.get(f"https://api.exchangerate.host/{currency_pair}", timeout=5)
    response.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    return response.json()["rate"]

@tool("get_exchange_rate")
def get_exchange_rate(currency_pair: str) -> str:
    """Get the current exchange rate for a currency pair (e.g., 'USD/EUR').
    
    Results are cached for 5 minutes to reduce API calls.
    """
    try:
        rate = _get_exchange_rate_cached(currency_pair)
        return f"Exchange rate for {currency_pair}: {rate}"
    except Exception as e:
        return f"Error fetching rate for {currency_pair}: {str(e)}"

Async Parallel Fetching:

When a tool needs data from multiple endpoints, fetch them concurrently:

import asyncio
import json
from typing import List

import aiohttp
from crewai.tools import tool

async def _fetch_url(session: aiohttp.ClientSession, url: str) -> dict:
    """Fetch a single URL, returning parsed JSON or an error dict."""
    try:
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as resp:
            return await resp.json()
    except Exception as e:
        return {"error": str(e), "url": url}

async def _fetch_all(urls: List[str]) -> List[dict]:
    """Fetch all URLs in parallel."""
    async with aiohttp.ClientSession() as session:
        tasks = [_fetch_url(session, url) for url in urls]
        return await asyncio.gather(*tasks)

@tool("bulk_price_check")
def bulk_price_check(tickers: str) -> str:
    """Fetch stock prices for multiple tickers in parallel.
    
    Args:
        tickers: Comma-separated list of tickers (e.g., 'AAPL,MSFT,GOOG')
        
    Returns:
        Prices for all requested tickers.
    """
    ticker_list = [t.strip() for t in tickers.split(",")]
    urls = [f"https://api.stocks.com/price/{t}" for t in ticker_list]
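    # asyncio.run() starts its own event loop and raises RuntimeError
    # if one is already running in this thread
    results = asyncio.run(_fetch_all(urls))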
    return json.dumps(dict(zip(ticker_list, results)))

Batching:

Instead of one tool call per item, accept a list, passed here as a JSON array string:

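import json
from crewai.tools import tool

def _analyze_single(text: str) -> dict:
    """Placeholder scorer (an assumption for this example); swap in a real sentiment model."""
    return {"text": text, "sentiment": "neutral", "confidence": 0.0}

@tool("analyze_multiple_texts")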
def analyze_multiple_texts(texts_json: str) -> str:
    """Analyze sentiment for multiple texts in a single call.
    
    Args:
        texts_json: JSON array of strings to analyze (e.g., '["text1", "text2"]')
        
    Returns:
        JSON array of sentiment results.
    """
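    try:
        texts = json.loads(texts_json)
    except json.JSONDecodeError:
        return "Error: texts_json must be a valid JSON array of strings."
    # Valid JSON that is not an array (e.g. a bare string) would otherwise be iterated per character
    if not isinstance(texts, list):
        return "Error: texts_json must be a valid JSON array of strings."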
    
    results = [_analyze_single(text) for text in texts]
    return json.dumps(results)

What’s Next

Tools are the foundation of agent effectiveness. Design them well, and your agents shine. Design them poorly, and no orchestration framework saves you.

The next post in this series will cover memory and state management—how agents learn from past executions and maintain context across complex workflows.

For now: focus on clear contracts, string returns, error handling, and performance. Test your tools with the agents that will use them. Iterate on the interface until it feels natural.
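
Before putting a tool in front of an agent, it helps to check its string contract in isolation. The sketch below assumes the tool logic lives in a plain function that takes its data source as an argument. The names format_articles, fake_fetch, and failing_fetch are hypothetical:

```python
from typing import Callable, List

def format_articles(fetch: Callable[[str], List[dict]], query: str) -> str:
    """Tool logic with the data source injected, so tests can pass a fake."""
    try:
        results = fetch(query)
    except Exception as exc:
        return f"Error: {exc}"
    if not results:
        return "No articles found."
    return "\n".join(f"- {r['title']} ({r['source']})" for r in results)

def fake_fetch(query: str) -> List[dict]:
    """Stand-in for a real API client."""
    return [{"title": f"{query} update", "source": "Example Wire"}]

def failing_fetch(query: str) -> List[dict]:
    """Stand-in for an API that times out."""
    raise TimeoutError("upstream timed out")

def test_formats_results():
    assert format_articles(fake_fetch, "AI") == "- AI update (Example Wire)"

def test_reports_errors_as_text():
    assert format_articles(failing_fetch, "AI") == "Error: upstream timed out"

if __name__ == "__main__":
    test_formats_results()
    test_reports_errors_as_text()
    print("ok")
```

Tests like these only cover the contract. Whether the agent picks the tool at the right moment still has to be checked by running the crew.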


This is part 2 of the CrewAI series. Part 1: Getting Started with CrewAI