LLM tool use has moved from experimental to essential. Every major framework now supports function calling, and the patterns for connecting language models to real-world systems are converging. But which patterns actually work in production, and which ones create more problems than they solve?
This guide covers the top five tool use patterns teams are deploying in 2025, with concrete implementations and honest trade-offs for each.
## Key Takeaways
- Function calling is the foundation — everything else builds on structured tool definitions
- ReAct (Reason + Act) loops remain the most reliable pattern for multi-step tasks
- Parallel tool calling cuts latency dramatically for independent operations
- MCP (Model Context Protocol) is emerging as the standard for tool discovery and integration
- The right pattern depends on your latency budget, error tolerance, and task complexity
## 1. Direct Function Calling
The simplest pattern: define a function schema, let the model decide when to call it, return the result. OpenAI, Anthropic, and Google all support this natively.
```python
import openai

client = openai.OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for current information",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
                "num_results": {"type": "integer", "default": 5}
            },
            "required": ["query"]
        }
    }
}]
```
```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the latest on the Rust language ecosystem?"}],
    tools=tools,
)

# If the model wants to call a tool
if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    ...  # Execute the tool, return the result, continue the conversation
```
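The comment above glosses over the second half of the loop: executing the call and feeding the result back. A minimal local-dispatch sketch follows; the `web_search` stub, `registry` mapping, and dict-shaped tool call are illustrative, not part of any SDK. The one detail worth internalizing is that `arguments` arrives as a JSON string, not a dict.

```python
import json

def execute_tool_call(tool_call, registry):
    """Dispatch a tool call: parse JSON arguments, invoke the named function."""
    fn = registry[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])  # JSON string, not a dict
    return fn(**args)

def web_search(query, num_results=5):
    # Stand-in for a real search backend
    return f"{num_results} results for {query!r}"

call = {
    "id": "call_1",
    "function": {"name": "web_search",
                 "arguments": json.dumps({"query": "rust ecosystem"})},
}
result = execute_tool_call(call, {"web_search": web_search})

# Append the result as a "tool" role message, then call the API again
followup = {"role": "tool", "tool_call_id": call["id"], "content": result}
```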
Best for: Single-step lookups, API calls, structured data retrieval
Trade-off: Each tool call adds a round trip — expensive for multi-step reasoning
## 2. ReAct Loop (Reason + Act)
The workhorse pattern for agents. The model reasons about what to do, takes an action, observes the result, and repeats until the task is done. LangChain, CrewAI, and most agent frameworks implement variations of this.
```python
import json

from searchhive import SearchHiveClient

client = SearchHiveClient(api_key="your_key")

def react_loop(task: str, max_steps: int = 5):
    # `call_llm` and `tool_result` are thin wrappers around your provider's API
    messages = [{"role": "user", "content": task}]
    for step in range(max_steps):
        # Model reasons about its next action
        response = call_llm(messages, tools=available_tools)
        if not response.tool_calls:
            return response.content  # Final answer
        # Execute each tool call and feed the observation back
        for tool_call in response.tool_calls:
            if tool_call.function.name == "web_search":
                # Arguments arrive as a JSON string, not a dict
                args = json.loads(tool_call.function.arguments)
                result = client.search(args["query"])
                messages.append(tool_result(tool_call.id, str(result)))
    return "Max steps reached without final answer"
```
Best for: Research tasks, multi-step workflows, anything requiring external data
Trade-off: Slow — each step is an API call. Can loop if the model gets stuck. Always set max_steps.
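A step cap is the blunt instrument; a cheap complement is a repetition guard that bails out when the model keeps issuing the same call. The sketch below assumes a hypothetical `history` list of `(tool_name, arguments)` tuples appended on each loop iteration; nothing here comes from any SDK.

```python
def is_stuck(history, window=3):
    """Return True if the last `window` tool calls are identical —
    a cheap signal that the model is repeating itself."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return all(call == recent[0] for call in recent)

# Three identical searches in a row: time to bail out
history = [("web_search", '{"query": "rust news"}')] * 3
stuck = is_stuck(history)       # True — loop detected
ok = is_stuck(history[:2])      # False — too short to judge
```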
## 3. Parallel Tool Calling
When a task requires multiple independent lookups, fire them all at once instead of sequentially. GPT-4o and Claude both support parallel tool calls in a single response:
```python
import asyncio

# Model returns multiple tool calls at once for a prompt like:
# "Find the pricing for ScrapingBee, Oxylabs, and ZenRows"
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
)

# Execute all calls concurrently
async def execute_parallel(tool_calls):
    tasks = [execute_tool(call) for call in tool_calls]
    return await asyncio.gather(*tasks)
```
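To make the latency win concrete, here is a self-contained sketch in which each tool call is simulated with a 0.1 s sleep (the `execute_tool` body is a stand-in for real I/O such as an HTTP request): three sequential calls would take roughly 0.3 s, but `asyncio.gather` overlaps them into roughly one call's worth of wall time.

```python
import asyncio
import time

async def execute_tool(call):
    # Stand-in for a real I/O-bound tool call (search API, page fetch, ...)
    await asyncio.sleep(0.1)
    return f"result for {call['name']}"

async def execute_parallel(tool_calls):
    return await asyncio.gather(*(execute_tool(c) for c in tool_calls))

calls = [{"name": n} for n in ("scrapingbee", "oxylabs", "zenrows")]
start = time.perf_counter()
results = asyncio.run(execute_parallel(calls))
elapsed = time.perf_counter() - start  # roughly 0.1 s, not 0.3 s
```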
With SearchHive's ScrapeForge API, you can parallelize web scraping across multiple URLs — batch up to 10 pages per call:
```python
results = client.scrape_batch(
    urls=[
        "https://scrapingbee.com/pricing",
        "https://oxylabs.io/pricing",
        "https://zenrows.com/pricing",
    ],
    format="markdown",
)
```
Best for: Comparison tasks, batch lookups, any scenario with independent sub-queries
Trade-off: Higher token usage per turn. Model needs to understand independence — won't parallelize correctly if tool B depends on tool A's output.
## 4. Tool Routing with Orchestrator
Instead of giving the model every tool, use a lightweight router that selects the right tool subset for the task. This reduces token waste and improves accuracy:
```python
def route_tools(query: str) -> list[dict]:
    """Select relevant tools based on query intent."""
    query_lower = query.lower()
    selected = []
    if any(w in query_lower for w in ["search", "find", "look up", "latest"]):
        selected.append(search_tool)
    if any(w in query_lower for w in ["scrape", "extract", "crawl", "page content"]):
        selected.append(scrape_tool)
    if any(w in query_lower for w in ["research", "analyze", "deep dive", "compare"]):
        selected.append(research_tool)
    return selected or [search_tool]  # Default to search

# Fewer tools = cheaper, more accurate calls
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=route_tools(user_query),
)
```
Best for: Systems with many tools (10+), cost-sensitive deployments, reducing hallucinated tool calls
Trade-off: Router can miss edge cases. A keyword-based router is simple but brittle — consider using a small embedding model for semantic routing.
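The semantic-routing idea can be sketched without any model at all: embed each tool description once, embed the query, and pick the nearest by cosine similarity. The bag-of-words `embed` below is a deliberate placeholder for a small sentence-embedding model, and the tool names and descriptions are illustrative, not any library's API.

```python
import math
from collections import Counter

def embed(text):
    # Placeholder embedding: bag-of-words counts. In production, swap in a
    # small sentence-embedding model and keep the rest of the code unchanged.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

TOOL_DESCRIPTIONS = {
    "search_tool": "search the web for current information",
    "scrape_tool": "extract content from a specific page url",
    "research_tool": "deep multi source research and comparison",
}

def route_semantic(query, top_k=1):
    """Rank tools by similarity between the query and each description."""
    q = embed(query)
    ranked = sorted(TOOL_DESCRIPTIONS,
                    key=lambda t: cosine(q, embed(TOOL_DESCRIPTIONS[t])),
                    reverse=True)
    return ranked[:top_k]
```

Because the tool descriptions double as routing targets, adding a tool means adding one dictionary entry rather than another keyword list.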
## 5. MCP (Model Context Protocol) Integration
MCP is the emerging standard for tool discovery. Instead of hardcoding tool schemas, the model discovers available tools at runtime from MCP servers. Anthropic built it, and the ecosystem is growing fast.
```json
{
  "mcpServers": {
    "searchhive": {
      "command": "npx",
      "args": ["-y", "@searchhive/mcp-server"],
      "env": {
        "SEARCHHIVE_API_KEY": "your_key"
      }
    }
  }
}
```
Once configured, Claude (and other MCP-compatible models) automatically discovers and uses SearchHive's tools — SwiftSearch, ScrapeForge, DeepDive — without manual schema definitions.
Best for: IDE integrations, extensible tool ecosystems, zero-config tool use
Trade-off: New protocol — not all models support it yet. Debugging MCP connections can be opaque.
## Quick Comparison
| Pattern | Latency | Complexity | Token Cost | Best For |
|---|---|---|---|---|
| Direct Function Calling | Low | Low | Low | Single lookups |
| ReAct Loop | High | Medium | Medium | Multi-step research |
| Parallel Calling | Low-Med | Medium | Medium | Batch operations |
| Tool Routing | Low | Medium | Low | Large tool sets |
| MCP Integration | Varies | Low (setup) | Varies | Extensible systems |
## Which Pattern Should You Use?
Start with direct function calling. It solves 80% of use cases — lookups, API integrations, structured data retrieval. When you hit multi-step tasks that the single-call pattern can't handle, move to ReAct. Add parallel calls when latency matters. Add routing when your tool list grows past 10. And start experimenting with MCP now — it's where the ecosystem is heading.
SearchHive's APIs are designed to work across all these patterns. SwiftSearch for instant lookups, ScrapeForge for batch page extraction, DeepDive for multi-source research. The Python SDK supports both sync and async, making it a drop-in tool for any agent framework. Check the integration docs for LangChain, CrewAI, and MCP setup guides.