How to Integrate LLM Search: Step-by-Step Guide for 2026
LLM search integration connects large language models to real-time web data, reducing hallucinations and grounding responses in current information. Whether you're building a RAG pipeline, an AI agent, or a chatbot that needs up-to-date answers, integrating search into your LLM workflow is a fundamental requirement.
This guide walks through building a complete LLM search integration using SearchHive's APIs and OpenAI's function calling -- no complex framework needed.
Key Takeaways
- LLM search integration grounds model outputs in real-time data, sharply reducing hallucinations on time-sensitive questions
- SearchHive SwiftSearch provides web results; ScrapeForge extracts full page content for RAG
- OpenAI function calling creates a clean search-as-a-tool pattern
- The complete pipeline: user query, search, extract context, augment prompt, generate response
- Works with any LLM that supports function/tool calling (OpenAI, Claude, Gemini, Llama)
Prerequisites
- Python 3.8+
- OpenAI API key (or Anthropic/Llama API for alternative LLMs)
- SearchHive API key -- sign up free for 500 credits
- Basic understanding of LLM APIs and function calling
pip install openai requests
Step 1: Set Up SearchHive as a Search Tool
Define a search function that wraps SearchHive's SwiftSearch API. This becomes the tool your LLM can call.
import requests
import json

SEARCHHIVE_API_KEY = "your-searchhive-api-key"
BASE = "https://api.searchhive.dev/v1"

def web_search(query, limit=5):
    """Search the web using SearchHive SwiftSearch."""
    response = requests.get(
        f"{BASE}/swiftsearch",
        headers={"Authorization": f"Bearer {SEARCHHIVE_API_KEY}"},
        params={
            "query": query,
            "limit": limit,
            "fresh": "month",
        },
    )
    data = response.json()
    results = []
    for r in data.get("results", []):
        results.append({
            "title": r.get("title", ""),
            "url": r.get("url", ""),
            "snippet": r.get("snippet", ""),
        })
    return results
Step 2: Add Content Extraction for RAG
Search snippets are often too short for RAG. Use ScrapeForge to extract full page content from the most relevant results.
def extract_page_content(url):
    """Extract full page content using SearchHive ScrapeForge."""
    response = requests.post(
        f"{BASE}/scrapeforge",
        headers={"Authorization": f"Bearer {SEARCHHIVE_API_KEY}"},
        json={
            "url": url,
            "format": "markdown",
            "wait_for": 2000,
        },
    )
    data = response.json()
    return data.get("content", "")

def search_and_extract(query, max_pages=3):
    """Search and extract content from top results."""
    results = web_search(query, limit=5)
    # Extract content from the most relevant pages
    contexts = []
    for r in results[:max_pages]:
        try:
            content = extract_page_content(r["url"])
            contexts.append({
                "title": r["title"],
                "url": r["url"],
                "content": content[:3000],  # truncate to manage token usage
            })
        except Exception as e:
            print(f"Error extracting {r['url']}: {e}")
    return contexts
Step 3: Integrate with OpenAI Function Calling
Define the search tool schema and wire it into OpenAI's chat completions API.
from openai import OpenAI

client = OpenAI()  # uses OPENAI_API_KEY env var

# Define the search tool for function calling
search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for current information. Use this for questions about recent events, current prices, up-to-date data, or anything that may have changed after your training cutoff.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query to find relevant information.",
                }
            },
            "required": ["query"],
        },
    },
}

def handle_tool_call(tool_call):
    """Execute a tool call from the LLM."""
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments)
    if function_name == "web_search":
        return web_search(arguments["query"])
    else:
        return "Unknown tool"

def build_context_message(search_results):
    """Build a context message from search results."""
    if not search_results:
        return ""
    context_parts = []
    for r in search_results:
        context_parts.append(f"### {r['title']}\n{r['snippet']}\nSource: {r['url']}")
    return "## Search Results\n\n" + "\n\n".join(context_parts)
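To see the shape of the augmented context without making any API calls, you can run build_context_message on stubbed results. The results below are made up purely for illustration, not real SearchHive output:

```python
def build_context_message(search_results):
    """Build a context message from search results (same function as above)."""
    if not search_results:
        return ""
    context_parts = []
    for r in search_results:
        context_parts.append(f"### {r['title']}\n{r['snippet']}\nSource: {r['url']}")
    return "## Search Results\n\n" + "\n\n".join(context_parts)

# Stubbed results -- illustrative only
stub = [
    {"title": "Example Post", "snippet": "A short snippet.", "url": "https://example.com"},
]
context = build_context_message(stub)
print(context)
```

The resulting string can be prepended to the user's question or passed as a system message, which is the "augment prompt" step of the pipeline.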
Step 4: Build the Complete RAG Pipeline
Combine search, extraction, and LLM generation into a complete pipeline.
def chat_with_search(user_message, conversation_history=None):
    """Chat with an LLM that can search the web when needed."""
    if conversation_history is None:
        conversation_history = []

    # Initial messages
    messages = conversation_history + [
        {"role": "user", "content": user_message}
    ]

    # First LLM call -- decide whether to search
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=[search_tool],
        tool_choice="auto",
    )
    message = response.choices[0].message

    # If the LLM wants to search, execute the search and call again
    if message.tool_calls:
        # Append the assistant message once, then one tool result per call
        messages.append(message)
        for tool_call in message.tool_calls:
            print(f"Searching: {json.loads(tool_call.function.arguments)['query']}")
            search_results = handle_tool_call(tool_call)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(search_results),
            })

        # Second LLM call with search context
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
        )
        message = response.choices[0].message

    return message.content
Step 5: Add DeepDive Research for Complex Queries
For complex research tasks, use SearchHive's DeepDive API for deeper analysis across multiple sources.
def deep_research(query):
    """Perform deep research using SearchHive DeepDive."""
    response = requests.post(
        f"{BASE}/deepdive",
        headers={"Authorization": f"Bearer {SEARCHHIVE_API_KEY}"},
        json={
            "query": query,
            "depth": "comprehensive",
            "max_sources": 10,
        },
    )
    return response.json()

# Example: research a complex topic
research = deep_research("current state of AI regulation in the EU and US 2026")
print(research.get("summary", "")[:500])
DeepDive returns a synthesized research summary alongside source references, making it ideal for complex queries where a single search isn't enough.
Complete Code Example
import requests
import json
from openai import OpenAI

# Configuration
SEARCHHIVE_API_KEY = "your-searchhive-api-key"
BASE = "https://api.searchhive.dev/v1"
client = OpenAI()

def web_search(query, limit=5):
    response = requests.get(
        f"{BASE}/swiftsearch",
        headers={"Authorization": f"Bearer {SEARCHHIVE_API_KEY}"},
        params={"query": query, "limit": limit, "fresh": "month"},
    )
    return response.json().get("results", [])

def extract_page_content(url):
    response = requests.post(
        f"{BASE}/scrapeforge",
        headers={"Authorization": f"Bearer {SEARCHHIVE_API_KEY}"},
        json={"url": url, "format": "markdown", "wait_for": 2000},
    )
    return response.json().get("content", "")

search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for current information about recent events, prices, or data.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query."},
            },
            "required": ["query"],
        },
    },
}

def chat_with_search(user_message):
    messages = [{"role": "user", "content": user_message}]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=[search_tool],
        tool_choice="auto",
    )
    msg = response.choices[0].message
    if msg.tool_calls:
        # Append the assistant message once, then one tool result per call
        messages.append(msg)
        for tc in msg.tool_calls:
            args = json.loads(tc.function.arguments)
            results = web_search(args["query"])
            messages.append({
                "role": "tool",
                "tool_call_id": tc.id,
                "content": json.dumps(results),
            })
        response = client.chat.completions.create(
            model="gpt-4o", messages=messages,
        )
        msg = response.choices[0].message
    return msg.content

if __name__ == "__main__":
    answer = chat_with_search("What are the latest developments in quantum computing in 2026?")
    print(answer)
Common Issues and Solutions
LLM doesn't call the search tool: Make sure your tool description clearly states when the LLM should use it. Include phrases like "current information" and "recent events" to trigger tool use for time-sensitive queries.
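If a query must always be grounded, you can also force the call rather than rely on the description. OpenAI's tool_choice parameter accepts a specific function instead of "auto":

```python
# Force the model to call web_search instead of leaving the choice to it
forced_choice = {"type": "function", "function": {"name": "web_search"}}

# Then pass it in place of "auto":
# client.chat.completions.create(model="gpt-4o", messages=messages,
#                                tools=[search_tool], tool_choice=forced_choice)
```

Reserve forcing for routes you know are time-sensitive; forcing every request adds a search round-trip even when the model could answer from its weights.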
Token limits exceeded from long page content: Truncate extracted content to 2,000-3,000 characters per page. For RAG pipelines, use chunking and embed the chunks, then retrieve only relevant chunks.
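A minimal fixed-size chunker with overlap is enough to start; the chunk_size and overlap values below are arbitrary starting points, not tuned numbers:

```python
def chunk_text(text, chunk_size=1500, overlap=200):
    """Split text into overlapping character chunks for embedding."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step back by `overlap` to preserve context
    return chunks
```

Embed each chunk once, then at query time retrieve only the top-scoring chunks instead of whole pages.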
Rate limiting from SearchHive: The free tier (500 credits) handles ~50 search+extract cycles. Upgrade to Starter ($9/month, 5K credits) or Builder ($49/month, 100K credits) for production use.
Stale search results: Use the fresh parameter to filter results by recency. "24h" for the last day, "week" for the last week, "month" for the last month.
Using non-OpenAI LLMs: The same pattern works with Anthropic's tool use, Google Gemini's function calling, or local models via Ollama. Just adapt the tool schema format and response handling to your LLM provider's API.
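As one example, Anthropic's tool-use API puts the schema fields at the top level and names the JSON schema input_schema rather than nesting it under a function key. A small adapter covers the conversion; the search_tool below mirrors the one defined earlier:

```python
def to_anthropic_tool(openai_tool):
    """Convert an OpenAI-style tool definition to Anthropic's tool-use format."""
    fn = openai_tool["function"]
    return {
        "name": fn["name"],
        "description": fn["description"],
        "input_schema": fn["parameters"],  # Anthropic's name for the JSON schema
    }

search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for current information.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

anthropic_tool = to_anthropic_tool(search_tool)
```

The response handling differs too (Anthropic returns tool_use content blocks instead of tool_calls), so the execution loop needs the equivalent adaptation.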
Next Steps
- Add conversation memory -- maintain a conversation history list and pass it to each API call
- Implement source citations -- parse search results and include URLs in your LLM response
- Build a web interface -- wrap the pipeline in a FastAPI server or Streamlit app
- Add evaluation -- test grounded vs. ungrounded responses for accuracy metrics
- Optimize costs -- cache search results for repeated queries to reduce API calls
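For the first item, a thin session wrapper around the Step 4 chat_with_search function is one way to keep memory. The chat function is injected so the wrapper can be exercised without API calls; the lambda below is a stand-in for the real thing:

```python
class SearchChatSession:
    """Keep conversation history across turns of a search-augmented chat."""

    def __init__(self, chat_fn):
        self.chat_fn = chat_fn  # e.g. chat_with_search from Step 4
        self.history = []

    def ask(self, question):
        # chat_fn receives the question plus all prior turns
        answer = self.chat_fn(question, self.history)
        self.history.append({"role": "user", "content": question})
        self.history.append({"role": "assistant", "content": answer})
        return answer

# Stub chat function for illustration -- swap in chat_with_search for real use
session = SearchChatSession(lambda q, hist: f"({len(hist)} prior messages) echo: {q}")
print(session.ask("hello"))
print(session.ask("again"))
```

For long sessions, trim or summarize old turns before passing them on, or token limits will creep up on you.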
For more on building AI-powered applications, check out our guides on search APIs for AI agents and web search RAG pipelines.
Get started with SearchHive free -- 500 credits, no credit card. Build your LLM search integration in under 30 minutes with our quickstart guide.