LangChain is the most popular framework for building LLM applications, and web search is one of its most powerful capabilities. By connecting your LLM to a search API, you enable real-time data access, fact-checking, and grounded responses.
This FAQ covers how LangChain integrates with web search APIs, code examples for every major search provider, and practical patterns for production applications.
Key Takeaways
- LangChain's Tool abstraction makes any search API callable by your LLM -- the model decides when to search
- SearchHive integrates as a LangChain tool with search, scraping, and deep research capabilities
- Tavily has a native LangChain integration but costs 4-5x more at equivalent volumes
- SerpApi and Brave work through generic REST tool wrappers
- The search-retrieval-generate pattern is the standard approach for grounded LLM responses
- Always validate search results before passing them to the LLM -- garbage in, garbage out
How does LangChain integrate with web search?
LangChain provides a Tool abstraction that lets LLMs call external functions, including web search APIs. The workflow:
- Define a search tool (or use a pre-built one)
- Bind the tool to your LLM
- The LLM decides when to call the tool based on the user's question
- The tool returns search results
- The LLM uses those results to generate a grounded answer
```python
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool

import requests


# Define a custom search tool using SearchHive
@tool
def web_search(query: str) -> str:
    """Search the web for current information."""
    response = requests.get(
        "https://api.searchhive.dev/swiftsearch",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        params={"q": query, "limit": 5},
    ).json()
    results = []
    for r in response.get("results", []):
        results.append(f"- {r['title']}: {r['snippet']} ({r['url']})")
    return "\n".join(results)


# Bind to LLM
llm = ChatOpenAI(model="gpt-4o").bind_tools([web_search])

# The LLM will call the search tool when it needs current information
response = llm.invoke("What were the top tech acquisitions in Q1 2026?")
```

Note that `@tool` reads the function's docstring to describe the tool to the model, so the description belongs in a docstring rather than a comment.
What search APIs work with LangChain?
Most search APIs work with LangChain through the @tool decorator pattern above. Some have pre-built integrations:
| API | Pre-built LangChain Tool | Install |
|---|---|---|
| Tavily | Yes (langchain-tavily) | pip install langchain-tavily |
| SerpApi | Yes (langchain-community) | pip install langchain-community |
| Brave | Yes (langchain-community) | pip install langchain-community |
| SearchHive | Custom tool (see below) | REST API |
| Exa | Yes (langchain-exa) | pip install langchain-exa |
How do I use SearchHive with LangChain?
SearchHive provides three LangChain tools: search, scrape, and deep research.
```python
import requests

from langchain_openai import ChatOpenAI
from langchain_core.tools import tool

API_KEY = "YOUR_API_KEY"


@tool
def searchhive_search(query: str) -> str:
    """Search the web using SearchHive SwiftSearch.

    Returns relevant URLs and snippets."""
    response = requests.get(
        "https://api.searchhive.dev/swiftsearch",
        headers={"Authorization": f"Bearer {API_KEY}"},
        params={"q": query, "limit": 5},
    ).json()
    return "\n".join(
        f"{r['title']}: {r['snippet']} ({r['url']})"
        for r in response.get("results", [])
    )


@tool
def searchhive_scrape(url: str) -> str:
    """Scrape a web page and return its content as markdown.

    Use this to read the full content of a page found via search."""
    response = requests.get(
        "https://api.searchhive.dev/scrapeforge",
        headers={"Authorization": f"Bearer {API_KEY}"},
        params={"url": url, "format": "markdown"},
    ).json()
    return response.get("markdown", "Failed to scrape page")


@tool
def searchhive_research(query: str) -> str:
    """Perform deep research on a topic.

    Returns a comprehensive answer with citations."""
    response = requests.post(
        "https://api.searchhive.dev/deepdive",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"query": query, "depth": "comprehensive"},
    ).json()
    return response.get("answer", "Research failed")


# Bind all three tools to the LLM
llm = ChatOpenAI(model="gpt-4o").bind_tools([
    searchhive_search,
    searchhive_scrape,
    searchhive_research,
])

# The LLM can now search, scrape pages, or do deep research as needed
response = llm.invoke("Compare the pricing of Vercel, Netlify, and Cloudflare Pages")
```
This gives your LLM three capabilities:
- searchhive_search: Quick web lookups
- searchhive_scrape: Read full page content from a URL
- searchhive_research: Multi-step deep research for complex questions
How does Tavily integrate with LangChain?
Tavily has the most polished LangChain integration:
```python
from langchain_tavily import TavilySearch, TavilyAnswer

# Basic search
search = TavilySearch(max_results=5, api_key="YOUR_TAVILY_KEY")
results = search.invoke("latest AI news")

# Direct answer (no separate LLM call needed)
answer = TavilyAnswer(api_key="YOUR_TAVILY_KEY")
result = answer.invoke("What is RAG?")
```
The tradeoff: Tavily's LangChain integration is smoother out of the box, but at $0.008/credit, 100K queries cost roughly $800. SearchHive gives you equivalent functionality for $49/month.
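The arithmetic behind that comparison, using the prices quoted in this article (verify current pricing before committing):

```python
# Back-of-the-envelope monthly cost at 100K queries,
# using the per-credit and plan prices quoted above.
TAVILY_PER_CREDIT = 0.008   # $ per credit
SEARCHHIVE_PLAN = 49        # $ flat plan covering 100K credits
queries = 100_000

tavily_cost = queries * TAVILY_PER_CREDIT
print(f"Tavily: ${tavily_cost:,.0f}/month vs SearchHive: ${SEARCHHIVE_PLAN}/month")
```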
What is the ReAct pattern for search + LLM?
The ReAct (Reasoning + Acting) pattern is LangChain's standard approach for combining LLM reasoning with tool use:
```python
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.prompts import PromptTemplate

# Define tools (from the SearchHive section above)
tools = [searchhive_search, searchhive_scrape]

# create_react_agent requires a PromptTemplate with the variables
# {tools}, {tool_names}, and {agent_scratchpad}
prompt = PromptTemplate.from_template(
    "Answer the following questions as best you can. You have access to the following tools:\n\n"
    "{tools}\n\n"
    "Use the following format:\n"
    "Question: the input question\n"
    "Thought: you should always think about what to do\n"
    "Action: the action to take, should be one of [{tool_names}]\n"
    "Action Input: the input to the action\n"
    "Observation: the result of the action\n"
    "... (repeat Thought/Action/Action Input/Observation)\n"
    "Thought: I now know the final answer\n"
    "Final Answer: the final answer to the original input question\n\n"
    "Begin!\n"
    "Question: {input}\n"
    "{agent_scratchpad}"
)

llm = ChatOpenAI(model="gpt-4o")
agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# The agent searches, reads pages, and reasons through to an answer
result = executor.invoke(
    {"input": "What is SearchHive pricing and how does it compare to SerpApi?"}
)
```
The agent autonomously decides when to search, what to search for, and when it has enough information to answer.
How do I build a RAG pipeline with LangChain and search?
For Retrieval-Augmented Generation with live web data:
```python
import requests

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate


def web_rag(question: str) -> str:
    # Step 1: Search
    search = requests.get(
        "https://api.searchhive.dev/swiftsearch",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        params={"q": question, "limit": 3},
    ).json()

    # Step 2: Retrieve (scrape the top results)
    context = []
    for r in search.get("results", [])[:3]:
        page = requests.get(
            "https://api.searchhive.dev/scrapeforge",
            headers={"Authorization": "Bearer YOUR_API_KEY"},
            params={"url": r["url"], "format": "markdown"},
        ).json()
        context.append(page.get("markdown", "")[:2000])

    # Step 3: Generate
    llm = ChatOpenAI(model="gpt-4o")
    prompt = ChatPromptTemplate.from_messages([
        ("system", "Answer based on the provided context. Cite your sources."),
        ("user", "Context:\n{context}\n\nQuestion: {question}"),
    ])
    chain = prompt | llm
    response = chain.invoke({
        "context": "\n---\n".join(context),
        "question": question,
    })
    return response.content


answer = web_rag("What are the main features of Python 3.13?")
print(answer)
```
How does LangChain handle search result validation?
LangChain doesn't validate search results automatically. You should add validation:
```python
import requests

from langchain_core.tools import tool


@tool
def validated_search(query: str) -> str:
    """Search the web and filter out low-quality results."""
    response = requests.get(
        "https://api.searchhive.dev/swiftsearch",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        params={"q": query, "limit": 10},
    ).json()
    # Filter: require a minimum snippet length and an HTTPS URL
    valid = []
    for r in response.get("results", []):
        if len(r.get("snippet", "")) > 50 and r.get("url", "").startswith("https"):
            valid.append(f"- {r['title']}: {r['snippet']}")
    return "\n".join(valid[:5]) if valid else "No relevant results found."
```
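The filtering itself can live in a pure function so it is unit-testable without hitting the API. A sketch; the `filter_results` helper and its thresholds are illustrative, not part of LangChain or SearchHive:

```python
def filter_results(results, min_snippet_len=50, max_results=5):
    """Keep only results with a substantive snippet and an HTTPS URL."""
    valid = [
        f"- {r['title']}: {r['snippet']}"
        for r in results
        if len(r.get("snippet", "")) > min_snippet_len
        and r.get("url", "").startswith("https")
    ]
    return "\n".join(valid[:max_results]) if valid else "No relevant results found."

sample = [
    {"title": "Good", "snippet": "x" * 60, "url": "https://example.com"},
    {"title": "Short", "snippet": "too short", "url": "https://example.com"},
    {"title": "Insecure", "snippet": "y" * 60, "url": "http://example.com"},
]
print(filter_results(sample))  # keeps only the "Good" result
```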
Which search API should I use with LangChain?
- Best value: SearchHive -- search + scrape + research, $49/month for 100K credits, easy @tool integration
- Easiest setup: Tavily -- pre-built LangChain package, but expensive at scale
- Multi-engine search: SerpApi -- Google, YouTube, etc., but steep pricing
- Independent index: Brave -- good quality, no scraping capability
- Neural search: Exa -- unique semantic approach, higher cost
Get started
SearchHive's free tier (500 credits) is enough to build and test your LangChain search integration. Get your API key: https://searchhive.dev
For more on search APIs, see /blog/what-is-the-best-search-api-for-llms-complete-answer and /compare/tavily.