# How to Integrate Web Search into LangChain — Complete Guide
LangChain's tool ecosystem is one of its biggest strengths. With over 1,000 integrations, it provides pre-built tools for dozens of search engines, letting you add web search to any chain or agent in a few lines of code. This tutorial walks through every option, from free built-in tools to production-grade search APIs.
## Key Takeaways
- LangChain has seven built-in search tools: Bing, Brave, DuckDuckGo, Exa, Google, Serper, and cloro
- DuckDuckGo works with no API key at all; Brave offers a free tier that only requires signing up for a key
- SearchHive provides search + scraping + deep research in a single LangChain tool — significantly cheaper than combining multiple providers
- The `@tool` decorator pattern makes it trivial to wrap any REST API as a LangChain tool
## Prerequisites
- Python 3.9+
- An OpenAI API key (for the LLM)
- API keys for whichever search providers you choose
- LangChain and its community packages installed:

```bash
pip install langchain langchain-openai langchain-community
```
## Step 1: Using Built-in Free Search Tools
LangChain ships with search tools that cost nothing to use: DuckDuckGo works out of the box with no API key, and Brave needs only a free key:
### DuckDuckGo Search
```python
from langchain_community.tools import DuckDuckGoSearchRun

search = DuckDuckGoSearchRun()

# Simple search
results = search.invoke("Python web scraping libraries 2026")
print(results)

# With more control
from langchain_community.tools import DuckDuckGoSearchResults

search = DuckDuckGoSearchResults(max_results=5, output_format="list")
results = search.invoke("latest AI news")
for r in results:
    print(f"- {r['title']}: {r['link']}")
```
### Brave Search (free tier)
```python
from langchain_community.utilities import BraveSearchWrapper

# You need a Brave API key (free $5/month credit)
search = BraveSearchWrapper(api_key="your-brave-key", search_kwargs={"count": 5})
results = search.run("AI search APIs comparison")
```
## Step 2: Using Serper.dev (Recommended Free Option)
Serper.dev provides 2,500 free queries and returns clean Google results:
```python
from langchain_community.utilities import GoogleSerperAPIWrapper
import os

os.environ["SERPER_API_KEY"] = "your-serper-key"

search = GoogleSerperAPIWrapper()
results = search.run("best web search APIs for developers")
print(results)
```

Note that Serper.dev and SerpAPI are different products: Serper.dev uses `GoogleSerperAPIWrapper` with `SERPER_API_KEY`, while `SerpAPIWrapper` with `SERPAPI_API_KEY` talks to serpapi.com.
## Step 3: Wrapping SearchHive as a LangChain Tool
SearchHive provides three APIs — SwiftSearch, ScrapeForge, and DeepDive. Wrapping them as LangChain tools gives your chains access to search, scraping, and deep research:
```python
import requests
from langchain_core.tools import tool
from pydantic import BaseModel, Field

SEARCHHIVE_KEY = "your-searchhive-key"
SEARCHHIVE_BASE = "https://api.searchhive.dev/v1"

class SwiftSearchInput(BaseModel):
    query: str = Field(description="The search query to look up on the web")

class ScrapeForgeInput(BaseModel):
    url: str = Field(description="The full URL of the web page to scrape")

class DeepDiveInput(BaseModel):
    query: str = Field(description="The research question to investigate in depth")

@tool(args_schema=SwiftSearchInput)
def swift_search(query: str) -> str:
    """Search the web for current information."""
    resp = requests.get(
        f"{SEARCHHIVE_BASE}/swift-search",
        headers={"Authorization": f"Bearer {SEARCHHIVE_KEY}"},
        params={"query": query, "limit": 5},
    )
    data = resp.json()
    results = []
    for r in data.get("results", [])[:5]:
        results.append(f"{r['title']}\n  URL: {r['url']}\n  {r['snippet']}")
    return "\n\n".join(results)

@tool(args_schema=ScrapeForgeInput)
def scrape_forge(url: str) -> str:
    """Extract clean content from a web page as markdown."""
    resp = requests.post(
        f"{SEARCHHIVE_BASE}/scrape-forge",
        headers={"Authorization": f"Bearer {SEARCHHIVE_KEY}", "Content-Type": "application/json"},
        json={"url": url, "format": "markdown"},
    )
    return resp.json().get("content", "Scraping failed")[:4000]

@tool(args_schema=DeepDiveInput)
def deep_dive(query: str) -> str:
    """Run comprehensive research on a topic."""
    resp = requests.get(
        f"{SEARCHHIVE_BASE}/deep-dive",
        headers={"Authorization": f"Bearer {SEARCHHIVE_KEY}"},
        params={"query": query},
    )
    return resp.json().get("summary", "No results found")[:3000]

# All three tools ready for LangChain
search_tools = [swift_search, scrape_forge, deep_dive]
```

Each function has a docstring rather than a plain comment — `@tool` requires a docstring (or an explicit description) because it becomes the tool description the LLM sees.
## Step 4: Using Search Tools with LangChain Agents
Once your tools are defined, wire them into a LangChain agent:
```python
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a research assistant with access to web search, page scraping, and deep research tools. Always cite your sources."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_openai_tools_agent(llm, search_tools, prompt)
executor = AgentExecutor(agent=agent, tools=search_tools, verbose=True, max_iterations=5)

result = executor.invoke({"input": "What are the main differences between CrewAI and AutoGen?"})
print(result["output"])
```
## Step 5: Web Search as a RAG Retriever
Web search as a retriever is a powerful RAG pattern — instead of searching a static vector store, you search the live internet:
```python
from typing import List

import requests
from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

class SearchHiveRetriever(BaseRetriever):
    """A retriever that uses SearchHive SwiftSearch as its source."""

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        resp = requests.get(
            f"{SEARCHHIVE_BASE}/swift-search",
            headers={"Authorization": f"Bearer {SEARCHHIVE_KEY}"},
            params={"query": query, "limit": 5},
        )
        docs = []
        for r in resp.json().get("results", []):
            docs.append(Document(
                page_content=r["snippet"],
                metadata={"title": r["title"], "url": r["url"], "source": "web"},
            ))
        return docs

retriever = SearchHiveRetriever()
```

Use it in a standard RAG chain:

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

template = "Answer the question based on these search results:\n{context}\n\nQuestion: {question}\n"
prompt = ChatPromptTemplate.from_template(template)
llm = ChatOpenAI(model="gpt-4o")

rag_chain = (
    {"context": retriever, "question": lambda x: x}
    | prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("What is the current state of AI regulation in the EU?")
print(answer)
```
## Step 6: Adding Search to an Existing Chain
If you already have a LangChain chain and want to add web search, use `RunnablePassthrough`:
```python
from langchain_core.runnables import RunnablePassthrough, RunnableLambda

def search_and_format(query: str) -> str:
    """Search the web and format results as context."""
    resp = requests.get(
        f"{SEARCHHIVE_BASE}/swift-search",
        headers={"Authorization": f"Bearer {SEARCHHIVE_KEY}"},
        params={"query": query, "limit": 3},
    )
    return "\n".join(
        f"[{r['title']}] {r['snippet']}"
        for r in resp.json().get("results", [])
    )

chain = (
    {"context": RunnableLambda(search_and_format), "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
```
## Common Issues and Fixes
**Rate limiting:** Free tiers (DuckDuckGo, Serper) have strict rate limits. For production, use a paid tier. SearchHive's Starter plan ($9/mo) gives you 5,000 credits with generous rate limits.
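Until you upgrade, rate-limit errors are easiest to absorb with a retry-and-backoff wrapper around whatever search call you use. A minimal sketch (`flaky_search` is a hypothetical stub standing in for a real provider call):

```python
import time

def with_backoff(fn, max_retries=3, base_delay=0.5):
    """Retry fn on failure, doubling the delay between attempts."""
    def wrapped(*args, **kwargs):
        for attempt in range(max_retries + 1):
            try:
                return fn(*args, **kwargs)
            except Exception:
                if attempt == max_retries:
                    raise  # out of retries: surface the error
                time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    return wrapped

# Stub that fails twice before succeeding, to exercise the retry path
calls = {"n": 0}
def flaky_search(query):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return f"results for {query}"

safe_search = with_backoff(flaky_search, base_delay=0.01)
print(safe_search("langchain tools"))  # succeeds on the third attempt
```

In a real chain you would wrap the provider call itself (e.g. `with_backoff(search.run)`) and catch only throttling errors rather than every exception.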
**Inconsistent results:** DuckDuckGo's free API can return empty results or be blocked. Brave's free tier is more reliable. SearchHive uses multiple search backends for redundancy.
**Slow responses:** Search adds 200-500ms latency per tool call. For time-sensitive applications, cache search results or use SwiftSearch (fastest option).
**Token budget:** Search results can eat into your context window. Limit results to 3-5 items and truncate snippets.
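Caching and truncation can live in one small layer in front of the search call. A sketch using the standard library's `functools.lru_cache` — the limits and the `fake_search` stub are assumptions to keep it runnable, not part of any provider's API:

```python
from functools import lru_cache

MAX_RESULTS = 3        # keep the top few hits only
MAX_SNIPPET_CHARS = 200  # clip each snippet before it reaches the prompt

def fake_search(query):
    # Stand-in for a real search call returning many long results
    return [{"title": f"Result {i}", "snippet": "x" * 500} for i in range(10)]

def format_results(results):
    """Keep only the first few results and clip each snippet."""
    return "\n".join(
        f"[{r['title']}] {r['snippet'][:MAX_SNIPPET_CHARS]}"
        for r in results[:MAX_RESULTS]
    )

@lru_cache(maxsize=256)
def cached_search(query: str) -> str:
    # Repeat queries within the process hit the cache instead of the API
    return format_results(fake_search(query))

context = cached_search("ai regulation eu")
print(len(context.split("\n")))  # 3 results kept
```

`lru_cache` only deduplicates within one process; for multi-worker deployments you would swap in a shared store such as Redis.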
## Next Steps
- Set up SearchHive's free tier for production-ready search integration
- Explore the SearchHive API docs for advanced features like batch search and monitoring
- Check out /blog/openai-function-calling-with-web-search-apis for function calling patterns
- Read /compare/tavily to see how SearchHive compares to other search APIs
Related: /compare/langchain-tools | /tutorials/langchain-rag-tutorial