LlamaIndex Web Search Integration: Best Search APIs Compared
LlamaIndex is a leading framework for building RAG (Retrieval-Augmented Generation) applications. While LlamaIndex excels at searching your own documents, many applications need to search the web too -- for current events, competitor data, documentation, or general knowledge. This guide compares the best search APIs for LlamaIndex, with working code examples and a clear verdict.
Key Takeaways
- LlamaIndex supports web search through its WebSearchTool and custom tool integration
- SearchHive is the best fit for LlamaIndex: $49/mo for 100K credits across search, scrape, and deep research APIs
- Tavily has a native LlamaIndex integration but costs 16x more at equivalent volumes
- For research-heavy RAG apps, combining search with full-page content extraction (DeepDive) gives the LLM complete articles to ground its answers in, rather than short snippets
Why Web Search in LlamaIndex?
LlamaIndex's primary strength is indexing and querying your own data. But your documents have a cutoff date too. Web search fills that gap by:
- Providing real-time data alongside your indexed documents
- Enabling hybrid queries (search your docs AND the web)
- Powering agentic workflows where the LLM decides what to search
- Supporting research pipelines that need full page content, not just snippets
Built-in WebSearchTool (Tavily)
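LlamaIndex has a native integration with Tavily. In code, the web search tool is the TavilyToolSpec class from the llama-index-tools-tavily-research package:
from llama_index.tools.tavily_research import TavilyToolSpec

tavily_tool = TavilyToolSpec(api_key="your-tavily-key")

# Call the search method directly; it returns a list of Document objects
results = tavily_tool.search(
    "What are the latest AI regulations in the EU?",
    max_results=5,
)
for doc in results:
    print(doc.text)

# To hand the search function to an agent instead:
# tools = tavily_tool.to_tool_list()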
This works out of the box, but Tavily costs $8 per 1K searches. At 10K searches/month, that is $80 -- more than the $49 SearchHive charges for its entire 100K-credit Builder plan.
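The arithmetic behind that comparison can be checked with a short sketch. The prices are the ones quoted in this guide; the function names are illustrative, and the flat-plan function assumes one credit per search, which may not hold for every endpoint.

```python
# Prices as quoted in this guide; verify against current vendor pricing.
TAVILY_PER_1K_USD = 8.00
BUILDER_PLAN_USD = 49.00
BUILDER_PLAN_CREDITS = 100_000


def per_search_cost(searches: int, price_per_1k: float) -> float:
    """Monthly cost under pay-per-use pricing."""
    return searches / 1000 * price_per_1k


def flat_plan_cost(searches: int, plan_price: float, plan_credits: int) -> float:
    """Monthly cost under a flat plan, assuming one credit per search."""
    if searches > plan_credits:
        raise ValueError("volume exceeds plan credits; overage is not modeled")
    return plan_price


for volume in (10_000, 100_000):
    tavily = per_search_cost(volume, TAVILY_PER_1K_USD)
    builder = flat_plan_cost(volume, BUILDER_PLAN_USD, BUILDER_PLAN_CREDITS)
    print(f"{volume:>7} searches: Tavily ${tavily:.2f} vs Builder ${builder:.2f}")
```

Endpoints such as DeepDive may consume credits at a different rate than plain search, so treat the flat-plan figure as a lower bound for mixed workloads.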
Using SearchHive with LlamaIndex (Recommended)
LlamaIndex makes it easy to wrap any REST API as a tool. Here is a complete SearchHive integration:
SwiftSearch for Quick Lookups
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI
import requests

SEARCHHIVE_KEY = "your-api-key-here"

def swift_search(query: str, num_results: int = 5) -> str:
    """Search the web using SearchHive SwiftSearch."""
    resp = requests.post(
        "https://api.searchhive.dev/v1/swiftsearch",
        headers={
            "Authorization": f"Bearer {SEARCHHIVE_KEY}",
            "Content-Type": "application/json",
        },
        json={"query": query, "num_results": num_results},
        timeout=30,  # avoid hanging the agent on a stalled request
    )
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    data = resp.json()
    results = data.get("results", [])
    if not results:
        return "No results found."
    output = []
    for r in results:
        output.append(
            f"Title: {r.get('title', 'N/A')}\n"
            f"Snippet: {r.get('snippet', 'N/A')}\n"
            f"URL: {r.get('url', 'N/A')}"
        )
    return "\n\n".join(output)

# Create a LlamaIndex FunctionTool
search_tool = FunctionTool.from_defaults(
    fn=swift_search,
    name="web_search",
    description="Search the web for current information, news, pricing, and facts.",
)

# Create an agent with search capability.
# Note: from_tools is the legacy agent constructor; newer LlamaIndex releases
# build agents differently, so check the API for your installed version.
agent = ReActAgent.from_tools(
    [search_tool],
    llm=OpenAI(model="gpt-4o", temperature=0),
    system_prompt="You are a research assistant. Use web search for any current or factual questions.",
)

response = agent.chat("What is the current pricing for OpenAI GPT-4o API?")
print(response)
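One design choice worth considering is to keep result formatting separate from the HTTP call, so the formatting can be tested without an API key. The sketch below is illustrative rather than part of any SDK: format_results and max_snippet_chars are hypothetical names, and the response shape is assumed to match the one used in the example above.

```python
from typing import Any


def format_results(data: dict[str, Any], max_snippet_chars: int = 300) -> str:
    """Turn a search response into plain text an agent can read.

    Assumes the response shape used in this guide's example:
    {"results": [{"title": ..., "snippet": ..., "url": ...}, ...]}
    """
    results = data.get("results") or []
    if not results:
        return "No results found."

    blocks = []
    for rank, item in enumerate(results, start=1):
        snippet = str(item.get("snippet", "N/A"))
        if len(snippet) > max_snippet_chars:
            # Trim long snippets so one result cannot crowd out the others
            snippet = snippet[:max_snippet_chars].rstrip() + "..."
        blocks.append(
            f"[{rank}] {item.get('title', 'N/A')}\n"
            f"Snippet: {snippet}\n"
            f"URL: {item.get('url', 'N/A')}"
        )
    return "\n\n".join(blocks)


sample = {
    "results": [
        {"title": "Example", "snippet": "A short snippet.", "url": "https://example.com"}
    ]
}
print(format_results(sample))
```

With this split, swift_search would only fetch the JSON and return format_results(resp.json()), and the numbered results give the agent a stable way to refer back to a source.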
DeepDive for Full Research Content
For RAG applications that need complete page content, use DeepDive to extract full articles:
from llama_index.core import VectorStoreIndex, Document
import requests

SEARCHHIVE_KEY = "your-api-key-here"
HEADERS = {
    "Authorization": f"Bearer {SEARCHHIVE_KEY}",
    "Content-Type": "application/json",
}

def research_and_index(query: str, max_pages: int = 5):
    # Step 1: Find relevant pages via SwiftSearch
    resp = requests.post(
        "https://api.searchhive.dev/v1/swiftsearch",
        headers=HEADERS,
        json={"query": query, "num_results": max_pages},
        timeout=30,
    )
    resp.raise_for_status()
    urls = [r["url"] for r in resp.json().get("results", []) if r.get("url")]

    # Step 2: Extract full content via DeepDive
    documents = []
    for url in urls[:max_pages]:
        deep_resp = requests.post(
            "https://api.searchhive.dev/v1/deepdive",
            headers=HEADERS,
            json={"url": url, "format": "markdown", "extract_text": True},
            timeout=60,
        )
        if not deep_resp.ok:
            continue  # skip pages that fail to extract instead of aborting the run
        page_data = deep_resp.json()
        content = page_data.get("content", "")
        title = page_data.get("title", url)
        if content:
            documents.append(Document(
                text=content[:5000],  # cap each page at 5,000 characters
                metadata={"source": url, "title": title},
            ))
    if not documents:
        raise ValueError(f"No content could be extracted for query: {query!r}")

    # Step 3: Build a LlamaIndex index from the research
    index = VectorStoreIndex.from_documents(documents)
    query_engine = index.as_query_engine()
    return query_engine, documents

# Usage
engine, docs = research_and_index("latest developments in quantum computing 2026")
result = engine.query("What are the most promising quantum computing applications?")
print(result)
This pattern combines web search with LlamaIndex's vector indexing. The pipeline searches the web, extracts full page content, builds an index, and then queries it -- all in one function call.
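The example above keeps only the first 5,000 characters of each page, so anything later in a long article is dropped. One alternative, sketched here with a hypothetical helper (split_into_chunks is not a LlamaIndex or SearchHive function), is to split each page into overlapping character chunks and index every chunk.

```python
def split_into_chunks(text: str, chunk_size: int = 5000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    chunks = []
    step = chunk_size - overlap  # how far the window advances each time
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


page = "a" * 12_000
parts = split_into_chunks(page)
print([len(p) for p in parts])
```

Each chunk could then become its own Document with the same source and title metadata. LlamaIndex also splits documents into nodes when it builds an index, so passing the full page text is another option; explicit chunking mainly helps when you want control over chunk size or embedding cost.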
Comparison Table
| API | Price per 1K searches | LlamaIndex Native | Content Extraction | ScrapeForge Integration |
|---|---|---|---|---|
| SearchHive | ~$0.49 (Builder) | Custom tool | SwiftSearch + ScrapeForge + DeepDive | Native |
| Tavily | $8.00 | Yes (WebSearchTool) | Built-in snippets | No |
| SerpApi | $25.00 | Custom tool | No | No |
| Serper | $1.00 | Custom tool | No | No |
| Brave Search | $5.00 | Custom tool | No | No |
| Exa | $7.00 | Custom tool | $1 per 1K pages | No |
Pricing at Common Volumes
| Monthly Searches | SearchHive | Tavily | SerpApi | Serper |
|---|---|---|---|---|
| 1K | Free (500) + Starter $9 | Free (1K) | $25 | $50 (50K credits) |
| 10K | $9 | $80 | $200+ | $50 |
| 50K | $49 | $400 | $725 | $50 |
| 100K | $49 | $800 | $725 | $100 |
SearchHive's flat-rate Builder plan ($49/mo for 100K credits) makes it the clear winner at any scale above a few thousand searches. The credits work across search, scraping, and research -- so you get three APIs for less than the price of one competitor.
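To see where the flat rate starts to pay off, the break-even volume against each per-1K price in the comparison table can be computed directly. This is a rough sketch: break_even_searches is an illustrative name, and the calculation compares only the Builder plan against pure per-search pricing, ignoring the cheaper Starter tier, free tiers, and plan minimums.

```python
import math

BUILDER_PLAN_USD = 49.0

# Per-1K prices as listed in this guide's comparison table
PER_1K_PRICES_USD = {
    "Tavily": 8.00,
    "Exa": 7.00,
    "Brave Search": 5.00,
    "Serper": 1.00,
}


def break_even_searches(plan_price: float, price_per_1k: float) -> int:
    """Smallest monthly volume at which per-search pricing costs at least the flat plan."""
    if price_per_1k <= 0:
        raise ValueError("price_per_1k must be positive")
    return math.ceil(plan_price * 1000 / price_per_1k)


for api, price in PER_1K_PRICES_USD.items():
    volume = break_even_searches(BUILDER_PLAN_USD, price)
    print(f"{api}: Builder plan breaks even at about {volume:,} searches/month")
```

Below the break-even volume for a given competitor, a smaller plan or pay-per-use pricing may be the cheaper option, so it is worth running the numbers for your own expected traffic.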
Verdict
SearchHive is the best search API for LlamaIndex because:
- Unbeatable pricing: 100K credits for $49/mo vs $800 for the same volume with Tavily
- Three APIs in one: SwiftSearch for search, ScrapeForge for JS-rendered scraping, DeepDive for full content extraction -- all from the same API key
- Research-ready: The DeepDive + LlamaIndex VectorStoreIndex pattern produces much better RAG results than snippet-only search
- Easy integration: LlamaIndex's FunctionTool makes any REST API a first-class tool in under 10 lines of code
Tavily's native integration is convenient for prototyping but does not justify the 16x price difference in production. Use SearchHive.
Next Steps
- Get a free SearchHive API key (500 credits, no credit card)
- Check the LlamaIndex documentation for advanced agent patterns
- See our OpenAI function calling comparison for GPT-specific search integration
- Read about Claude web access for Anthropic model integration
Get Started with SearchHive
SearchHive provides search, scraping, and deep research through a unified API. Start free with 500 credits, then upgrade to Builder ($49/mo) for 100K credits -- enough for most production LlamaIndex applications.
Sign up free and start building better RAG applications today.