LlamaIndex has become the go-to framework for building LLM applications with retrieval-augmented generation (RAG). But RAG is only as good as the data sources feeding it -- and for real-time, up-to-date information, you need a web search tool that integrates cleanly with LlamaIndex's QueryEngineTool and RouterQueryEngine patterns.
This guide reviews the best web search tools for LlamaIndex in 2025, comparing them on integration quality, pricing, response speed, and data richness.
Key Takeaways
- LlamaIndex's ToolSpec interface makes it straightforward to wrap any search API as a query tool
- SearchHive offers the best combination of search + scraping + extraction for LlamaIndex RAG pipelines
- Serper.dev and Tavily are the most common LlamaIndex integrations but have trade-offs
- Per-search pricing varies by more than 10x between providers (from about $0.50 to $25 per 1K searches) -- your choice matters at scale
What Makes a Good Search Tool for LlamaIndex?
Before diving into individual tools, here is what matters when choosing a web search tool for LlamaIndex:
- Clean JSON responses -- LlamaIndex tools need structured data, not raw HTML
- Low latency -- Users will not wait 10 seconds for a RAG answer
- Content retrieval -- Search results alone are not enough; you need page content for RAG
- Python SDK -- First-class Python support reduces integration code
- Pricing at scale -- Agent applications can make thousands of search calls per day
- Async support -- LlamaIndex works best with async tools for concurrent retrieval
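To make the async point concrete, here is a minimal sketch of concurrent retrieval with `asyncio.gather`. The `fake_search` coroutine is a hypothetical stand-in that only sleeps to simulate network latency; it is not any provider's real API.

```python
import asyncio
import time


async def fake_search(query: str, delay: float = 0.2) -> dict:
    """Hypothetical stand-in for an async search API call."""
    await asyncio.sleep(delay)  # simulates network latency
    return {"query": query, "results": [f"result for {query}"]}


async def search_many(queries: list[str]) -> list[dict]:
    """Run all searches concurrently; results keep the input order."""
    return await asyncio.gather(*(fake_search(q) for q in queries))


if __name__ == "__main__":
    start = time.perf_counter()
    responses = asyncio.run(
        search_many(["llamaindex rag", "web search api", "async python"])
    )
    elapsed = time.perf_counter() - start
    # The three simulated calls overlap instead of running back to back.
    print(f"{len(responses)} responses in {elapsed:.2f}s")
```

In LlamaIndex, a coroutine like this can be registered through the `async_fn` argument of `FunctionTool.from_defaults`, so an agent can await it instead of blocking on each call.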
Top Web Search Tools for LlamaIndex
1. SearchHive
SearchHive is a unified search and scraping API that provides SwiftSearch (search engine results), ScrapeForge (page content extraction), and DeepDive (structured data extraction) through a single API.
Why it works well with LlamaIndex:
- Returns clean JSON with titles, snippets, and URLs
- Built-in content extraction eliminates the need for a separate scraper
- Free tier with 500 credits for testing
- Pricing starts at $9/5K credits -- significantly cheaper than most alternatives
import requests
from llama_index.core.tools import FunctionTool

API_KEY = "your-searchhive-api-key"


def web_search(query: str) -> str:
    """Search the web and return titles, URLs, and snippets as plain text."""
    resp = requests.post(
        "https://api.searchhive.dev/v1/swift/search",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"query": query, "limit": 5},
        timeout=15,
    )
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    results = resp.json().get("results", [])
    output = []
    for r in results:
        # .get() avoids a KeyError when a result is missing a field
        output.append(f"Title: {r.get('title', 'N/A')}")
        output.append(f"URL: {r.get('url', 'N/A')}")
        output.append(f"Snippet: {r.get('snippet', 'N/A')}")
        output.append("---")
    return "\n".join(output)


def scrape_content(url: str) -> str:
    """Extract the full content of a web page as markdown for RAG."""
    resp = requests.post(
        "https://api.searchhive.dev/v1/scrape",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"url": url, "format": "markdown"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("content", "Failed to extract content")


# Wrap both functions as LlamaIndex tools; the docstrings feed the tool descriptions.
search_tool = FunctionTool.from_defaults(fn=web_search, name="web_search")
scrape_tool = FunctionTool.from_defaults(fn=scrape_content, name="scrape_content")
Pricing: Free (500 credits), Starter $9/5K, Builder $49/100K, Unicorn $199/500K
2. Tavily
Tavily is built specifically for AI agents and has an official LlamaIndex integration (llama-index-tools-tavily-research).
Strengths: Purpose-built for AI/LLM use cases. Returns cleaned, relevant content. Official LlamaIndex integration.
Weaknesses: Pay-as-you-go at $0.008/credit adds up fast at scale. 1K free credits/month is limiting for active development. No built-in scraping -- you get search results and snippets, not full page content.
Pricing: Free (1K/mo), Pay-as-you-go ($0.008/credit), Enterprise custom.
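As a rough sketch of what the official integration looks like in practice, the helper below builds LlamaIndex tools from Tavily's tool spec. It assumes the integration is installed as the `llama-index-tools-tavily-research` package and exposes `TavilyToolSpec`. The helper name `build_tavily_tools` is made up for this example, and the import is deferred so the snippet still loads when the package is missing.

```python
def build_tavily_tools(api_key: str):
    """Return LlamaIndex tools backed by Tavily search.

    The import is deferred so this module loads even when the optional
    Tavily integration package is not installed.
    """
    # Assumed import path for the Tavily integration package.
    from llama_index.tools.tavily_research import TavilyToolSpec

    tool_spec = TavilyToolSpec(api_key=api_key)
    # to_tool_list() converts the spec's methods into LlamaIndex tool objects.
    return tool_spec.to_tool_list()
```

The returned list can be handed to whichever agent class your LlamaIndex version provides, in the same way as the `FunctionTool` objects in the SearchHive example.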
3. Serper.dev
Serper.dev provides Google SERP data with a clean REST API. It is one of the most popular search tools in the LlamaIndex ecosystem.
Strengths: Very fast (1-2s response). Clean JSON. Cheap at volume ($0.50/1K at scale). Supports multiple search types (images, news, maps, scholar).
Weaknesses: Only returns SERP data (titles, snippets, URLs) -- no page content. You need a separate scraper to get full text for RAG. Credits expire after 6 months.
Pricing: 2,500 free on signup, then $50/50K ($1/1K), $375/500K ($0.75/1K), $1,250/2.5M ($0.50/1K).
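Because Serper.dev returns SERP data only, a LlamaIndex tool built on it mostly flattens the JSON into text. The sketch below uses only the standard library. The endpoint URL, the `X-API-KEY` header, and the `organic` / `link` / `snippet` field names are assumptions that should be checked against Serper's documentation before use.

```python
import json
import urllib.request

SERPER_URL = "https://google.serper.dev/search"  # assumed endpoint


def serper_search(query: str, api_key: str, timeout: float = 10.0) -> dict:
    """POST a query to Serper and return the parsed JSON response."""
    request = urllib.request.Request(
        SERPER_URL,
        data=json.dumps({"q": query}).encode("utf-8"),
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def format_organic(payload: dict, limit: int = 5) -> str:
    """Flatten the assumed `organic` result list into text for an LLM tool."""
    lines = []
    for item in payload.get("organic", [])[:limit]:
        lines.append(f"Title: {item.get('title', 'N/A')}")
        lines.append(f"URL: {item.get('link', 'N/A')}")
        lines.append(f"Snippet: {item.get('snippet', 'N/A')}")
        lines.append("---")
    return "\n".join(lines)
```

Calling `format_organic(serper_search("llamaindex", api_key))` yields the same title/URL/snippet layout as the SearchHive `web_search` tool, but full page text would still have to come from a separate scraper.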
4. Brave Search API
Brave operates its own web index (30B+ pages), making it the only independent search API at scale.
Strengths: Independent index. Privacy-focused. Good for LLM grounding. $5 free credits/month.
Weaknesses: Expensive at $5/1K search requests. Separate pricing for Answers API ($4/1K). Limited to Brave's index -- smaller than Google. Only 50 QPS.
Pricing: $5/1K search, $4/1K answers, $5 free credits/month.
5. Exa (formerly Metaphor)
Exa uses neural search to find semantically relevant pages rather than keyword matching.
Strengths: Excellent for research-oriented queries. Returns content excerpts. Low latency options (180ms). Good for finding niche sources.
Weaknesses: Expensive at $7/1K for search, $12/1K for deep search. Neural search can miss exact-match queries. Smaller index than Google/Bing.
Pricing: Free (1K/mo), Search $7/1K, Deep Search $12/1K, Contents $1/1K pages, Enterprise custom.
6. SerpApi
SerpApi is the oldest and most established SERP API provider, parsing results from Google, Bing, YouTube, and more.
Strengths: Parsed, structured results from multiple engines. Mature Python client. U.S. Legal Shield on production plans.
Weaknesses: Pricing escalates steeply: 30K searches/mo costs $275. No scraping or content extraction. Free tier limited to 250 searches/mo.
Pricing: Free (250/mo), Starter $25/1K, Developer $75/5K, Production $150/15K, Big Data $275/30K.
7. Google Custom Search JSON API
Google's official search API.
Strengths: Direct access to Google results. Simple REST API. Well-documented.
Weaknesses: Being deprecated. Closed to new customers since 2025. Existing customers have until January 2027 to migrate. Do not start new projects with this.
Pricing: $5/1K queries (standard), free tier of 100 queries/day.
8. DuckDuckGo (via duckduckgo-search library)
Open-source Python library that scrapes DuckDuckGo results without an API key.
Strengths: Free. No API key needed. Works out of the box.
Weaknesses: Unofficial -- can break without notice. Rate limited. No support. No structured content extraction. Not suitable for production.
Pricing: Free (unofficial, unsupported).
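Since an unofficial scraper can be rate limited or break without notice, any call to it is best wrapped defensively. The sketch below retries a search callable with exponential backoff and returns an empty list when every attempt fails. `search_fn` is a hypothetical callable you supply (for example, a thin wrapper around the library), not part of any package.

```python
import time
from typing import Callable


def search_with_retry(
    search_fn: Callable[[str], list],
    query: str,
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Call search_fn, retrying with exponential backoff on any exception.

    Returns an empty list if every attempt fails, so the caller can fall
    back to another provider instead of crashing.
    """
    for attempt in range(attempts):
        try:
            return search_fn(query)
        except Exception:
            if attempt < attempts - 1:
                # Delay doubles each time: base_delay, 2x, 4x, ...
                sleep(base_delay * (2 ** attempt))
    return []
```

An empty result is a reasonable signal to switch to a paid, supported API for that query. The `sleep` parameter exists only so the backoff can be tested without real waiting.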
Comparison Table
| Tool | Search Price | Content Extraction | LlamaIndex Integration | Free Tier |
|---|---|---|---|---|
| SearchHive | From $0.0018/req | Built-in (ScrapeForge) | Custom FunctionTool | 500 credits |
| Tavily | $0.008/credit | Snippets only | Official | 1K/mo |
| Serper.dev | From $0.50/1K | None | Community | 2,500 signup |
| Brave Search | $5/1K | None | Community | $5/mo credits |
| Exa | $7/1K | Built-in (Contents) | Community | 1K/mo |
| SerpApi | From $25/1K | None | Community | 250/mo |
| Google CSE | $5/1K | None | Native | 100/day |
| DuckDuckGo | Free | None | Manual | Free |
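To see what these list prices mean at agent-level volume, the following sketch estimates monthly spend from the per-1K prices quoted in this guide. It assumes one credit equals one basic search and ignores free tiers, volume discounts, and fixed plan sizes, so treat the output as a rough comparison rather than a quote.

```python
# Per-1,000-search list prices in USD, taken from the figures in this guide.
# Assumption: one credit equals one basic search request.
PRICE_PER_1K = {
    "SearchHive (Starter)": 1.80,     # $9 / 5K credits
    "Tavily": 8.00,                   # $0.008 per credit
    "Serper.dev (entry tier)": 1.00,  # $50 / 50K
    "Brave Search": 5.00,
    "Exa": 7.00,
    "SerpApi (Starter)": 25.00,
}


def monthly_cost(searches_per_day: int, price_per_1k: float, days: int = 30) -> float:
    """Estimated monthly spend, ignoring free tiers and volume discounts."""
    return round(searches_per_day * days / 1000 * price_per_1k, 2)


if __name__ == "__main__":
    # Print providers from cheapest to most expensive at 2,000 searches/day.
    for name, price in sorted(PRICE_PER_1K.items(), key=lambda kv: kv[1]):
        print(f"{name:<26} ${monthly_cost(2_000, price):>9,.2f}/month")
```

At 2,000 searches per day (60,000 per month), these assumptions put the spread between roughly $60 and $1,500 per month.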
Recommendation
For most LlamaIndex RAG applications, SearchHive is the strongest choice because it combines search, scraping, and structured extraction in a single API. LlamaIndex RAG pipelines need not just search results but actual page content -- and most search APIs force you to use a separate tool for that.
Choose SearchHive if: You need search + content extraction, want to minimize API costs, or are building a production RAG application.
Choose Tavily if: You are building simple AI agents that only need search snippets, and want an official LlamaIndex package.
Choose Serper.dev if: You only need SERP data and want the cheapest Google search results at scale.
Choose Exa if: Semantic search quality matters more than cost (research, academic applications).
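The search-then-extract flow behind these recommendations can be sketched independently of any provider. In the example below, `search_fn` and `scrape_fn` are hypothetical callables you plug in: the first is assumed to return a list of dicts with `url` and `snippet` keys, the second to return page text for a URL.

```python
from typing import Callable


def build_context(
    query: str,
    search_fn: Callable[[str], list[dict]],
    scrape_fn: Callable[[str], str],
    top_k: int = 3,
    max_chars: int = 2000,
) -> str:
    """Search, fetch page text for the top hits, and join it into one context string."""
    sections = []
    for hit in search_fn(query)[:top_k]:
        url = hit.get("url")
        if not url:
            continue  # skip results without a link
        try:
            text = scrape_fn(url)
        except Exception:
            text = hit.get("snippet", "")  # fall back to the snippet if scraping fails
        # Truncate each page so one long document cannot crowd out the others.
        sections.append(f"Source: {url}\n{text[:max_chars]}")
    return "\n\n".join(sections)
```

With a search-only API, `scrape_fn` has to come from a second service. With a combined API, both callables can wrap the same provider, which is the main practical argument for the unified option.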
Get Started
SearchHive offers 500 free credits so you can test your LlamaIndex integration before committing. The FunctionTool pattern shown above takes only a few dozen lines of code to set up.