# LlamaIndex Web Search Integration — Best APIs Compared
LlamaIndex is the go-to framework for building RAG (Retrieval-Augmented Generation) pipelines. But static document indexes become stale fast. Integrating a web search API lets your LlamaIndex applications query the live internet, keeping responses current and relevant.
This article compares the best search APIs for LlamaIndex, with code examples and pricing analysis for each.
## Key Takeaways

- LlamaIndex provides `FunctionTool` for wrapping any REST API as a tool, plus `ToolSpec` from LlamaHub for pre-built integrations
- SearchHive offers the best value for LlamaIndex — search, scraping, and deep research from one API at $9/mo for 5,000 credits
- For simple RAG augmentation, a custom retriever using any search API is the lightest-weight approach
- SerpApi and Serper.dev are strong for Google SERP data, but significantly more expensive
## Search APIs for LlamaIndex Compared
| API | Integration Method | Free Tier | Pricing per 1K | Best For |
|---|---|---|---|---|
| SearchHive | FunctionTool / custom retriever | 500 credits | $0.18 (Starter) | All-in-one search + scrape + research |
| SerpApi | LlamaHub ToolSpec | 250/mo | $25 | Google SERP data |
| Serper.dev | FunctionTool | 2,500 queries | $50 | Fast Google results |
| Tavily | LlamaHub ToolSpec | 1K credits/mo | $8 | AI-optimized search |
| Brave Search | FunctionTool | $5/mo credit | $5 | Privacy-focused results |
| Exa.ai | LlamaHub ToolSpec | 1K/mo | $7 | Neural/semantic search |
| DuckDuckGo | Built-in (free) | Unlimited | Free | Quick prototyping |
## Approach 1: Custom Retriever (Lightest Weight)

For RAG pipelines, the cleanest integration is a custom retriever that queries a search API and returns results as LlamaIndex `Document` objects:
```python
import requests
from llama_index.core.retrievers import BaseRetriever
from llama_index.core import QueryBundle
from llama_index.core.schema import Document, NodeWithScore
from typing import List

SEARCHHIVE_KEY = "your-searchhive-key"

class WebSearchRetriever(BaseRetriever):
    """Retrieve documents from the live web using SearchHive SwiftSearch."""

    def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        query_str = query_bundle.query_str
        resp = requests.get(
            "https://api.searchhive.dev/v1/swift-search",
            headers={"Authorization": f"Bearer {SEARCHHIVE_KEY}"},
            params={"query": query_str, "limit": 5},
        )
        nodes = []
        for r in resp.json().get("results", []):
            doc = Document(
                text=r["snippet"],
                metadata={"title": r["title"], "url": r["url"], "source": "web_search"},
            )
            nodes.append(NodeWithScore(node=doc, score=1.0))
        return nodes

retriever = WebSearchRetriever()

# Use as a retriever in your RAG pipeline
from llama_index.core.query_engine import RetrieverQueryEngine

query_engine = RetrieverQueryEngine(retriever=retriever)
response = query_engine.query("What are the latest features in LlamaIndex 0.12?")
print(response)
```
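The retriever above hands the LLM only search snippets, each tagged with its source URL in the node metadata. To see what the synthesized context roughly looks like before it reaches the LLM, here is a small standalone sketch — no network calls, and `build_context` plus the sample results are illustrative inventions, not part of the SearchHive response or the LlamaIndex API — that flattens title/url/snippet dicts into a citation-style context string:

```python
# Hypothetical helper: turn SearchHive-style result dicts into one context
# string with numbered source citations, ready to prepend to an LLM prompt.
def build_context(results: list[dict], max_chars: int = 2000) -> str:
    blocks = []
    for i, r in enumerate(results, start=1):
        blocks.append(f"[{i}] {r['title']} ({r['url']})\n{r['snippet']}")
    context = "\n\n".join(blocks)
    # Trim to a rough character budget so the prompt stays bounded
    return context[:max_chars]

# Illustrative sample data in the shape the retriever consumes
results = [
    {"title": "LlamaIndex 0.12 release notes", "url": "https://example.com/a",
     "snippet": "Workflow improvements and new agent APIs."},
    {"title": "LlamaIndex docs", "url": "https://example.com/b",
     "snippet": "Custom retrievers subclass BaseRetriever."},
]
print(build_context(results))
```

Keeping the URL next to each snippet makes it easy to prompt the LLM to cite `[1]`, `[2]`, etc. in its answer.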
## Approach 2: FunctionTool (Agent Integration)

For LlamaIndex agent workflows, wrap the search API as a `FunctionTool`:
```python
import requests
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

SEARCHHIVE_KEY = "your-searchhive-key"

def swift_search(query: str) -> str:
    """Search the web for current information."""
    resp = requests.get(
        "https://api.searchhive.dev/v1/swift-search",
        headers={"Authorization": f"Bearer {SEARCHHIVE_KEY}"},
        params={"query": query, "limit": 5},
    )
    results = []
    for r in resp.json().get("results", []):
        results.append(f"{r['title']}\n  {r['url']}\n  {r['snippet']}")
    return "\n\n".join(results)

def scrape_forge(url: str) -> str:
    """Extract clean content from a web page as markdown."""
    resp = requests.post(
        "https://api.searchhive.dev/v1/scrape-forge",
        headers={"Authorization": f"Bearer {SEARCHHIVE_KEY}", "Content-Type": "application/json"},
        json={"url": url, "format": "markdown"},
    )
    return resp.json().get("content", "Failed to scrape")[:4000]

def deep_dive(query: str) -> str:
    """Run comprehensive research with multi-source synthesis."""
    resp = requests.get(
        "https://api.searchhive.dev/v1/deep-dive",
        headers={"Authorization": f"Bearer {SEARCHHIVE_KEY}"},
        params={"query": query},
    )
    return resp.json().get("summary", "No results")[:3000]

# Create FunctionTool instances
search_tool = FunctionTool.from_defaults(
    fn=swift_search,
    name="web_search",
    description="Search the web for current information. Use for factual lookups, pricing, news.",
)
scrape_tool = FunctionTool.from_defaults(
    fn=scrape_forge,
    name="page_scraper",
    description="Scrape a web page's content as markdown. Provide the full URL.",
)
research_tool = FunctionTool.from_defaults(
    fn=deep_dive,
    name="deep_research",
    description="Conduct in-depth research on a topic with synthesized multi-source analysis.",
)

# Build an agent with all three tools
agent = ReActAgent.from_tools(
    [search_tool, scrape_tool, research_tool],
    llm=OpenAI(model="gpt-4o"),
    verbose=True,
)

response = agent.chat("Compare the pricing of SearchHive vs SerpApi vs Tavily for 50K searches/month")
print(response)
```
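The tool functions above make live HTTP calls with no error handling, so a transient API failure surfaces to the agent as a raw exception mid-reasoning-loop. A common hardening step is to wrap each tool so failures are retried with backoff and, as a last resort, returned as a readable error string the agent can reason about. This is a generic sketch, not a SearchHive or LlamaIndex API — `with_retries` is a name invented here:

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 0.5):
    """Wrap a tool function so transient failures are retried with
    exponential backoff; the final failure becomes an error string
    instead of an exception that crashes the agent loop."""
    def wrapped(*args, **kwargs):
        for attempt in range(attempts):
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                if attempt == attempts - 1:
                    return f"Tool failed after {attempts} attempts: {exc}"
                time.sleep(base_delay * (2 ** attempt))
    return wrapped

# Demo with a flaky function standing in for a real API call:
# it raises twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

print(with_retries(flaky)())  # succeeds on the third attempt
```

In practice you would pass `with_retries(swift_search)` (and the other two functions) to `FunctionTool.from_defaults` instead of the bare functions.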
## Approach 3: Using LlamaHub ToolSpecs

LlamaHub provides pre-built integrations for some search APIs:
```python
# Tavily via LlamaHub (package: llama-index-tools-tavily-research)
from llama_index.tools.tavily_research import TavilyToolSpec

tavily_spec = TavilyToolSpec(api_key="your-tavily-key")
tavily_tools = tavily_spec.to_tool_list()
# Returns: tavily_search, tavily_extract, tavily_crawl

# DuckDuckGo via LlamaHub (package: llama-index-tools-duckduckgo)
from llama_index.tools.duckduckgo import DuckDuckGoSearchToolSpec

ddg_spec = DuckDuckGoSearchToolSpec()
ddg_tools = ddg_spec.to_tool_list()
```
Note: LlamaHub doesn't have a built-in SearchHive ToolSpec yet, but the `FunctionTool` approach above takes just a few lines of code and gives you more control over parameters and response formatting.
## Pricing Comparison for LlamaIndex Projects

Real-world costs for common LlamaIndex use cases:
| Use Case | Monthly Queries | Serper.dev | Tavily | Brave | SearchHive |
|---|---|---|---|---|---|
| Personal RAG | 1,000 | Free tier | Free tier | Free tier | Free tier |
| Startup prototype | 5,000 | $50 | $40 | $25 | $9 |
| Production app | 25,000 | $250 | $200 | $125 | $49 |
| Enterprise RAG | 100,000 | $1,000+ | $800+ | $500 | $199 |
SearchHive's credit system is especially efficient for RAG workflows where you mix search queries with page scraping. Each SwiftSearch costs 1 credit, each ScrapeForge costs 1 credit — the $49/month Builder plan gives you 100K credits, enough for a mix of searches and scrapes that would cost $200+ with separate providers.
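The credit math from the paragraph above is easy to sanity-check for your own workload. This is a minimal sketch assuming the stated rates (1 credit per SwiftSearch and 1 credit per ScrapeForge call, 100K credits on the $49/month Builder plan); the function names are invented for illustration:

```python
# Credit rates as stated in the article: 1 credit per search, 1 per scrape
CREDITS_PER_SEARCH = 1
CREDITS_PER_SCRAPE = 1

def credits_needed(searches: int, scrapes: int) -> int:
    """Total monthly credits for a mixed search + scrape workload."""
    return searches * CREDITS_PER_SEARCH + scrapes * CREDITS_PER_SCRAPE

def fits_builder_plan(searches: int, scrapes: int, plan_credits: int = 100_000) -> bool:
    """Does the workload fit within the Builder plan's credit pool?"""
    return credits_needed(searches, scrapes) <= plan_credits

# e.g. a RAG pipeline doing 25K searches and 60K page scrapes a month
monthly = credits_needed(25_000, 60_000)
print(monthly, fits_builder_plan(25_000, 60_000))  # 85000 True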
## Feature-by-Feature Comparison
| Feature | SearchHive | SerpApi | Tavily | Serper |
|---|---|---|---|---|
| Web search | SwiftSearch (multi-engine) | Google, Bing, etc. | AI-optimized | Google |
| Page scraping | ScrapeForge | No | Extract tool | No |
| Deep research | DeepDive | No | No | No |
| Custom retriever | Yes | Yes | Yes | Yes |
| LlamaHub integration | FunctionTool | ToolSpec | ToolSpec | FunctionTool |
| Monthly quota (starter) | 5K/mo | 1K/mo | 1K/mo | 50K (one-time) |
| Response format | JSON + markdown | JSON | JSON | JSON |
## Verdict

For LlamaIndex web search integration:
- Quick prototyping: DuckDuckGo via LlamaHub — zero cost, zero setup
- Google SERP data: SerpApi — most comprehensive search engine coverage
- Production RAG with mixed workloads: SearchHive — search + scraping + deep research from one API key, at a fraction of the cost of running multiple providers
The biggest advantage of SearchHive for LlamaIndex is the ability to search, scrape pages for full content, and run deep research all from the same credit pool. Most LlamaIndex RAG pipelines need both search (to find relevant pages) and scraping (to extract content for the LLM context). With other providers, that requires two separate API subscriptions.
Get started with SearchHive's free tier — 500 credits, no credit card. See the API docs for complete integration guides.
Related: /compare/serpapi | /compare/tavily | /blog/langchain-web-search-integration