Scraping Google search results is one of the most common data extraction tasks. You need it for SEO monitoring, competitor analysis, keyword research, and building search-based tools. But Google aggressively blocks scrapers -- CAPTCHAs, rate limiting, and DOM changes make raw HTTP scraping unreliable within hours.
This guide covers the three realistic approaches: using a dedicated SERP API (recommended), building your own with SearchHive as infrastructure, and what to avoid.
Key Takeaways
- Raw scraping Google directly is a losing battle. Google detects and blocks scrapers within minutes, and the HTML structure changes frequently.
- SERP APIs are the standard approach. They handle proxy rotation, CAPTCHAs, and HTML parsing, returning structured JSON results.
- SearchHive's ScrapeForge + SwiftSearch offers a cost-effective alternative to premium SERP APIs for most use cases.
- For light use, free-tier SERP APIs give you 100-2,500 searches/month. For production workloads, budget $25-50/month.
Prerequisites
- Python 3.8+
- `requests` library (`pip install requests`)
- A SearchHive API key (free signup with 500 credits)
Step 1: Understand Why Direct Scraping Fails
Before choosing an approach, understand what you're up against:
- CAPTCHAs -- After a handful of requests from the same IP, Google serves a CAPTCHA page instead of results.
- Rate limiting -- Google throttles requests per IP, then blocks entirely.
- DOM instability -- Google frequently updates its result page structure. CSS selectors that work today break next week.
- Personalization -- Results vary by location, device, and search history. Controlling these variables requires specific parameters.
A naive approach using requests and BeautifulSoup:
```python
# DO NOT DO THIS IN PRODUCTION
import requests
from bs4 import BeautifulSoup

url = "https://www.google.com/search?q=web+scraping+api"
resp = requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
soup = BeautifulSoup(resp.text, "html.parser")

# This will break. Google obfuscates class names and changes structure constantly.
for g in soup.select(".g"):
    title = g.select_one("h3").text
    link = g.select_one("a")["href"]
    print(f"{title}: {link}")
```
This might work for 5-10 requests, then Google serves a CAPTCHA. Not viable for anything beyond testing.
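If you do experiment with direct requests, it helps to detect when Google has stopped serving real results. A minimal heuristic sketch, assuming the block page contains phrasing like "unusual traffic" or redirects through `/sorry/` (both are observed behavior and can change without notice):

```python
def looks_blocked(resp):
    """Heuristic: guess whether Google served a block page instead of results.

    The status code and phrases checked here are assumptions based on
    commonly observed block pages -- Google may change both at any time.
    """
    if resp.status_code == 429:  # explicit rate-limit response
        return True
    body = resp.text.lower()
    # Google's CAPTCHA interstitial typically mentions "unusual traffic"
    # and lives under a /sorry/ path.
    return "unusual traffic" in body or "/sorry/" in resp.url
```

Treat a `True` result as a signal to stop entirely, not to retry faster.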
Step 2: Use SearchHive SwiftSearch API (Recommended)
SearchHive's SwiftSearch API provides Google-like search results with structured output. It handles the proxy and parsing complexity internally:
```python
import requests

API_KEY = "your-api-key"
BASE_URL = "https://api.searchhive.dev/v1"

def search_google(query, num_results=10, country="us", language="en"):
    """Get Google-style search results via SwiftSearch API.

    Args:
        query: the search query string
        num_results: how many results to request
        country: two-letter country code
        language: two-letter language code

    Returns:
        A list of search result dictionaries.
    """
    response = requests.post(
        f"{BASE_URL}/search",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "query": query,
            "num_results": num_results,
            "country": country,
            "language": language,
        },
    )
    if response.status_code == 200:
        data = response.json()
        return data.get("results", data.get("organic_results", []))
    raise Exception(f"Search failed: {response.status_code} - {response.text}")

# Basic search
results = search_google("best web scraping API 2026")
for r in results:
    print(f"{r.get('title', 'No title')}")
    print(f"  URL: {r.get('url', r.get('link', 'N/A'))}")
    print(f"  Snippet: {r.get('snippet', r.get('description', ''))[:100]}...")
    print()
```
Advanced: Tracking Rankings Over Time
```python
import datetime
import json

def track_keyword_rankings(keywords, target_domain, api_key):
    """Track where a domain ranks for specific keywords."""
    rankings = []
    for keyword in keywords:
        try:
            results = requests.post(
                f"{BASE_URL}/search",
                headers={"Authorization": f"Bearer {api_key}"},
                json={"query": keyword, "num_results": 20},
            ).json()
            organic = results.get("results", results.get("organic_results", []))
            position = None
            for i, r in enumerate(organic, 1):
                url = r.get("url", r.get("link", ""))
                if target_domain in url:
                    position = i
                    break
            rankings.append({
                "keyword": keyword,
                "position": position,
                "found": position is not None,
                "checked_at": datetime.datetime.now().isoformat(),
            })
        except Exception as e:
            rankings.append({
                "keyword": keyword,
                "position": None,
                "error": str(e),
            })
    return rankings

# Example: track your site's rankings
data = track_keyword_rankings(
    keywords=["web scraping api", "serp api python", "google search scraper"],
    target_domain="searchhive.dev",
    api_key=API_KEY,
)
for r in data:
    if r.get("found"):
        print(f"Keyword '{r['keyword']}' ranks #{r['position']}")
    else:
        print(f"Keyword '{r['keyword']}' not found in top 20")
```
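A single check is only a snapshot; to see movement you need to persist each run. A minimal sketch using a JSON-lines file -- the filename and file layout are arbitrary illustrative choices, not part of the SearchHive API:

```python
import json
from pathlib import Path

def append_rankings(rankings, path="rankings_history.jsonl"):
    """Append one JSON line per ranking row so positions can be charted
    over time. Each call adds to the file rather than overwriting it."""
    with Path(path).open("a", encoding="utf-8") as f:
        for row in rankings:
            f.write(json.dumps(row) + "\n")

def load_history(path="rankings_history.jsonl"):
    """Read every recorded check back into a list of dicts."""
    p = Path(path)
    if not p.exists():
        return []
    return [json.loads(line) for line in p.read_text(encoding="utf-8").splitlines() if line]
```

Run `append_rankings(data)` after each `track_keyword_rankings` call (daily via cron, for example), then plot `position` by `checked_at` per keyword.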
Step 3: Use SearchHive DeepDive for SERP Analysis
When you need more than just organic results -- featured snippets, People Also Ask, knowledge panels -- use DeepDive:
```python
def analyze_serp_features(query):
    """Extract all SERP features from a search results page."""
    response = requests.post(
        f"{BASE_URL}/deepdive",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "url": f"https://www.google.com/search?q={query}",
            "prompt": (
                "Extract all visible SERP features: featured snippet, "
                "People Also Ask questions, knowledge panel info, "
                "top 10 organic results with title, URL, and snippet, "
                "and any ads shown. Return as structured JSON."
            ),
        },
    )
    return response.json()

serp_data = analyze_serp_features("searchhive web scraping api")
print(json.dumps(serp_data, indent=2))
```
Step 4: Comparing SERP API Providers
If you specifically need Google SERP data (with exact ranking positions, local pack data, etc.), here's how the main providers compare:
| Provider | Free Tier | Base Price | Per-Search Cost | Rate Limits |
|---|---|---|---|---|
| SearchHive | 500 credits | $9/mo | ~$0.002/credit | Per-plan limits |
| SerpAPI | 100 searches/mo | $50/mo (5K) | $0.005-$0.01 | 5-50/sec |
| Serper.dev | 2,500 searches | $50/mo (50K) | $0.001-$0.01 | Varies by plan |
| Brave Search API | $5 free/mo | $5/1K searches | $0.005/search | 15/sec |
| Google Custom Search | 100 queries/day | $5/1K queries | $0.005/query | 100/sec |
| Tavily | 1,000 searches/mo | $0.008/credit | $0.008/search | Varies |
When to use which:
- SearchHive -- Best for general-purpose search + scraping combined. One API for search, scraping, and deep extraction.
- Serper.dev -- Cheapest for high-volume Google SERP specifically. Good if you only need organic results.
- SerpAPI -- Most comprehensive SERP features (local, images, news, shopping). Highest price but most complete.
- Brave Search API -- Independent search index. Not Google results, but good for privacy-focused tools.
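To sanity-check a budget, you can plug the per-search figures from the table above into a quick estimate. Real pricing is tiered and changes often, so treat these numbers as illustrative rather than quotes:

```python
# Approximate per-search costs taken from the comparison table above.
PER_SEARCH_COST = {
    "SearchHive": 0.002,
    "SerpAPI": 0.01,
    "Serper.dev": 0.001,
    "Brave Search API": 0.005,
    "Google Custom Search": 0.005,
    "Tavily": 0.008,
}

def estimate_monthly_cost(searches_per_month):
    """Return {provider: estimated USD/month}, cheapest first."""
    costs = {p: round(c * searches_per_month, 2) for p, c in PER_SEARCH_COST.items()}
    return dict(sorted(costs.items(), key=lambda kv: kv[1]))
```

For example, `estimate_monthly_cost(10_000)` shows Serper.dev cheapest at this volume for pure SERP data, with SearchHive close behind while also covering scraping.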
Step 5: Build a Keyword Research Tool
Here's a practical example combining search and scraping:
```python
class KeywordResearcher:
    def __init__(self, api_key):
        self.api_key = api_key
        self.base_url = "https://api.searchhive.dev/v1"
        self.headers = {"Authorization": f"Bearer {api_key}"}

    def get_related_keywords(self, seed_keyword):
        """Search for a keyword and extract People Also Ask questions
        and related searches from the SERP."""
        response = requests.post(
            f"{self.base_url}/deepdive",
            headers=self.headers,
            json={
                "url": f"https://www.google.com/search?q={seed_keyword}",
                "prompt": (
                    "Extract all 'People Also Ask' questions, "
                    "'Related searches' at the bottom, and "
                    "the top 5 organic result titles and URLs."
                ),
            },
        )
        return response.json()

    def analyze_competitor_content(self, url, focus_keyword):
        """Analyze a competitor's page for keyword optimization."""
        response = requests.post(
            f"{self.base_url}/deepdive",
            headers=self.headers,
            json={
                "url": url,
                "prompt": (
                    f"Analyze this page for the keyword '{focus_keyword}'. "
                    "Extract: title tag, H1, meta description, "
                    "word count estimate, number of H2 headings, "
                    "and whether the keyword appears in the first paragraph."
                ),
            },
        )
        return response.json()

    def research(self, keyword, analyze_top_n=3):
        """Full keyword research workflow."""
        serp = self.get_related_keywords(keyword)
        top_urls = []
        results = serp.get("organic_results", serp.get("top_results", []))
        for r in results[:analyze_top_n]:
            url = r.get("url", r.get("link", ""))
            if url:
                analysis = self.analyze_competitor_content(url, keyword)
                top_urls.append({"url": url, "analysis": analysis})
        return {
            "keyword": keyword,
            "related_questions": serp.get("people_also_ask", []),
            "related_searches": serp.get("related_searches", []),
            "competitor_analysis": top_urls,
        }

researcher = KeywordResearcher(API_KEY)
report = researcher.research("web scraping api python")
print(json.dumps(report, indent=2))
```
Common Issues
Rate limiting
Start with small batches and add delays between requests. Most APIs enforce rate limits per second, not per minute.
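A simple way to stay under a per-second limit is exponential backoff on HTTP 429 responses. A sketch that works with any `requests`-style session; the retry count and delay values are illustrative defaults, not documented SearchHive limits:

```python
import random
import time

def post_with_backoff(session, url, max_retries=5, base_delay=1.0, **kwargs):
    """POST with exponential backoff on HTTP 429 (rate-limited) responses.

    Accepts anything with a .post() method, e.g. a requests.Session.
    Returns the first non-429 response, or the last response if all
    retries were rate-limited.
    """
    resp = None
    for attempt in range(max_retries):
        resp = session.post(url, **kwargs)
        if resp.status_code != 429:
            return resp
        # Wait 1s, 2s, 4s, ... plus a little jitter so parallel
        # workers don't all retry at the same instant.
        time.sleep(base_delay * (2 ** attempt) + random.random() * 0.1)
    return resp
```

Usage: `post_with_backoff(requests.Session(), f"{BASE_URL}/search", headers=..., json=...)` in place of a bare `requests.post`.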
Inconsistent result formats
Different search engines and APIs return results in different formats. Always normalize your parsing code to handle variations:
```python
def normalize_result(raw):
    """Normalize search results from any API format."""
    return {
        "title": raw.get("title") or raw.get("name") or "",
        "url": raw.get("url") or raw.get("link") or "",
        "snippet": raw.get("snippet") or raw.get("description") or "",
    }
```
Google changing result structure
This is why you use an API instead of scraping directly. APIs handle the parsing layer and update their extraction logic when Google changes the DOM. With SearchHive's DeepDive, the AI-based extraction adapts to structural changes automatically.
Next Steps
- Start with 500 free credits: Sign up at searchhive.dev and test SwiftSearch for your SERP monitoring needs.
- Build your first keyword tracker: Use the code above to track rankings for your target keywords.
- Explore the full API: Check searchhive.dev/docs for SwiftSearch, ScrapeForge, and DeepDive documentation.
Related: /blog/best-serp-api-alternatives-for-developers | /compare/serpapi | /compare/serper