How to Scrape Google Maps Data with Python
Google Maps contains data on over 200 million businesses worldwide — names, addresses, phone numbers, websites, ratings, reviews, and hours. It's the most comprehensive business directory on the internet, making it a prime target for lead generation, competitive analysis, and local SEO research.
This tutorial covers every approach to extracting Google Maps data with Python — from the official Places API to browser automation — with honest assessments of costs, limitations, and legal considerations.
Key Takeaways
- The official Places API is the only legally safe method — starting at $5/1K requests for Essentials data
- Text Search (IDs Only) is free with no usage cap — get place IDs first, then pay only for the details you need
- Browser automation (Playwright/Selenium) technically violates Google's ToS and is increasingly unreliable
- Third-party APIs (Outscraper, Apify) handle the complexity for $1-3/1K records
- SearchHive's ScrapeForge provides a middle ground — reliable extraction with built-in anti-bot handling
Prerequisites
- Python 3.9+
- For official API: Google Cloud account + Places API enabled + API key
- For browser automation: Playwright (`pip install playwright && playwright install chromium`)
- For third-party APIs: Account with Outscraper, Apify, or similar
- A SearchHive API key (free tier available)
```bash
# Official API
pip install googlemaps

# Browser automation
pip install playwright
playwright install chromium

# Third-party
pip install outscraper

# SearchHive
pip install searchhive
```
Step 1: Official Google Places API
The official API is the most reliable and legally safe approach. The new Places API uses tiered pricing based on which fields you request.
Pricing Overview
| Data Tier | Cost per 1K Requests | Free Monthly Cap | Key Fields |
|---|---|---|---|
| Text Search (IDs Only) | Free | Unlimited | id, name, photos |
| Essentials | $5.00 | 10,000 | address, location, types |
| Pro | $17.00 | 5,000 | display name, business status, hours |
| Enterprise | $20.00 | 1,000 | phone, website, rating, reviews |
Important: Requesting fields from multiple tiers bills you at the highest applicable tier. Use field masks carefully.
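To make tier mixing concrete, here is a small hypothetical helper. The field-to-tier groupings below mirror the table above but are illustrative; verify the exact field-to-SKU mapping against Google's current pricing documentation before relying on it.

```python
# Illustrative tier groupings (check Google's SKU docs for the full mapping)
ESSENTIALS = {"formattedAddress", "location", "types"}
PRO = {"displayName", "businessStatus", "regularOpeningHours"}
ENTERPRISE = {"nationalPhoneNumber", "websiteUri", "rating", "reviews"}

def build_field_mask(fields):
    """Return a comma-separated field mask and the highest tier it bills at."""
    fields = set(fields)
    if fields & ENTERPRISE:
        tier = "Enterprise"
    elif fields & PRO:
        tier = "Pro"
    elif fields & ESSENTIALS:
        tier = "Essentials"
    else:
        tier = "IDs Only"
    return ",".join(sorted(fields)), tier

mask, tier = build_field_mask(["formattedAddress", "rating"])
print(tier)  # one Enterprise field bills the whole request at Enterprise
```

Dropping `rating` from that mask would bring the same request down to the Essentials tier — a 4x price difference for one field.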
Search for Places
```python
import requests
import json

API_KEY = "YOUR_GOOGLE_API_KEY"
BASE_URL = "https://places.googleapis.com/v1"

# Step 1: Text Search — FREE (returns IDs only)
def search_places(query, region="us"):
    """Search for places using the free Text Search (IDs Only)."""
    headers = {
        "X-Goog-Api-Key": API_KEY,
        # searchText field masks use the "places." prefix
        "X-Goog-FieldMask": "places.id,places.displayName,places.formattedAddress",
        "Content-Type": "application/json",
    }
    body = {
        "textQuery": query,
        "regionCode": region,
        "pageSize": 20,
    }
    resp = requests.post(f"{BASE_URL}/places:searchText", json=body, headers=headers)
    resp.raise_for_status()
    return resp.json()

# Search for restaurants in NYC
results = search_places("pizza restaurants in New York City")
for place in results.get("places", []):
    print(f"{place.get('displayName', {}).get('text', 'N/A')}")
    print(f"  ID: {place['id']}")
    print(f"  Address: {place.get('formattedAddress', 'N/A')}")
    print()
```
Get Place Details
```python
def get_place_details(place_id, fields=None):
    """Get detailed information about a place."""
    if fields is None:
        # Enterprise fields (phone, website, rating, reviews)
        fields = (
            "id,displayName,formattedAddress,nationalPhoneNumber,"
            "websiteUri,rating,userRatingCount,"
            "regularOpeningHours,currentOpeningHours,"
            "businessStatus,primaryTypeDisplayName"
        )
    headers = {
        "X-Goog-Api-Key": API_KEY,
        "X-Goog-FieldMask": fields,
    }
    resp = requests.get(f"{BASE_URL}/places/{place_id}", headers=headers)
    resp.raise_for_status()
    return resp.json()

# Get details for each place found
place_ids = [p["id"] for p in results.get("places", [])]
for pid in place_ids[:5]:
    details = get_place_details(pid)
    name = details.get("displayName", {}).get("text", "N/A")
    phone = details.get("nationalPhoneNumber", "N/A")
    website = details.get("websiteUri", "N/A")
    rating = details.get("rating", "N/A")
    reviews = details.get("userRatingCount", 0)
    status = details.get("businessStatus", "N/A")
    print(f"{name}")
    print(f"  Phone: {phone}")
    print(f"  Website: {website}")
    print(f"  Rating: {rating} ({reviews} reviews)")
    print(f"  Status: {status}")
    print()
```
Using the googlemaps Python Library
```python
import json

import googlemaps

gmaps = googlemaps.Client(key=API_KEY)

# Text search (legacy API)
places = gmaps.places("coffee shops in San Francisco")
for place in places["results"][:5]:
    place_id = place["place_id"]
    print(f"{place['name']}: {place.get('formatted_address', 'N/A')}")

# Place details (for the last place found above)
details = gmaps.place(
    place_id,
    fields=["name", "formatted_address", "international_phone_number",
            "website", "rating", "user_ratings_total", "opening_hours",
            "geometry", "types"]
)
print(json.dumps(details, indent=2))

# Nearby search
places_nearby = gmaps.places_nearby(
    location=(37.7749, -122.4194),  # San Francisco
    radius=5000,
    type="restaurant"
)
print(f"Found {len(places_nearby['results'])} restaurants nearby")
```
Step 2: Pagination and Bulk Extraction
The Places API returns at most 20 results per search request. For larger datasets, use pagination:
```python
def search_all_places(query, max_results=100):
    """Search with pagination to get more results."""
    headers = {
        "X-Goog-Api-Key": API_KEY,
        # nextPageToken must be requested in the field mask for pagination
        "X-Goog-FieldMask": "places.id,places.displayName,places.formattedAddress,nextPageToken",
        "Content-Type": "application/json",
    }
    all_places = []
    next_page_token = None
    while len(all_places) < max_results:
        body = {"textQuery": query, "pageSize": 20}
        if next_page_token:
            body["pageToken"] = next_page_token
        resp = requests.post(f"{BASE_URL}/places:searchText", json=body, headers=headers)
        resp.raise_for_status()
        data = resp.json()
        places = data.get("places", [])
        all_places.extend(places)
        next_page_token = data.get("nextPageToken")
        print(f"  Fetched {len(all_places)} places so far...")
        if not next_page_token or not places:
            break
    return all_places[:max_results]

# Get up to 60 results (3 pages)
places = search_all_places("dentists in Chicago", max_results=60)
print(f"Total: {len(places)} places found")
```
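Paginated searches can occasionally return overlapping pages (see the pagination note under Common Issues), so it's worth deduplicating by place ID before fetching details. A minimal helper:

```python
def dedupe_places(places):
    """Drop duplicate places, keeping the first occurrence, keyed on the stable place ID."""
    seen = set()
    unique = []
    for p in places:
        pid = p.get("id")
        if pid and pid not in seen:
            seen.add(pid)
            unique.append(p)
    return unique

pages = [{"id": "a"}, {"id": "b"}, {"id": "a"}]
print(len(dedupe_places(pages)))  # 2
```

Since detail requests are the expensive part, deduplicating first saves money as well as time.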
Step 3: Third-Party APIs (Simpler, More Data)
Third-party services handle the complexity of Maps scraping — proxy rotation, CAPTCHAs, pagination — for a per-record fee.
Outscraper
```python
from outscraper import ApiClient

client = ApiClient(api_key="YOUR_OUTSCRAPER_KEY")

# Search Google Maps
results = client.google_maps_search(
    "restaurants in Austin TX",
    limit=20,
    fields=[
        "name", "full_address", "phone", "site", "rating",
        "reviews", "reviews_per_score", "type", "category",
        "opening_hours", "located_in", "photos"
    ]
)

# results holds one list of places per query; take the first query's places
for place in results[0]:
    print(f"{place.get('name')}: {place.get('rating')} stars")
    print(f"  Phone: {place.get('phone')}")
    print(f"  Website: {place.get('site')}")
    print(f"  Reviews: {place.get('reviews')} total")
```
Outscraper pricing: First 500 records free, then $1-3/1K records depending on volume.
Step 4: SearchHive ScrapeForge Approach
SearchHive offers a middle ground — more capable than basic HTTP requests, simpler than browser automation, with built-in anti-bot handling.
```python
from searchhive import ScrapeForge

scraper = ScrapeForge(api_key="your_searchhive_key")

# Scrape Google Maps search results
results = scraper.extract(
    urls=["https://www.google.com/maps/search/plumbers+in+Dallas"],
    renderer="playwright",  # Handles JavaScript rendering
    extract={
        "businesses": {
            "name": ".qBF1Pd",
            "rating": ".MW4etd",
            "category": ".fontHeadlineSmall",
            "address": ".W4Efsd:last-child",
            "phone": ".UsdlK",
        }
    }
)

for page in results:
    for biz in page.get("businesses", []):
        print(f"{biz.get('name', 'N/A')}")
        print(f"  Rating: {biz.get('rating', 'N/A')}")
        print(f"  Category: {biz.get('category', 'N/A')}")
```
SearchHive for Business Website Enrichment
After getting business listings, enrich each one by analyzing its actual website:
```python
from searchhive import DeepDive

dd = DeepDive(api_key="your_searchhive_key")

# Analyze a business's website for competitive intelligence
analysis = dd.analyze(
    url="https://example-plumber.com",
    extract_features=True,
    summarize=True
)

print(f"Summary: {analysis.get('summary', 'N/A')}")
print(f"Services: {analysis.get('features', [])}")
```
Step 5: Complete Maps Data Pipeline
Wire everything into a reusable pipeline that searches, fetches details, and exports the results:
```python
import json
import csv
import time
from datetime import datetime

# Assumes search_all_places() and get_place_details() from the earlier steps

class GoogleMapsPipeline:
    """Complete Google Maps data extraction pipeline."""

    def __init__(self, api_key, searchhive_key=None):
        self.api_key = api_key
        self.searchhive_key = searchhive_key
        self.results = []

    def search_and_detail(self, query, max_results=60, detail_fields=None):
        """Search places and get details for each."""
        print(f"Searching: {query}")
        places = search_all_places(query, max_results=max_results)
        print(f"Found {len(places)} places. Fetching details...")
        for i, place in enumerate(places):
            place_id = place["id"]
            try:
                details = get_place_details(place_id, fields=detail_fields)
                record = {
                    "name": details.get("displayName", {}).get("text", ""),
                    "address": details.get("formattedAddress", ""),
                    "phone": details.get("nationalPhoneNumber", ""),
                    "website": details.get("websiteUri", ""),
                    "rating": details.get("rating"),
                    "review_count": details.get("userRatingCount", 0),
                    "business_status": details.get("businessStatus", ""),
                    "type": details.get("primaryTypeDisplayName", {}).get("text", ""),
                    "hours": self._format_hours(details.get("currentOpeningHours", {})),
                    "scraped_at": datetime.now().isoformat(),
                }
                self.results.append(record)
                # Throttle to roughly 6-7 detail requests per second
                time.sleep(0.15)
                if (i + 1) % 10 == 0:
                    print(f"  Detailed {i + 1}/{len(places)} places")
            except Exception as e:
                print(f"  Error fetching details for {place_id}: {e}")
        print(f"Complete: {len(self.results)} places with details")
        return self.results

    def _format_hours(self, hours_data):
        """Format opening hours into a readable string."""
        if not hours_data or "periods" not in hours_data:
            return "Not available"
        periods = hours_data.get("periods", [])
        if not periods:
            return "24 hours"

        def fmt(point):
            # New Places API time points carry "hour" and "minute" keys
            if "hour" not in point:
                return "?"
            return f"{point['hour']:02d}:{point.get('minute', 0):02d}"

        # Return the first period as an example
        p = periods[0]
        return f"{fmt(p.get('open', {}))}-{fmt(p.get('close', {}))}"

    def save_csv(self, filename="maps_data.csv"):
        """Export results to CSV."""
        if not self.results:
            print("No results to save")
            return
        fieldnames = list(self.results[0].keys())
        with open(filename, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(self.results)
        print(f"Saved {len(self.results)} records to {filename}")

    def save_json(self, filename="maps_data.json"):
        with open(filename, "w", encoding="utf-8") as f:
            json.dump(self.results, f, indent=2, ensure_ascii=False)
        print(f"Saved {len(self.results)} records to {filename}")

# Usage
pipeline = GoogleMapsPipeline(
    api_key="YOUR_GOOGLE_API_KEY",
    searchhive_key="your_searchhive_key"
)
pipeline.search_and_detail(
    query="marketing agencies in Los Angeles",
    max_results=40
)
pipeline.save_csv()
pipeline.save_json()
```
Cost Comparison
| Approach | Cost per 1K Records | Setup Effort | Legality | Data Quality |
|---|---|---|---|---|
| Official API (search + Enterprise details) | ~$20-25 | Low | Fully legal | High |
| Official API (search + Essentials details) | ~$5 | Low | Fully legal | Medium |
| Outscraper | $1-3 | Very low | Gray area | High |
| Apify Actors | ~$1.50 | Very low | Gray area | High |
| SearchHive ScrapeForge | ~$2-5 | Low | Gray area | Good |
| Browser automation (Playwright) | $0 (infra only) | High | ToS violation | Variable |
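A quick way to budget the official-API path: search pages are free (IDs Only), so the cost is dominated by detail requests. A back-of-envelope estimator using the per-1K prices from the table above (verify current pricing before budgeting):

```python
def estimate_cost(n_records, detail_price_per_1k, search_price_per_1k=0.0):
    """Estimate official-API cost: free ID searches plus per-record detail calls."""
    searches = -(-n_records // 20)  # 20 results per search page, rounded up
    return (searches * search_price_per_1k + n_records * detail_price_per_1k) / 1000

print(f"${estimate_cost(5000, 20.0):.2f}")  # 5,000 Enterprise-tier detail calls
```

At Enterprise pricing that's $100 for 5,000 records; the same pull at the Essentials tier would be $25, which is why field masks matter so much.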
Common Issues
OVER_QUERY_LIMIT / 429 Errors
Cause: You exceeded your monthly free quota or per-second rate limit. The legacy API reports the OVER_QUERY_LIMIT status; the new Places API returns HTTP 429. Fix: Check billing and quotas in your Google Cloud Console. The default limit is 200 QPS; add a 5-10ms delay between requests if you're hitting it.
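Beyond a fixed delay, retrying with exponential backoff handles transient rate limiting more gracefully. A minimal sketch; it keys on the error message purely for illustration, and in real code you would catch `requests.HTTPError` and check for status 429:

```python
import time

def with_backoff(call, max_retries=4, base_delay=0.5):
    """Retry a zero-argument callable on rate-limit errors with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError as e:  # swap in requests.HTTPError for real API calls
            if "rate" not in str(e).lower() or attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...

# e.g. details = with_backoff(lambda: get_place_details(pid))
```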
Missing Data Fields
Cause: Not all fields are available for all places. Fix: Handle missing fields gracefully with defaults. Some businesses don't have phone numbers, websites, or hours listed.
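A small accessor makes the "missing field" handling uniform. Note that localized fields in the new API (like displayName) nest their value under a "text" key, which this helper unwraps:

```python
def safe_text(details, field, default="N/A"):
    """Read a possibly-missing field; localized fields nest their value under 'text'."""
    value = details.get(field, default)
    if isinstance(value, dict):
        return value.get("text", default)
    return value

record = {"displayName": {"text": "Joe's Pizza"}, "rating": 4.5}
print(safe_text(record, "displayName"))  # Joe's Pizza
print(safe_text(record, "websiteUri"))   # N/A
```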
Pagination Returns Same Results
Cause: Google's nextPageToken can be flaky. Fix: Wait 2-3 seconds between pagination requests. Tokens expire after a few minutes.
CAPTCHAs During Browser Automation
Cause: Google detects automated browsing. Fix: Use residential proxies, playwright-stealth, random delays between actions, and rotate user agents. Consider using a third-party API instead — it's cheaper than the engineering time.
Different Results by Region
Cause: Google Maps personalizes results by location. Fix: Use the regionCode parameter in the official API. For browser automation, use proxies in the target region.
Legal Considerations
- Google Maps Platform ToS prohibits scraping maps.google.com directly
- The official API is the only authorized method for programmatic access
- Places API data cannot be cached (except place_id) or redistributed without displaying on a Google Map
- Attribution (Google logo, terms link) is required wherever data is displayed
- Reviews require author attribution (name + link to profile)
- Using third-party APIs (Outscraper, Apify) shifts technical risk but doesn't eliminate ToS concerns
- For personal/research use, risk is lower; for commercial use at scale, use the official API
Next Steps
- Start with the free Text Search — get place IDs at zero cost, then selectively fetch details
- Use field masks to control costs — only request fields you actually need
- Combine with SearchHive — enrich business listings with website analysis and competitive intelligence
- Set up monthly budgets in Google Cloud Console to prevent billing surprises
- Schedule regular scrapes for monitoring — businesses change hours, phone numbers, and close permanently
Need Maps data without the API complexity? Start with SearchHive's free tier — 100 free requests per month for web scraping and content analysis. Check the API docs for integration guides.
See also: How to build a lead generation scraper | SearchHive vs SerpApi | Web scraping best practices