Complete Guide to Automation for Competitive Analysis
Competitive analysis is one of those tasks everyone knows they should do but nobody wants to spend hours on. Manually checking competitor websites, tracking pricing changes, monitoring feature launches, and analyzing market positioning is tedious, error-prone, and impossible to scale.
Automating competitive analysis changes the equation entirely. Instead of spending a day every quarter researching competitors, you build a system that runs continuously and alerts you to changes as they happen.
Background
Most companies do competitive analysis the same way: a spreadsheet, a quarterly meeting, and someone's best guess at what competitors are up to. This approach misses real-time changes -- a competitor drops their price by 30%, launches a new feature, or pivots their messaging, and you don't find out until the next quarterly review.
The problem isn't lack of data. The web is full of competitive intelligence -- pricing pages, job postings, customer reviews, patent filings, press releases, and product changelogs. The problem is extracting that data at scale and turning it into actionable insights.
Challenge: Why Manual Analysis Fails
- Too many sources -- each competitor has a website, blog, social media, job board, review sites, and documentation
- Changes happen constantly -- pricing pages update, features ship, positioning shifts
- Data is unstructured -- pricing is in different formats, features use different names, positioning is buried in marketing copy
- Humans are slow -- by the time you compile a competitive analysis, it's already outdated
Solution: Automated Pipeline with SearchHive
The core idea: build a pipeline that periodically fetches competitive data, normalizes it, and surfaces changes. SearchHive provides the three APIs you need:
- SwiftSearch to discover competitive content (news, reviews, mentions)
- ScrapeForge to extract structured data from competitor pages (pricing, features)
- DeepDive to research specific questions and synthesize from multiple sources
Here's a complete implementation:
Step 1: Define Your Competitors and Data Points
competitors = {
    "vercel": {
        "pricing_url": "https://vercel.com/pricing",
        "features_url": "https://vercel.com/features",
        "changelog_url": "https://vercel.com/changelog",
        "search_queries": ["Vercel pricing", "Vercel new features", "Vercel vs"]
    },
    "netlify": {
        "pricing_url": "https://www.netlify.com/pricing/",
        "features_url": "https://www.netlify.com/features/",
        "changelog_url": "https://www.netlify.com/changelog/",
        "search_queries": ["Netlify pricing", "Netlify new features", "Netlify vs"]
    },
    "cloudflare-pages": {
        "pricing_url": "https://www.cloudflare.com/plans/",
        "features_url": "https://developers.cloudflare.com/pages/",
        "changelog_url": "https://blog.cloudflare.com/",
        "search_queries": ["Cloudflare Pages pricing", "Cloudflare Pages features"]
    }
}
Step 2: Scrape Competitor Pricing Pages
import requests
import hashlib
import json
from datetime import datetime
API_KEY = "your-searchhive-key"
BASE = "https://api.searchhive.dev/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
def scrape_competitor_page(url, competitor_name):
    """Scrape a competitor page and return content with metadata."""
    resp = requests.post(f"{BASE}/scrapeforge", headers=HEADERS, json={
        "url": url,
        "format": "markdown"
    })
    content = resp.json().get("content", "")
    content_hash = hashlib.sha256(content.encode()).hexdigest()
    return {
        "competitor": competitor_name,
        "url": url,
        "content": content,
        "content_hash": content_hash,
        "scraped_at": datetime.utcnow().isoformat(),
        "content_length": len(content)
    }

# Scrape all pricing pages
for name, data in competitors.items():
    result = scrape_competitor_page(data["pricing_url"], name)
    print(f"{name}: {result['content_length']} chars, hash={result['content_hash'][:12]}")
Step 3: Detect Changes
Store the content hash each time you scrape. When the hash changes, the page was updated.
import os

STATE_FILE = "competitive_state.json"

def load_state():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {}

def save_state(state):
    with open(STATE_FILE, "w") as f:
        json.dump(state, f, indent=2)

def check_for_changes():
    """Scrape all competitor pages and detect changes."""
    state = load_state()
    changes = []
    for name, data in competitors.items():
        for url_type in ["pricing_url", "features_url"]:
            url = data[url_type]
            key = f"{name}:{url_type}"
            result = scrape_competitor_page(url, name)
            if key in state:
                if state[key]["content_hash"] != result["content_hash"]:
                    changes.append({
                        "type": "page_change",
                        "competitor": name,
                        "url_type": url_type,
                        "url": url,
                        "previous_hash": state[key]["content_hash"],
                        "new_hash": result["content_hash"],
                        "detected_at": datetime.utcnow().isoformat()
                    })
            state[key] = {
                "content_hash": result["content_hash"],
                "last_scraped": result["scraped_at"]
            }
    save_state(state)
    return changes
Step 4: Monitor Competitor Mentions
Use SwiftSearch to track when competitors appear in the news, on review sites, or in comparison articles.
def monitor_mentions(competitor_name):
    """Search for recent mentions of a competitor."""
    resp = requests.get(f"{BASE}/swiftsearch", headers={
        "Authorization": f"Bearer {API_KEY}"
    }, params={
        "q": f"{competitor_name} pricing OR features OR launch OR update 2025",
        "engine": "google",
        "num": 10,
        "tbs": "qdr:w"  # Past week
    })
    results = resp.json().get("organic", [])
    return [
        {
            "title": r["title"],
            "url": r["url"],
            "snippet": r.get("snippet", ""),
            "competitor": competitor_name
        }
        for r in results
    ]
Step 5: Deep Research on Specific Questions
When you need a thorough analysis of a specific competitive topic, use DeepDive to synthesize information from multiple sources.
def competitive_deep_research(question):
    """Generate a comprehensive competitive analysis report."""
    resp = requests.post(f"{BASE}/deepdive", headers=HEADERS, json={
        "query": question,
        "max_results": 15
    })
    data = resp.json()
    return {
        "question": question,
        "summary": data.get("summary", ""),
        "sources": data.get("sources", []),
        "generated_at": datetime.utcnow().isoformat()
    }

# Example: Research a specific competitive question
report = competitive_deep_research(
    "How do Vercel Edge Functions compare to Cloudflare Workers in terms of "
    "pricing, cold start latency, and supported runtimes?"
)
print(report["summary"])
Step 6: Put It All Together
def run_competitive_analysis():
    """Run the full competitive analysis pipeline."""
    print("=== Competitive Analysis Report ===")
    print(f"Generated: {datetime.utcnow().strftime('%Y-%m-%d %H:%M UTC')}\n")

    # 1. Check for page changes
    print("--- Page Changes ---")
    changes = check_for_changes()
    if changes:
        for c in changes:
            print(f"  CHANGE: {c['competitor']} {c['url_type']} updated")
            print(f"    URL: {c['url']}")
    else:
        print("  No changes detected since last run.")
    print()

    # 2. Monitor mentions
    print("--- Recent Mentions ---")
    for name in competitors:
        mentions = monitor_mentions(name)
        if mentions:
            print(f"  {name.upper()}:")
            for m in mentions[:3]:
                print(f"    - {m['title']} ({m['url']})")
    print()

    # 3. Generate analysis
    print("--- Strategic Analysis ---")
    analysis = competitive_deep_research(
        "Compare the current pricing and positioning of Vercel, Netlify, "
        "and Cloudflare Pages for developer deployment platforms in 2025"
    )
    print(analysis["summary"])

if __name__ == "__main__":
    run_competitive_analysis()
Implementation Tips
Run on a schedule. Set up a cron job or GitHub Action to run this daily or weekly. Use the state file to detect changes and only alert when something new is found.
Store historical data. Beyond just the hash, store the actual scraped content in a database (SQLite works fine). This lets you compare pricing across time, track feature additions, and build trend charts.
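As a sketch of that idea, the snapshots returned by `scrape_competitor_page` could be written to a small SQLite table (the table and column names here are illustrative, not part of SearchHive):

```python
import sqlite3

def init_db(path="competitive_history.db"):
    """Open (or create) the snapshot database."""
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS snapshots (
        competitor   TEXT,
        url          TEXT,
        content      TEXT,
        content_hash TEXT,
        scraped_at   TEXT)""")
    return conn

def save_snapshot(conn, result):
    """Persist one scrape result so pricing can be compared over time."""
    conn.execute(
        "INSERT INTO snapshots VALUES (?, ?, ?, ?, ?)",
        (result["competitor"], result["url"], result["content"],
         result["content_hash"], result["scraped_at"]))
    conn.commit()
```

With every scrape stored, a later query like `SELECT content FROM snapshots WHERE competitor = 'vercel' ORDER BY scraped_at` gives you the raw material for trend charts.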
Add alerting. When a change is detected, send a Slack message, email, or webhook notification. Don't wait for someone to check the report.
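A minimal Slack version might look like this, using only the standard library; `format_change_alert` and `send_slack_alert` are hypothetical helpers, and the webhook URL comes from your own Slack app configuration:

```python
import json
import urllib.request

def format_change_alert(change):
    """Turn a change record from check_for_changes() into a Slack message."""
    page = change["url_type"].replace("_url", "")
    return (f":rotating_light: {change['competitor']} updated their "
            f"{page} page: {change['url']}")

def send_slack_alert(webhook_url, change):
    """POST the alert to a Slack incoming webhook. Returns True on HTTP 200."""
    payload = {"text": format_change_alert(change)}
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status == 200
```

Calling `send_slack_alert(webhook, c)` for each `c` in the list returned by `check_for_changes()` turns the report into a push notification.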
Normalize pricing data. Pricing pages are messy. Write simple parsers to extract plan names, prices, and features from the markdown output. Even basic regex patterns work better than manual checking.
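For example, a single regex can pull dollar amounts and billing periods out of the scraped markdown; this pattern is a rough sketch that handles common formats like `$20/month` or `$150 per seat`, and real pricing pages will need more cases:

```python
import re

# Longer alternatives first so "month" isn't truncated to "mo".
PRICE_RE = re.compile(
    r"\$(\d+(?:\.\d{2})?)\s*(?:/|per\s+)(month|mo|year|yr|user|seat)",
    re.IGNORECASE)

def extract_prices(markdown):
    """Return (amount, period) tuples found in scraped pricing markdown."""
    return [(float(m.group(1)), m.group(2).lower())
            for m in PRICE_RE.finditer(markdown)]
```

Running `extract_prices` over two snapshots of the same page and diffing the tuples tells you exactly which price moved, not just that the page changed.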
Combine with review sites. Scrape G2, Capterra, and Product Hunt for user sentiment about competitors. This data is gold for positioning.
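Once review pages are scraped (the same `scrape_competitor_page` call works on a G2 or Capterra URL), even a crude keyword tally gives a first signal; the word lists below are arbitrary examples standing in for a real sentiment model:

```python
import re

# Illustrative keyword sets -- tune these for your product category.
POSITIVE = {"love", "great", "excellent", "easy", "fast"}
NEGATIVE = {"slow", "expensive", "confusing", "bug", "downtime"}

def sentiment_counts(markdown):
    """Count positive and negative keywords in scraped review markdown."""
    words = re.findall(r"[a-z]+", markdown.lower())
    return {
        "positive": sum(w in POSITIVE for w in words),
        "negative": sum(w in NEGATIVE for w in words),
    }
```

Tracking these counts per competitor over time surfaces shifts in user sentiment long before they show up in churn numbers.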
Results
A well-built competitive analysis automation system delivers:
- Daily change detection -- know within 24 hours when a competitor updates pricing or features
- Weekly intelligence reports -- synthesized summaries of competitive landscape
- Historical trend data -- track how competitor positioning evolves over months
- Alert-driven workflow -- get notified immediately about important changes, not at the next quarterly meeting
Lessons Learned
- Start with 2-3 competitors. Don't try to monitor 20 companies from day one. Build the pipeline for 2-3, verify it works, then expand.
- Content hashes are your friend. They make change detection trivial and don't require storing full page content for comparison.
- SearchHive's unified API eliminates integration headaches. No need for separate search, scraping, and analysis tools. One API key, one billing relationship, one SDK.
- Markdown output is easier to parse than HTML. SearchHive's ScrapeForge returns clean markdown, which is far easier to extract data from than raw HTML with nested divs.
- DeepDive is surprisingly good for competitive research. It synthesizes from multiple sources and catches things you'd miss doing manual searches.
Get started building your competitive intelligence pipeline with 500 free credits at searchhive.dev/pricing. No credit card required. Check the docs for API reference. See also /blog/best-serp-api-competitors and /blog/how-to-build-competitive-intelligence-dashboard.