LinkedIn is the richest B2B data source on the internet — 1 billion members, detailed professional profiles, company pages, job postings, and engagement metrics. That makes it a goldmine for lead generation, sales intelligence, and recruiting. It also makes it one of the hardest platforms to scrape, with aggressive anti-bot systems and legal action against scrapers.
This post covers the tools that actually work for LinkedIn data extraction in 2026, focusing on what data they can access, pricing, and compliance considerations.
Key Takeaways
- Bright Data offers the most reliable LinkedIn datasets through their pre-collected data products — no real-time scraping, but legally defensible
- Apify has the most mature LinkedIn scraping actors (LinkedIn Profile Scraper, Company Scraper, Search Scraper) with active maintenance
- SearchHive ScrapeForge can scrape public LinkedIn profiles when combined with its anti-bot bypass — cheaper but requires more care
- Phantombuster is the only tool covered here that automates a full browser through logged-in LinkedIn sessions
- Legal risk is real: hiQ v. LinkedIn established some protections for scraping public data, but LinkedIn actively enforces its terms
The Legal Landscape (Brief)
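In April 2022 the Ninth Circuit held in hiQ v. LinkedIn that scraping publicly available data likely does not violate the Computer Fraud and Abuse Act (CFAA). That ruling did not settle the contract question: later in 2022 the district court found that hiQ had breached LinkedIn's User Agreement, and the case ended in a settlement. LinkedIn's terms of service still prohibit scraping, and the company actively sends cease-and-desist letters and blocks scrapers.
The practical implication: scraping public profile pages is defensible under the CFAA in the US (per hiQ) but can still expose you to breach-of-contract claims, while scraping behind-login data (messages, connections, InMail) has no comparable protection. Using pre-built datasets from established data providers (Bright Data, Oxylabs) carries less legal risk than scraping yourself.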
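One way to keep a scraping job on the public side of that line is to filter URLs before they reach any tool. The sketch below is a minimal example under stated assumptions: `PUBLIC_PREFIXES` and `is_public_candidate` are hypothetical names, and the prefix list is an illustration rather than LinkedIn policy. Whether a given profile is actually visible without login still depends on the member's settings.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: path prefixes for pages that are typically
# viewable without logging in. This is an assumption, not LinkedIn policy.
PUBLIC_PREFIXES = ("/in/", "/company/")


def is_public_candidate(url: str) -> bool:
    """Return True if the URL looks like a public profile or company page."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme != "https":
        return False
    # Accept linkedin.com and its subdomains only.
    if host != "linkedin.com" and not host.endswith(".linkedin.com"):
        return False
    return parsed.path.startswith(PUBLIC_PREFIXES)


urls = [
    "https://www.linkedin.com/in/example-profile/",
    "https://www.linkedin.com/messaging/thread/123/",
    "https://www.linkedin.com/company/example-co/",
]
allowed = [u for u in urls if is_public_candidate(u)]
print(allowed)
```

A filter like this only narrows the input list; it does not make a scrape compliant on its own.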
1. Bright Data LinkedIn Dataset
Bright Data offers pre-collected LinkedIn datasets rather than real-time scraping. They maintain a continuously updated database of LinkedIn profiles and company pages.
Available data:
- Profile data: name, title, company, location, education, skills, experience
- Company data: name, industry, size, revenue, headquarters, specialties
- Updated on a rolling basis (frequency varies by tier)
from brightdata import BrightData

# Illustrative only: the class and method names follow this post's example
# and may not match the current Bright Data SDK; check the official docs.
client = BrightData(token="your_token")

# Access LinkedIn datasets
dataset = client.datasets.get("linkedin_profiles", filters={
    "industry": "technology",
    "location": "United States",
    "job_title": "software engineer"
})

for profile in dataset:
    print(f"{profile['name']} - {profile['title']} at {profile['company']}")
Pricing: Pay-per-record or subscription. Typically $0.01-0.10 per profile depending on data depth and volume. Contact sales for custom pricing.
Pros: Legally sourced, continuously updated, no scraping risk. Cons: Not real-time, limited to what's in their current dataset, expensive at scale.
2. Apify LinkedIn Actors
Apify has the most comprehensive collection of LinkedIn scraping actors on the market. They're community-maintained, regularly updated, and cover the full range of LinkedIn data types.
Key actors:
- LinkedIn Profile Scraper — Extracts full profile data (name, title, experience, education, skills, certifications)
- LinkedIn Company Scraper — Company page data (size, industry, specialties, posts)
- LinkedIn Search Scraper — Searches LinkedIn for profiles matching criteria
- LinkedIn Posts Scraper — Extracts posts and engagement metrics from profiles or companies
from apify_client import ApifyClient

client = ApifyClient("your_token")

# Scrape LinkedIn profiles. The actor ID and input field names follow this
# post's example; check the actor's input schema before running.
run = client.actor("apify/linkedin-profile-scraper").call(run_input={
    "profileUrls": [
        "https://www.linkedin.com/in/example-profile/",
        "https://www.linkedin.com/in/another-profile/"
    ],
    "startUrls": [{"url": "https://www.linkedin.com/in/example-profile/"}]
})

# Read results from the run's default dataset
dataset = client.dataset(run["defaultDatasetId"])
for item in dataset.iterate_items():
    # currentCompany may be missing or null, so fall back to an empty dict
    company = (item.get("currentCompany") or {}).get("name", "N/A")
    print(f"{item.get('fullName', 'N/A')} - {item.get('title', 'N/A')}")
    print(f"  Company: {company}")
    print(f"  Location: {item.get('location', 'N/A')}")
Pricing: $49/mo (Starter, 50 compute units). A LinkedIn profile scrape typically uses 0.1-0.5 CU, so $49/mo gets you roughly 100-500 profiles.
Pros: Mature ecosystem, active maintenance, covers all LinkedIn page types. Cons: Opaque CU pricing, actors can break when LinkedIn changes their HTML, requires Apify account.
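Because actors can break silently when LinkedIn changes its markup, it may help to check each run's output for missing fields before trusting it. The following is a minimal sketch, not part of Apify's tooling: `REQUIRED_FIELDS`, `missing_fields`, and `breakage_rate` are hypothetical names, the field names are taken from the example above, and the 20% threshold is an arbitrary assumption.

```python
# Field names taken from the example above; real actor output may differ.
REQUIRED_FIELDS = ("fullName", "title", "location")


def missing_fields(item: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required fields that are absent or empty in a scraped item."""
    return [name for name in required if not item.get(name)]


def breakage_rate(items: list, required=REQUIRED_FIELDS) -> float:
    """Fraction of items missing at least one required field."""
    if not items:
        return 1.0  # An empty result set is treated as a failed run.
    broken = sum(1 for item in items if missing_fields(item, required))
    return broken / len(items)


sample = [
    {"fullName": "Jane Doe", "title": "Senior Engineer", "location": "Austin"},
    {"fullName": "John Roe", "title": "", "location": None},
]
rate = breakage_rate(sample)
if rate > 0.2:  # Hypothetical threshold; tune to your own tolerance.
    print(f"Warning: {rate:.0%} of items are missing required fields")
```

In practice you would pass the items read from the run's dataset instead of `sample`, and alert or pause the job when the rate jumps.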
3. SearchHive ScrapeForge
SearchHive's ScrapeForge can be used to scrape public LinkedIn profile pages through its anti-bot bypass and residential proxy infrastructure.
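import os
import requests

# Read the key from the environment (SEARCHHIVE_API_KEY is an assumed variable name)
API_KEY = os.environ["SEARCHHIVE_API_KEY"]
headers = {"Authorization": f"Bearer {API_KEY}"}

# Scrape a public LinkedIn profile
resp = requests.post(
    "https://api.searchhive.dev/v1/scrapeforge",
    headers=headers,
    json={
        "url": "https://www.linkedin.com/in/public-profile/",
        "output_format": "markdown",
        "js_render": True,
        "country": "us"
    },
    timeout=60
)
resp.raise_for_status()  # Fail loudly on blocked or failed requests
data = resp.json()
print(data["markdown"])  # Clean markdown of the profile page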
For structured extraction from LinkedIn profiles, combine with DeepDive:
# Reuses `requests` and `headers` from the previous snippet
resp = requests.post(
    "https://api.searchhive.dev/v1/deepdive",
    headers=headers,
    json={
        "url": "https://www.linkedin.com/in/public-profile/",
        "extract": {
            "name": "string",
            "title": "string",
            "company": "string",
            "location": "string",
            "experience": "array"
        }
    },
    timeout=60
)
resp.raise_for_status()
profile = resp.json()
print(profile)
# {"name": "Jane Doe", "title": "Senior Engineer", ...}
Pricing: $9/mo (Starter, 5K credits) to $49/mo (Builder, 100K credits). A LinkedIn profile scrape costs roughly 2-5 credits depending on JS rendering complexity.
Pros: Cheapest option per profile, combined with search and extraction. Cons: Requires careful handling — LinkedIn aggressively blocks scrapers, and success rates vary. Not a dedicated LinkedIn tool.
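The plan prices and credit estimates quoted above translate into a per-profile cost with a little arithmetic. This sketch simply divides price by credits and multiplies by credits used; `PLANS`, `CREDITS_PER_PROFILE`, and `cost_per_profile` are hypothetical names, and it assumes every credit goes to a successful scrape, which blocked requests will make optimistic.

```python
# Plan figures are the ones quoted above: (monthly price in USD, credits).
PLANS = {
    "Starter": (9.0, 5_000),
    "Builder": (49.0, 100_000),
}
CREDITS_PER_PROFILE = (2, 5)  # Low and high estimates from the text.


def cost_per_profile(price: float, credits: int, credits_used: int) -> float:
    """USD cost of one profile scrape, assuming the whole plan is consumed."""
    return price / credits * credits_used


for name, (price, credits) in PLANS.items():
    low = cost_per_profile(price, credits, CREDITS_PER_PROFILE[0])
    high = cost_per_profile(price, credits, CREDITS_PER_PROFILE[1])
    # Worst case: every profile costs the high estimate.
    min_profiles = credits // CREDITS_PER_PROFILE[1]
    print(f"{name}: ${low:.4f}-${high:.4f} per profile, "
          f"at least {min_profiles} profiles/month")
```

The gap between tiers is large, so the effective per-profile cost depends heavily on which plan you are on and how many requests fail.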
4. Phantombuster
Phantombuster takes a different approach: it automates real LinkedIn browser sessions. Instead of sending anonymous requests to LinkedIn's public pages, it controls a browser that's logged into a real LinkedIn account.
Available actions:
- Profile scraping (your connections, search results)
- Auto-connect and follow
- Message sending
- Post liking and commenting
- Company page data extraction
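# Phantombuster uses a visual workflow builder, not a Python SDK,
# but it has a REST API for triggering workflows.
import requests

resp = requests.post(
    "https://api.phantombuster.com/api/v2/agents/launch",
    headers={"X-Phantombuster-Key": "your_key"},
    json={
        # Placeholder ID; use the ID of an agent configured in your account.
        # Field names follow this post's example; verify against the API docs.
        "agentId": "linkedin-profile-scraper",
        "argument": {"profileUrl": "https://linkedin.com/in/target/"}
    },
    timeout=30
)
resp.raise_for_status()
print(resp.json())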
Pricing: $69/mo (Starter, 2 slots) to $360/mo (Enterprise, 20 slots).
Pros: Operates through real sessions, can access behind-login data. Cons: Requires a LinkedIn account, risks account bans, no Python SDK (REST API only), limited concurrent operations.
5. Outscraper
Outscraper provides a LinkedIn data API that returns structured profile and company data.
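import requests

# Endpoint and parameter names follow this post's example;
# verify them against the Outscraper API docs.
resp = requests.get(
    "https://api.outscraper.com/linked-in/profiles",
    params={
        "api_key": "your_key",
        "query": "software engineer San Francisco",
        "limit": 10
    },
    timeout=60
)
resp.raise_for_status()
for profile in resp.json():
    print(f"{profile['name']} - {profile['title']}")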
Pricing: Pay-as-you-go. Approximately $0.10-0.50 per profile depending on data depth.
Pros: Simple API, no compute unit complexity. Cons: Per-profile pricing gets expensive quickly, limited to pre-scraped data.
Comparison Table
| Tool | Data Types | Real-Time | Python SDK | Per-Profile Cost | Legal Risk |
|---|---|---|---|---|---|
| Bright Data | Profiles, companies | No (datasets) | Official | $0.01-0.10 | Low |
| Apify | All LinkedIn pages | Yes | Official | ~$0.10-0.50 | Medium |
| SearchHive | Public profiles | Yes | Yes | ~$0.001-0.009 | Medium |
| Phantombuster | All (via session) | Yes | REST only | ~$0.35-1.00 | High |
| Outscraper | Profiles, companies | No | Official | $0.10-0.50 | Low |
Recommendation
For enterprise lead generation: Bright Data's LinkedIn datasets. Pre-collected, legally sourced, continuously updated. Higher per-record cost but zero scraping risk and zero maintenance burden.
For startup/solo developer lead gen: Apify's LinkedIn actors provide the best balance of capability, maintenance, and pricing. The ecosystem is mature, the actors are well-maintained, and the Python SDK is solid.
For budget scraping with AI extraction: SearchHive ScrapeForge + DeepDive gives you the lowest per-profile cost with built-in structured extraction. Pair it with a delay between requests and rotate user agents to maintain success rates.
For automated outreach: Phantombuster is the only option that can both scrape and take action (connect, message). But the account ban risk is real, and you need dedicated LinkedIn accounts.
Whatever tool you choose, respect rate limits, avoid scraping behind-login pages unless you have explicit permission, and consult legal counsel for high-volume commercial use.
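The delay and user-agent rotation suggested above can be sketched as a small generator that paces requests. This is a minimal illustration, not any vendor's API: `paced_requests` and `USER_AGENTS` are hypothetical names, the user-agent strings are placeholders, and the delay values are assumptions you should tune to the rate limits of the tool you use.

```python
import itertools
import random
import time

# Hypothetical pool of user-agent strings; replace with current, real values.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_0) ExampleBrowser/1.0",
]


def paced_requests(urls, base_delay=5.0, jitter=2.0, sleep=time.sleep):
    """Yield (url, headers) pairs, pausing between consecutive URLs."""
    agents = itertools.cycle(USER_AGENTS)
    for index, url in enumerate(urls):
        if index > 0:
            # Randomized delay so requests do not arrive on a fixed beat.
            sleep(base_delay + random.uniform(0, jitter))
        yield url, {"User-Agent": next(agents)}


for url, headers in paced_requests(
    ["https://www.linkedin.com/in/public-profile/"], base_delay=5.0
):
    # Pass `url` and `headers` to whichever client you use.
    print(url, headers["User-Agent"])
```

The `sleep` parameter is injectable so the pacing logic can be tested without actually waiting.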
Get started with SearchHive — 500 free credits, scrape public profiles with anti-bot bypass built in.
Updated April 2026. This post is for informational purposes and does not constitute legal advice.