Research-driven applications need access to academic papers, citations, and metadata. Whether you're building a literature review tool, an AI research assistant, or a citation graph, the quality of your academic search API determines the quality of your output.
The landscape has shifted in 2026. Google Scholar still doesn't offer an official API (third-party wrappers remain the only option). PubMed's E-utilities are free but cover only biomedical literature. Semantic Scholar provides a well-designed, free API with AI-powered features. And new commercial options have entered the market.
This guide compares the available academic search APIs head-to-head with real pricing, feature matrices, and code examples.
Key Takeaways
- Semantic Scholar API is the best free option — 100 requests/second, AI-powered paper recommendations, and structured citation data
- PubMed E-utilities remains the gold standard for biomedical literature — free, reliable, but biomedical-only
- Google Scholar has no official API — third-party wrappers (SerpApi, SearchHive) proxy it, with varying reliability
- CrossRef API provides free DOI metadata and citation counts across all disciplines
- SearchHive SwiftSearch can extract Google Scholar results when you need broader academic coverage
1. Semantic Scholar — Best Free Academic API
Semantic Scholar, backed by the Allen Institute for AI, offers a comprehensive academic search API with natural language queries, paper recommendations, and citation graph traversal.
Pricing: Free. 100 requests/second for registered users. API key recommended but not required for low-volume use.
import requests

# Search for papers
resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/paper/search",
    params={
        "query": "transformer architecture attention mechanism",
        "limit": 10,
        "fields": "title,abstract,year,citationCount,authors,openAccessPdf,url"
    },
    timeout=30,
)
resp.raise_for_status()
papers = resp.json().get("data", [])
for paper in papers:
    print(f"[{paper.get('year')}] {paper['title']}")
    print(f" Citations: {paper.get('citationCount', 0)}")
    print(f" URL: {paper.get('url', 'N/A')}")
    print()

# Get paper details and references for the first search hit
if papers:
    paper_id = papers[0]["paperId"]  # paperId is returned with every search result
    paper_resp = requests.get(
        f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}",
        params={
            "fields": "title,abstract,references,citations,embedding,tldr"
        },
        timeout=30,
    )
Key features:
- Natural language paper search
- Citation graph traversal (references + citing papers)
- AI-generated TLDR summaries for papers
- Paper embeddings for similarity search
- Author and venue information
- Open access PDF links
Limitations: Coverage gaps in older papers (pre-2000), some disciplines less covered than CS/biomedicine, occasional rate limiting for unauthenticated requests.
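Because unauthenticated requests can be rate limited, one common way to cope is to retry on HTTP 429 with exponential backoff. The sketch below is an illustration only, not part of any Semantic Scholar client: `get_with_backoff`, its parameters, and the injected `fetch` and `sleep` callables are hypothetical names, chosen so the retry logic can be exercised without network access.

```python
import time
from typing import Callable, Tuple


def get_with_backoff(
    fetch: Callable[[], Tuple[int, dict]],
    max_retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until it stops returning HTTP 429 or retries run out.

    fetch is a hypothetical zero-argument callable that returns
    (status_code, parsed_json); wrap requests.get in it for real use.
    """
    for attempt in range(max_retries + 1):
        status, payload = fetch()
        if status != 429:
            return payload
        if attempt < max_retries:
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("still rate limited after retries")
```

In practice, `fetch` would be a small function that calls `requests.get` on the search endpoint and returns the status code together with the decoded JSON body. Keeping the HTTP call outside the helper makes the retry policy reusable across the other APIs in this guide.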
2. PubMed E-utilities — Best for Biomedical Research
PubMed's E-utilities API provides free access to the MEDLINE database — over 36 million biomedical citations. It's been the standard for biomedical research for decades.
Pricing: Free. No API key required. Rate limit: 3 requests/second without a key, 10/second with an API key.
import requests
import xml.etree.ElementTree as ET

# Search PubMed for papers
search_resp = requests.get(
    "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
    params={
        "db": "pubmed",
        "term": "CRISPR gene editing clinical trial 2024:2026[dp]",
        "retmax": 20,
        "sort": "relevance"
    },
    timeout=30,
)
ids = ET.fromstring(search_resp.text).findall(".//Id")
id_list = ",".join(i.text for i in ids)

# Fetch details for the results (retmode, not rettype, selects XML output)
fetch_resp = requests.get(
    "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi",
    params={"db": "pubmed", "id": id_list, "retmode": "xml"},
    timeout=30,
)
articles = ET.fromstring(fetch_resp.text).findall(".//PubmedArticle")
for article in articles:
    title_el = article.find(".//ArticleTitle")
    # itertext() keeps text inside nested markup such as <i> or <sub>
    title = "".join(title_el.itertext()).strip() if title_el is not None else ""
    print(title or "No title")
Strengths: Comprehensive biomedical coverage, free, reliable, well-documented
Weaknesses: Biomedical only, XML responses by default (EFetch has no JSON output for PubMed records), limited full-text access, strict rate limits
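The E-utilities rate limits quoted earlier (3 requests/second without a key, 10 with one) are easy to exceed in a loop. A minimal client-side throttle might look like the following. `Throttle` is a hypothetical helper written for this guide, not something E-utilities provides, and the clock and sleep functions are injectable only so the behaviour can be checked without actually waiting.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Space calls at least 1/rate seconds apart (rate = requests per second)."""

    def __init__(
        self,
        rate: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

A typical use would be to create `Throttle(rate=3)` once (or `rate=10` with an API key) and call `wait()` immediately before each `requests.get` against E-utilities.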
3. CrossRef API — Best for DOI Metadata and Citation Counts
CrossRef is a DOI registration agency for scholarly publications. Their API provides metadata for over 150 million scholarly records across all disciplines.
Pricing: Free. Requests that include a contact email (as a mailto query parameter, or in the User-Agent header) are routed to the polite pool, which gets higher rate limits.
import requests

# Search works by title keywords
resp = requests.get(
    "https://api.crossref.org/works",
    params={
        "query": "attention is all you need transformer",
        "rows": 5,
        "select": "DOI,title,author,issued,container-title,is-referenced-by-count",
        "mailto": "your@email.com"  # Polite pool: sent as a query parameter, not a header
    },
    timeout=30,
)
for item in resp.json().get("message", {}).get("items", []):
    title = (item.get("title") or ["Untitled"])[0]  # title is a list and may be missing
    print(title)
    print(f" DOI: {item.get('DOI', 'N/A')}")
    print(f" Citations: {item.get('is-referenced-by-count', 0)}")
    print()
Strengths: Cross-disciplinary, free, DOI resolution, citation counts, funding information
Weaknesses: Abstracts present only where publishers deposit them, no full text, limited search relevance, metadata quality varies by publisher
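One detail that often trips people up: CrossRef returns dates as nested `date-parts` arrays rather than strings, and the month and day may be absent. The helper below is a small sketch of how a publication year could be pulled out of a work record. The function name `publication_year` is made up for this example, and the order in which the date fields are tried is an assumption rather than an official recommendation.

```python
from typing import Optional


def publication_year(item: dict) -> Optional[int]:
    """Return the publication year from a CrossRef work record, or None."""
    for field in ("issued", "published-print", "published-online"):
        parts = (item.get(field) or {}).get("date-parts") or []
        # date-parts looks like [[2017, 6, 12]]; month and day are optional,
        # and the year itself can be null when the date is unknown
        if parts and parts[0] and parts[0][0] is not None:
            return int(parts[0][0])
    return None
```

Each date field only appears in the response if it is requested (or if `select` is omitted), so the helper returns `None` rather than raising when nothing usable is present.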
4. Google Scholar via SearchHive — Broadest Coverage
Google Scholar has no official API. SearchHive's SwiftSearch can extract Google Scholar results, giving you access to its broad interdisciplinary coverage.
import requests

API_KEY = "your-searchhive-key"

# Search Google Scholar via SearchHive
resp = requests.get(
    "https://api.searchhive.dev/v1/swift/search",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={
        "query": "site:scholar.google.com transformer neural network efficiency",
        "limit": 10
    }
)
for result in resp.json().get("results", []):
    print(f"{result['title']}")
    print(f" {result['url']}")
    print(f" {result.get('description', '')[:150]}")
    print()
Strengths: Broadest academic coverage, familiar Google-quality results, works alongside other SearchHive APIs
Weaknesses: No structured citation data, dependent on Google's HTML structure, may hit anti-bot limits at high volume
5. SerpApi Scholar — Google Scholar Wrapper
SerpApi provides a structured API for Google Scholar results, parsing titles, authors, citations, and PDF links into clean JSON.
Pricing: Scholar searches count against your SerpApi plan. $25/mo for 1K total searches (not just Scholar).
import requests

resp = requests.get(
    "https://serpapi.com/search",
    params={
        "engine": "google_scholar",
        "q": "large language models survey 2026",
        "api_key": "YOUR_KEY"
    }
)
for organic in resp.json().get("organic_results", []):
    print(organic.get("title"))
    print(f" Cited by: {organic.get('inline_links', {}).get('cited_by', {}).get('total', 0)}")
Strengths: Structured Google Scholar data, handles anti-bot automatically
Weaknesses: Expensive, rate limits on high-volume academic search, no full-text access
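To put "expensive" in numbers: at the $25/mo for 1K searches quoted above, each Scholar query costs about 2.5 cents if the full quota is used, and a bulk crawl spans several billing months. The two functions below are a back-of-the-envelope sketch. Their names are illustrative, and the plan figures come from the pricing line in this section rather than from a live price list.

```python
def cost_per_search(monthly_price_usd: float, included_searches: int) -> float:
    """Effective price of one search when the whole monthly quota is used."""
    return monthly_price_usd / included_searches


def months_needed(total_queries: int, included_searches: int) -> int:
    """Billing months required to run total_queries within the monthly quota."""
    return -(-total_queries // included_searches)  # ceiling division


if __name__ == "__main__":
    print(cost_per_search(25, 1000))   # price per search on the quoted plan
    print(months_needed(2500, 1000))   # months for a 2,500-query crawl
```

Running the same arithmetic against the free APIs makes the trade-off plain: the wrapper only pays off when Google Scholar's coverage is something the free sources cannot supply.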
Comparison Table
| API | Free Tier | Rate Limit | Disciplines | Full Text | Citation Data | Output Format |
|---|---|---|---|---|---|---|
| Semantic Scholar | Yes (100 req/s) | 100 req/s | All (CS strongest) | Links to OA | Full graph | JSON |
| PubMed E-utilities | Yes | 3 req/s (10 with API key) | Biomedical | Links to PMC | Limited | XML |
| CrossRef | Yes (polite pool) | 50 req/s | All disciplines | No | Citation counts | JSON |
| SearchHive (Scholar) | 500 credits | Plan-based | All (via Google) | No | Snippets only | JSON |
| SerpApi Scholar | 250/mo | Plan-based | All (via Google) | No | Structured | JSON |
Choosing the Right Academic Search API
For AI/CS research: Start with Semantic Scholar. The API quality, citation graph, and TLDR summaries are purpose-built for this domain.
For biomedical research: PubMed is irreplaceable. Combine it with Semantic Scholar for broader context.
For cross-disciplinary research: Layer CrossRef (for DOIs and citation counts) under Semantic Scholar (for abstracts and recommendations).
When you need Google Scholar results: Use SearchHive's SwiftSearch with the site:scholar.google.com operator, which limits results to academic sources. Queries are billed through SearchHive's credit system.
For a complete research pipeline: Combine multiple APIs. Search with Semantic Scholar, validate DOIs with CrossRef, fetch biomedical details from PubMed, and use SearchHive DeepDive to extract data from paper pages.
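Combining sources usually comes down to joining records on a shared identifier, and the DOI is the natural key. Below is a rough sketch of that join. It assumes the Semantic Scholar records were requested with the `externalIds` field and that the CrossRef records carry `DOI` and `is-referenced-by-count`. `normalize_doi` and `merge_by_doi` are hypothetical helpers written for this example, not functions from either API.

```python
from typing import Dict, List, Optional

_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: Optional[str]) -> Optional[str]:
    """Lower-case a DOI and strip URL-style prefixes so variants compare equal."""
    if not doi:
        return None
    doi = doi.strip().lower()  # DOIs are case-insensitive
    for prefix in _DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi or None


def merge_by_doi(s2_papers: List[dict], crossref_items: List[dict]) -> Dict[str, dict]:
    """Join Semantic Scholar and CrossRef records on their normalized DOI."""
    merged: Dict[str, dict] = {}
    for paper in s2_papers:
        doi = normalize_doi((paper.get("externalIds") or {}).get("DOI"))
        if doi:
            merged[doi] = {
                "title": paper.get("title"),
                "s2_citations": paper.get("citationCount"),
            }
    for item in crossref_items:
        doi = normalize_doi(item.get("DOI"))
        if doi:
            # CrossRef titles are lists; fall back to None if absent
            entry = merged.setdefault(doi, {"title": (item.get("title") or [None])[0]})
            entry["crossref_citations"] = item.get("is-referenced-by-count")
    return merged
```

Papers without a DOI are silently dropped here. A fuller pipeline would need a fallback key, such as a PubMed ID or a normalized title, for those records.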
Getting Started
Semantic Scholar requires only a free API key for high-volume use. PubMed needs nothing. SearchHive gives you 500 free credits to test Scholar search alongside scraping and extraction.
Sign up for SearchHive or explore the Semantic Scholar API docs and PubMed E-utilities guide.