ScrapeForge API Reference

Enterprise Web Scraping

ScrapeForge provides enterprise-grade web scraping with JavaScript rendering, residential proxies, proxy rotation, and comprehensive data extraction.

POST https://www.searchhive.dev/api/v1/scrapeforge

Request Parameters

The examples use a demo key for testing. Replace it with your actual API key.

url: The target URL to scrape. Must be a valid HTTP/HTTPS URL.

render_js: Execute JavaScript on the page using the Chromium browser engine.

wait_for: CSS selector or XPath expression to wait for before scraping.

wait_time: Maximum time to wait for elements, in seconds (1-60).

A custom User-Agent string can also be supplied for the request.

proxy_type: Type of proxy to use for the request.

proxy_country: Target country for the proxy location (ISO 3166-1 alpha-2 code).

extract_links: Extract all links found on the page, with metadata.

extract_images: Extract all images with URLs, alt text, and dimensions.

extract_schema: Extract structured data (JSON-LD, Microdata, RDFa).

extract_meta: Extract meta tags, Open Graph, and Twitter Card data.

screenshot: Capture a screenshot of the page (requires render_js=true).

custom_headers: Custom HTTP headers to include with the request.

cookies: Cookies to include with the request.
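
The constraints stated above (HTTP/HTTPS URL, a 1-60 second wait, screenshots requiring JavaScript rendering) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the ScrapeForge API: the helper name build_scrape_payload is hypothetical, the field names are taken from the request example in this reference, and whether the server enforces the same rules is an assumption.

```python
from typing import Any, Dict, Optional


def build_scrape_payload(
    url: str,
    render_js: bool = False,
    wait_for: Optional[str] = None,
    wait_time: Optional[int] = None,
    screenshot: bool = False,
    **extra: Any,
) -> Dict[str, Any]:
    """Build a request body, checking the documented parameter constraints.

    Hypothetical client-side helper; the API itself may validate differently.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be a valid HTTP/HTTPS URL")
    if wait_time is not None and not 1 <= wait_time <= 60:
        raise ValueError("wait_time must be between 1 and 60 seconds")
    if screenshot and not render_js:
        raise ValueError("screenshot requires render_js=true")

    payload: Dict[str, Any] = {
        "url": url,
        "render_js": render_js,
        "screenshot": screenshot,
    }
    # Only send the optional wait settings when the caller provided them.
    if wait_for is not None:
        payload["wait_for"] = wait_for
    if wait_time is not None:
        payload["wait_time"] = wait_time
    # Pass through any other documented fields (proxy_type, extract_links, ...).
    payload.update(extra)
    return payload


payload = build_scrape_payload(
    "https://example.com/products",
    render_js=True,
    wait_for="#product-list",
    wait_time=10,
    proxy_type="residential",
)
print(payload)
```

The resulting dictionary can be passed as the json argument of requests.post.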

Code Examples
import requests

# Demo key for testing; replace with your actual API key.
API_KEY = "demo_sk_scrapeforge_67890"

response = requests.post(
    "https://www.searchhive.dev/api/v1/scrapeforge",
    headers={"Authorization": f"Bearer {API_KEY}"},
    # Python booleans (True/False) are serialized to JSON true/false.
    json={
        "url": "https://example.com/products",
        "render_js": True,
        "wait_for": "#product-list",
        "wait_time": 10,
        "proxy_type": "residential",
        "proxy_country": "US",
        "extract_links": True,
        "extract_images": False,
        "extract_schema": True,
        "extract_meta": True,
        "screenshot": False,
        "custom_headers": {},
        "cookies": {},
    },
    timeout=90,  # client-side timeout in seconds; the value is a suggestion
)

data = response.json()
print(data)

Response Schema

ScrapeForge Response Fields

Field (Type): Description

content (string): The raw HTML content of the scraped page.
Example: "<html><head>...</head><body>...</body></html>"

text_content (string): Plain text content extracted from the HTML.
Example: "Welcome to our product catalog..."

links (array): Array of link objects with URL, text, and attributes.
Example: [{"url": "...", "text": "...", "rel": "...", "target": "..."}]

images (array): Array of image objects with src, alt, and dimensions.
Example: [{"src": "...", "alt": "...", "width": 800, "height": 600}]

schema_data (array): Structured data found on the page.
Example: [{"@type": "Product", "name": "...", "price": "..."}]

meta_data (object): Meta tags, Open Graph, and Twitter Card data.
Example: {"title": "...", "description": "...", "og:image": "..."}

screenshot_url (string): URL of the captured screenshot (if enabled).
Example: "https://cdn.searchhive.com/screenshots/abc123.png"

load_time (float): Time taken to load and process the page, in seconds.
Example: 2.34

status_code (integer): HTTP status code returned by the target server.
Example: 200

final_url (string): Final URL after following redirects.
Example: "https://example.com/products"

credits_used (integer): Number of API credits consumed by this request.
Example: 7
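
As an illustration of reading these fields, the sketch below pulls a few of them out of a decoded response body. The function name summarize_response is hypothetical, the sample dictionary is hand-written to match the field list rather than real API output, and treating optional fields as possibly missing or null when the matching extraction flag is off is an assumption.

```python
from typing import Any, Dict, List


def summarize_response(data: Dict[str, Any]) -> Dict[str, Any]:
    """Pick a few documented fields out of a decoded response body."""
    # Assumption: extraction fields may be absent or null when not requested.
    links: List[Dict[str, Any]] = data.get("links") or []
    meta: Dict[str, Any] = data.get("meta_data") or {}
    return {
        "ok": data.get("status_code") == 200,
        "final_url": data.get("final_url"),
        "link_urls": [link["url"] for link in links if "url" in link],
        "title": meta.get("title"),
        "credits_used": data.get("credits_used", 0),
    }


# Hand-written sample shaped like the documented fields; not real API output.
sample = {
    "status_code": 200,
    "final_url": "https://example.com/products",
    "links": [
        {"url": "https://example.com/a", "text": "A"},
        {"text": "link object without a url key"},
    ],
    "meta_data": {"title": "Products"},
    "credits_used": 7,
}

summary = summarize_response(sample)
print(summary["link_urls"])
```

In real use, the input would be the result of response.json() from a scrape request.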

Enterprise Features

JavaScript Rendering
• Chromium browser engine
• Full ES6+ support
• DOM manipulation
• AJAX/Fetch requests

Proxy Network
• 100M+ residential IPs
• 200+ countries
• Automatic rotation
• High-speed datacenters

Anti-Detection
• Browser fingerprinting
• Bot detection bypass
• CAPTCHA handling
• Behavioral mimicking

Data Extraction
• Structured data parsing
• Link extraction
• Image processing
• Meta data analysis

Bulk Scraping

POST /v1/scrapeforge/bulk

Process up to 100 URLs in a single request, with intelligent load balancing and error handling.

{
  "urls": [
    "https://site1.com/page1",
    "https://site2.com/page2",
    "https://site3.com/page3"
  ],
  "render_js": true,
  "concurrent_requests": 3,
  "retry_failed": true,
  "max_retries": 2
}
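
Because a bulk request accepts at most 100 URLs, a longer list has to be split into several requests. The sketch below shows one way to do that on the client. The helper name bulk_payloads is hypothetical, and applying the same options (render_js, concurrent_requests, and so on) to every batch is a design choice of this example, not documented behavior.

```python
from typing import Any, Dict, Iterator, List

MAX_URLS_PER_BULK_REQUEST = 100  # limit stated in the bulk endpoint description


def bulk_payloads(urls: List[str], **options: Any) -> Iterator[Dict[str, Any]]:
    """Yield one bulk request body per batch of at most 100 URLs."""
    for start in range(0, len(urls), MAX_URLS_PER_BULK_REQUEST):
        batch = urls[start:start + MAX_URLS_PER_BULK_REQUEST]
        # Every batch carries the same option fields as the documented example.
        yield {"urls": batch, **options}


urls = [f"https://example.com/page/{i}" for i in range(250)]
batches = list(
    bulk_payloads(
        urls,
        render_js=True,
        concurrent_requests=3,
        retry_failed=True,
        max_retries=2,
    )
)
print([len(b["urls"]) for b in batches])
```

Each yielded dictionary can then be posted to the bulk endpoint as a JSON body.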

Status Codes

200 OK: Page scraped successfully.

400 Bad Request: Invalid URL or parameters.

403 Forbidden: Target site blocked the request.

504 Gateway Timeout: Target site took too long to respond.
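
One way a client might react to these codes is to retry the ones that can be transient and stop on the rest. The sketch below is an assumption-heavy illustration: treating 403 and 504 as retryable, the exponential backoff schedule, and the names RETRYABLE and scrape_with_retry are all choices of this example and are not specified by the API. The send callable stands in for whatever function performs the request and returns its status code.

```python
import time
from typing import Callable, Tuple

# Assumption: a blocked (403) or timed-out (504) request may succeed on retry,
# while 400 signals a problem with the request itself.
RETRYABLE = {403, 504}


def scrape_with_retry(
    send: Callable[[], int],
    max_retries: int = 2,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, int]:
    """Call send() until it returns a non-retryable status or retries run out.

    Returns the last status code and the number of attempts made.
    """
    attempts = 0
    while True:
        status = send()
        attempts += 1
        if status not in RETRYABLE or attempts > max_retries:
            return status, attempts
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** (attempts - 1))


# Simulated sequence of status codes instead of real network calls.
statuses = iter([504, 504, 200])
result = scrape_with_retry(lambda: next(statuses), sleep=lambda seconds: None)
print(result)
```

In real use, send would wrap the requests.post call and return response.status_code.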

Common Use Cases

E-commerce Data

Extract product details, prices, and inventory

{ "url": "https://shop.example.com/product/123", "render_js": true, "wait_for": ".price", "extract_schema": true, "extract_images": true }

News Articles

Extract article content and metadata

{ "url": "https://news.example.com/article/123", "extract_meta": true, "extract_links": true, "proxy_type": "residential" }

Social Media

Scrape posts, comments, and user profiles

{ "url": "https://social.example.com/profile/user", "render_js": true, "wait_for": ".posts-container", "screenshot": true, "proxy_type": "residential" }

Real Estate

Extract property listings and details

{ "url": "https://realty.example.com/listing/123", "extract_schema": true, "extract_images": true, "extract_meta": true, "proxy_country": "US" }

Credit Consumption

Base Costs
Basic scraping: 3 credits
JavaScript rendering: +5 credits
Residential proxy: +2 credits
Screenshot capture: +3 credits

Extraction Features
Link extraction: +1 credit
Image extraction: +1 credit
Schema extraction: +2 credits
Meta data extraction: +1 credit
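
The costs above can be combined into a rough per-request estimate. The sketch below assumes that the surcharges are simply additive and that each one is triggered by the matching request field (for example, render_js for JavaScript rendering and proxy_type set to "residential" for the residential proxy). That mapping and the function name estimate_credits are assumptions of this example; the credits_used field in the response remains the authoritative figure.

```python
from typing import Any, Dict

BASE_CREDITS = 3  # basic scraping

# Surcharge per boolean request flag, taken from the cost list above.
FLAG_CREDITS = {
    "render_js": 5,
    "screenshot": 3,
    "extract_links": 1,
    "extract_images": 1,
    "extract_schema": 2,
    "extract_meta": 1,
}
RESIDENTIAL_PROXY_CREDITS = 2


def estimate_credits(payload: Dict[str, Any]) -> int:
    """Estimate credits for one request, assuming surcharges are additive."""
    total = BASE_CREDITS
    for flag, cost in FLAG_CREDITS.items():
        if payload.get(flag):
            total += cost
    if payload.get("proxy_type") == "residential":
        total += RESIDENTIAL_PROXY_CREDITS
    return total


# Request body from the Python example in this reference.
example_payload = {
    "url": "https://example.com/products",
    "render_js": True,
    "proxy_type": "residential",
    "extract_links": True,
    "extract_images": False,
    "extract_schema": True,
    "extract_meta": True,
    "screenshot": False,
}
print(estimate_credits(example_payload))  # 3 + 5 + 2 + 1 + 2 + 1 = 14
```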