ScrapeForge API Reference

Interactive testing for enterprise web scraping

Enterprise Web Scraping

Test ScrapeForge's enterprise-grade scraping capabilities including JavaScript rendering, residential proxies, and comprehensive data extraction.

POST https://www.searchhive.dev/api/v1/scrapeforge

Scrape Any Website

Enterprise web scraping with JavaScript rendering, proxy rotation, and comprehensive data extraction

Request Builder

API Key

The examples on this page use a demo key for testing. Replace it with your actual API key.

Parameters

url

Required

The target URL to scrape. Must be a valid HTTP/HTTPS URL.

render_js

Execute JavaScript on the page using Chromium browser engine.

wait_for

CSS selector or XPath to wait for before scraping.

wait_time

Maximum time to wait for elements in seconds (1-60).

user_agent

Custom User-Agent string for the request.

proxy_type

Type of proxy to use for the request.

proxy_country

Target country for proxy location (ISO 3166-1 alpha-2).

extract_links

Extract all links found on the page with metadata.

extract_images

Extract all images with URLs, alt text, and dimensions.

extract_schema

Extract structured data (JSON-LD, Microdata, RDFa).

extract_meta

Extract meta tags, Open Graph, and Twitter Card data.

screenshot

Capture a screenshot of the page (requires render_js=true).

custom_headers

Custom HTTP headers to include with the request.

cookies

Cookies to include with the request.
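The constraints listed above (an HTTP/HTTPS URL, a wait_time of 1-60 seconds, and screenshot requiring render_js) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the ScrapeForge API: `build_scrape_payload` is a hypothetical helper name, and it only enforces the rules stated in the parameter descriptions.

```python
from urllib.parse import urlparse


def build_scrape_payload(url, render_js=False, wait_time=None,
                         screenshot=False, **options):
    """Hypothetical helper: validate documented constraints, return a JSON-ready body."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("url must be a valid HTTP/HTTPS URL")
    if wait_time is not None and not 1 <= wait_time <= 60:
        raise ValueError("wait_time must be between 1 and 60 seconds")
    if screenshot and not render_js:
        raise ValueError("screenshot requires render_js=True")

    payload = {"url": url, "render_js": render_js, "screenshot": screenshot}
    if wait_time is not None:
        payload["wait_time"] = wait_time
    # Remaining documented parameters (proxy_type, extract_links, ...) pass through as-is.
    payload.update(options)
    return payload
```

For example, `build_scrape_payload("https://example.com/products", render_js=True, wait_time=10, proxy_country="US")` returns a dictionary that can be passed as the `json=` argument of `requests.post`.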

Code Examples

import requests

API_KEY = "demo_sk_scrapeforge_67890"  # demo key; replace with your actual API key

response = requests.post(
    "https://www.searchhive.dev/api/v1/scrapeforge",
    headers={"Authorization": f"Bearer {API_KEY}"},
    # Python booleans (True/False) are serialized to JSON true/false.
    json={
        "url": "https://example.com/products",
        "render_js": True,
        "wait_for": "#product-list",
        "wait_time": 10,
        "proxy_type": "residential",
        "proxy_country": "US",
        "extract_links": True,
        "extract_images": False,
        "extract_schema": True,
        "extract_meta": True,
        "screenshot": False,
        "custom_headers": {},
        "cookies": {},
    },
    timeout=90,  # client-side timeout in seconds (an arbitrary choice, not an API requirement)
)

data = response.json()
print(data)

Response Schema

ScrapeForge Response Fields

Field	Type	Description
`content`	string	The raw HTML content of the scraped page. Example: `"<html><head>...</head><body>...</body></html>"`
`text_content`	string	Plain text content extracted from the HTML. Example: `"Welcome to our product catalog..."`
`links`	array	Array of link objects with URL, text, and attributes. Example: `[{"url": "...", "text": "...", "rel": "...", "target": "..."}]`
`images`	array	Array of image objects with src, alt, and dimensions. Example: `[{"src": "...", "alt": "...", "width": 800, "height": 600}]`
`schema_data`	array	Structured data found on the page. Example: `[{"@type": "Product", "name": "...", "price": "..."}]`
`meta_data`	object	Meta tags, Open Graph, and Twitter Card data. Example: `{"title": "...", "description": "...", "og:image": "..."}`
`screenshot_url`	string	URL of the captured screenshot (if enabled). Example: `"https://cdn.searchhive.com/screenshots/abc123.png"`
`load_time`	float	Time taken to load and process the page, in seconds. Example: `2.34`
`status_code`	integer	HTTP status code returned by the target server. Example: `200`
`final_url`	string	Final URL after following redirects. Example: `"https://example.com/products"`
`credits_used`	integer	Number of API credits consumed by this request. Example: `7`
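As an illustration of how these fields might be consumed, the sketch below condenses a response dictionary into a short summary. `summarize_response` is a hypothetical helper, and the sample values are made up in the shape of the examples in the table. It assumes that link URLs may be relative and resolves them against `final_url`, which the reference does not confirm.

```python
from urllib.parse import urljoin


def summarize_response(data):
    """Hypothetical helper: pull a few commonly used values out of a response dict."""
    base = data.get("final_url", "")
    # Assumption: link URLs may be relative, so resolve them against final_url.
    links = [urljoin(base, link["url"]) for link in data.get("links", []) if link.get("url")]
    products = [item.get("name") for item in data.get("schema_data", [])
                if item.get("@type") == "Product"]
    return {
        "ok": data.get("status_code") == 200,
        "links": links,
        "product_names": products,
        "credits_used": data.get("credits_used", 0),
    }


# Sample data shaped like the documented fields (values are illustrative only).
sample = {
    "status_code": 200,
    "final_url": "https://example.com/products",
    "links": [{"url": "/item/1", "text": "Item 1"}],
    "schema_data": [{"@type": "Product", "name": "Widget", "price": "9.99"}],
    "credits_used": 7,
}
summary = summarize_response(sample)
```

Using `.get()` with defaults keeps the helper tolerant of fields that are absent when the corresponding extraction option was not enabled.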

Enterprise Features

JavaScript Rendering

• Chromium browser engine

• Full ES6+ support

• DOM manipulation

• AJAX/Fetch requests

Proxy Network

• 100M+ residential IPs

• 200+ countries

• Automatic rotation

• High-speed datacenters

Anti-Detection

• Browser fingerprinting

• Bot detection bypass

• CAPTCHA handling

• Behavioral mimicking

Data Extraction

• Structured data parsing

• Link extraction

• Image processing

• Meta data analysis

Bulk Scraping

POST /v1/scrapeforge/bulk

Process up to 100 URLs simultaneously with intelligent load balancing and error handling.

{
  "urls": [
    "https://site1.com/page1",
    "https://site2.com/page2",
    "https://site3.com/page3"
  ],
  "render_js": true,
  "concurrent_requests": 3,
  "retry_failed": true,
  "max_retries": 2
}
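Because a bulk request accepts at most 100 URLs, longer URL lists have to be split across several requests. The following sketch shows one way to do that on the client. `build_bulk_payloads` is a hypothetical helper, and passing the shared options (such as `render_js` or `max_retries`) unchanged to every batch is an assumption about how callers would want to use it.

```python
def build_bulk_payloads(urls, batch_size=100, **options):
    """Hypothetical helper: split urls into bulk request bodies of at most batch_size URLs."""
    if not 1 <= batch_size <= 100:
        raise ValueError("batch_size must be between 1 and 100 (documented bulk limit)")
    return [
        {"urls": urls[start:start + batch_size], **options}
        for start in range(0, len(urls), batch_size)
    ]


# 250 URLs become three request bodies: 100 + 100 + 50.
all_urls = [f"https://example.com/page/{n}" for n in range(250)]
batches = build_bulk_payloads(all_urls, render_js=True, retry_failed=True, max_retries=2)
```

Each element of `batches` can then be sent as the JSON body of a separate bulk request.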

Status Codes

200 OK

Page scraped successfully

400 Bad Request

Invalid URL or parameters

403 Forbidden

Target site blocked the request

504 Gateway Timeout

Target site took too long to respond
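One way a client might react to these codes is sketched below. The retry policy is an assumption, not documented ScrapeForge behavior: only 504 is treated as retryable, with exponential backoff, while 400 and 403 are raised immediately. `scrape_with_retry` and its `send` callable (which must return a `(status_code, body)` tuple) are hypothetical names; injecting `send` keeps the sketch independent of any HTTP library.

```python
import time


def scrape_with_retry(send, payload, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Hypothetical helper: call send(payload) and retry only on 504 Gateway Timeout."""
    for attempt in range(1, max_attempts + 1):
        status, body = send(payload)
        if status == 200:
            return body
        if status == 400:
            raise ValueError("400 Bad Request: invalid URL or parameters")
        if status == 403:
            raise PermissionError("403 Forbidden: target site blocked the request")
        if status != 504 or attempt == max_attempts:
            raise RuntimeError(f"request failed with status {status}")
        # Assumed policy: exponential backoff between 504 retries (1s, 2s, 4s, ...).
        sleep(base_delay * 2 ** (attempt - 1))
```

With `requests`, `send` could be a small function that posts the payload and returns `(response.status_code, response.json())`.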

Common Use Cases

E-commerce Data

Extract product details, prices, and inventory

{ "url": "https://shop.example.com/product/123", "render_js": true, "wait_for": ".price", "extract_schema": true, "extract_images": true }

News Articles

Extract article content and metadata

{ "url": "https://news.example.com/article/123", "extract_meta": true, "extract_links": true, "proxy_type": "residential" }

Social Media

Scrape posts, comments, and user profiles

{ "url": "https://social.example.com/profile/user", "render_js": true, "wait_for": ".posts-container", "screenshot": true, "proxy_type": "residential" }

Real Estate

Extract property listings and details

{ "url": "https://realty.example.com/listing/123", "extract_schema": true, "extract_images": true, "extract_meta": true, "proxy_country": "US" }

Credit Consumption

Base Costs

Basic scraping: 3 credits

JavaScript rendering: +5 credits

Residential proxy: +2 credits

Screenshot capture: +3 credits

Extraction Features

Link extraction: +1 credit

Image extraction: +1 credit

Schema extraction: +2 credits

Meta data extraction: +1 credit
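The costs above can be combined into a rough per-request estimate. The sketch below assumes the surcharges are simply additive on top of the 3-credit base and that the residential surcharge applies when `proxy_type` is `"residential"`. `estimate_credits` is a hypothetical helper; the `credits_used` field in the response remains the authoritative figure.

```python
BASE_COST = 3  # basic scraping

# Surcharges for boolean request parameters, taken from the tables above.
FLAG_COSTS = {
    "render_js": 5,
    "screenshot": 3,
    "extract_links": 1,
    "extract_images": 1,
    "extract_schema": 2,
    "extract_meta": 1,
}
RESIDENTIAL_PROXY_COST = 2


def estimate_credits(payload):
    """Hypothetical helper: estimate credits for one request, assuming additive costs."""
    total = BASE_COST
    total += sum(cost for flag, cost in FLAG_COSTS.items() if payload.get(flag))
    if payload.get("proxy_type") == "residential":
        total += RESIDENTIAL_PROXY_COST
    return total


example = {
    "url": "https://example.com/products",
    "render_js": True,
    "proxy_type": "residential",
    "extract_links": True,
    "extract_images": False,
    "extract_schema": True,
    "extract_meta": True,
    "screenshot": False,
}
print(estimate_credits(example))  # 3 + 5 + 2 + 1 + 2 + 1 = 14
```

Under the same assumptions, a request with only `render_js`, `extract_schema`, and `extract_images` enabled comes to 3 + 5 + 2 + 1 = 11 credits.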