API Webhooks Design: Common Questions Answered
Webhooks are the backbone of modern API integrations -- they let your system react to events in real time instead of polling endlessly. This FAQ covers the most common questions about designing, implementing, and maintaining webhook systems.
Common Questions
What is a webhook and how does it differ from polling?
A webhook is an HTTP callback that fires when a specific event occurs. Instead of your client repeatedly asking "did anything happen?" (polling), the server pushes a notification to your endpoint when an event triggers.
Polling:
Client -> Server: "Any updates?" (every 30 seconds)
Server -> Client: "No."
Client -> Server: "Any updates?" (30 seconds later)
Server -> Client: "Yes! Here's the data."
Webhook:
Client -> Server: "Send updates to https://yourapp.com/webhook"
(Server waits for event to occur)
Server -> Client: POST https://yourapp.com/webhook {"event": "completed", "data": {...}}
Webhooks are usually more efficient: no wasted requests, near-immediate delivery, and less load on both sides.
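To put a rough number on the wasted requests, here is a back-of-the-envelope sketch. The 30-second interval comes from the polling example above; the workload of 10 events per day is an assumption picked for illustration, not a measurement.

```python
# Rough comparison of request volume: polling vs. webhooks.
# The events_per_day value used below is an assumed workload.
SECONDS_PER_DAY = 24 * 60 * 60


def polling_requests_per_day(interval_seconds: int) -> int:
    """Number of poll requests one client sends per day at a fixed interval."""
    return SECONDS_PER_DAY // interval_seconds


def wasted_polls_per_day(interval_seconds: int, events_per_day: int) -> int:
    """Polls that come back empty, assuming each event is picked up by one poll."""
    return max(polling_requests_per_day(interval_seconds) - events_per_day, 0)


polls = polling_requests_per_day(30)
wasted = wasted_polls_per_day(30, events_per_day=10)
print(f"{polls} polls per day, {wasted} of them empty; a webhook sender would make 10 requests")
```

Under these assumptions almost every poll is empty, while the webhook sender makes one request per event.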
How do I design the webhook payload?
Keep payloads self-contained. The receiver should be able to process the event without making additional API calls:
# Good webhook payload (the signature travels in the X-Webhook-Signature header, not in the body)
{
  "event_id": "evt_abc123",
  "event_type": "scrape.completed",
  "timestamp": "2026-04-20T10:30:00Z",
  "data": {
    "job_id": "job_xyz789",
    "url": "https://example.com",
    "status": "completed",
    "results_count": 42,
    "credits_used": 18
  }
}
Key design principles:
- Include an event_id for deduplication
- Include an event_type so one endpoint handles multiple events
- Include a timestamp for debugging and ordering
- Make data self-contained -- the receiver shouldn't need to call back for details
- Always sign the payload for security
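On the sending side, these principles can be sketched as a small envelope builder. The function names `build_event` and `serialize_event` are hypothetical, and the `evt_` prefix simply mirrors the example payload; treat this as one possible shape rather than a required one.

```python
import json
import uuid
from datetime import datetime, timezone


def build_event(event_type: str, data: dict) -> dict:
    """Wrap event data in an envelope with event_id, event_type, timestamp and data."""
    return {
        # Unique per event so receivers can deduplicate redeliveries.
        "event_id": f"evt_{uuid.uuid4().hex}",
        "event_type": event_type,
        # UTC, ISO 8601, second precision, e.g. 2026-04-20T10:30:00Z
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "data": data,
    }


def serialize_event(event: dict) -> bytes:
    """Serialize once; the signature should be computed over exactly these bytes."""
    return json.dumps(event, separators=(",", ":"), sort_keys=True).encode("utf-8")


event = build_event("scrape.completed", {"job_id": "job_xyz789", "results_count": 42})
body = serialize_event(event)
```

Serializing once matters because an HMAC covers bytes, not objects: if the body is re-encoded after signing, key order or whitespace can change and the receiver's check will fail.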
How do I secure webhooks?
Three layers of security are standard:
1. HTTPS only. Never accept webhook deliveries over plain HTTP. TLS is non-negotiable.
2. HMAC signatures. Sign every payload so the receiver can verify authenticity:
import hmac
import hashlib

def verify_signature(payload: bytes, signature: str, secret: str) -> bool:
    expected = hmac.new(
        secret.encode(), payload, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature)
3. IP allowlisting. Only accept requests from known IP ranges. SearchHive, for example, publishes its webhook IP ranges in its docs.
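The allowlisting layer can be sketched with the standard library's `ipaddress` module. The networks below are reserved documentation ranges used as placeholders, and `is_allowed_ip` is a hypothetical helper name; substitute the ranges your provider actually publishes.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT real provider ranges.
# Replace them with the ranges your webhook provider publishes.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed_ip(remote_addr: str) -> bool:
    """Return True if remote_addr parses as an IP inside an allowlisted network."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a valid IP address at all: reject.
        return False
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixing both families in one list is safe.
    return any(ip in network for network in ALLOWED_NETWORKS)
```

If your receiver sits behind a reverse proxy or load balancer, the socket address is the proxy's, so the check has to use whatever forwarded-address header your own proxy sets. Allowlisting complements signature verification; it does not replace it.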
What happens if a webhook delivery fails?
Implement an automatic retry mechanism with exponential backoff:
import hashlib
import hmac
import json
import time

import httpx

def deliver_webhook(url: str, payload: dict, secret: str, max_retries: int = 5) -> bool:
    # Serialize once so the signed bytes are exactly the bytes sent
    payload_bytes = json.dumps(payload).encode()
    signature = hmac.new(
        secret.encode(), payload_bytes, hashlib.sha256
    ).hexdigest()

    for attempt in range(max_retries):
        try:
            response = httpx.post(
                url,
                content=payload_bytes,
                headers={
                    "Content-Type": "application/json",
                    "X-Webhook-Signature": f"sha256={signature}"
                },
                timeout=10
            )
            # Any 2xx (200 OK, 202 Accepted, ...) counts as delivered
            if 200 <= response.status_code < 300:
                return True
            print(f"Attempt {attempt + 1} failed: {response.status_code}")
            # 4xx means the receiver rejected the request; retrying won't help
            if 400 <= response.status_code < 500:
                return False
        except (httpx.TimeoutException, httpx.ConnectError) as e:
            print(f"Attempt {attempt + 1} error: {e}")

        # Exponential backoff between attempts: 1s, 2s, 4s, 8s
        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)

    return False
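The backoff schedule itself is easy to isolate and inspect. The sketch below computes the delays between attempts and adds two refinements that are not part of the code above: a cap on the maximum delay and random jitter. The function names and the 60-second cap are assumptions.

```python
import random


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Delays slept between attempts: base * 2**n, capped.

    There is one fewer delay than attempts, because nothing is slept
    after the final attempt.
    """
    return [min(cap, base * 2 ** n) for n in range(max_retries - 1)]


def with_full_jitter(delay: float) -> float:
    """Pick a random delay in [0, delay] so many senders don't retry in lockstep."""
    return random.uniform(0, delay)


schedule = backoff_delays(5)  # [1.0, 2.0, 4.0, 8.0]
```

Jitter matters mostly when a receiver outage causes many deliveries to fail at once; without it, they all retry at the same moments.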
Should I use a single endpoint or multiple?
Start with a single endpoint that routes by event_type. This simplifies infrastructure and configuration:
from fastapi import FastAPI, Request, HTTPException
import hmac
import hashlib
import json

app = FastAPI()
WEBHOOK_SECRET = "your-webhook-secret"  # placeholder; load the real secret from configuration

# verify_signature() is the function from the security section

@app.post("/webhooks")
async def handle_webhook(request: Request):
    # Verify the signature against the raw request body
    payload = await request.body()
    signature = request.headers.get("X-Webhook-Signature", "")
    if not verify_signature(payload, signature, WEBHOOK_SECRET):
        raise HTTPException(status_code=401, detail="Invalid signature")

    data = json.loads(payload)
    event_type = data.get("event_type")

    # Route by event type
    if event_type == "scrape.completed":
        await handle_scrape_completed(data["data"])
    elif event_type == "search.completed":
        await handle_search_completed(data["data"])
    elif event_type == "deepdive.completed":
        await handle_deepdive_completed(data["data"])
    else:
        # Acknowledge unknown types so the sender doesn't keep retrying
        print(f"Unknown event type: {event_type}")

    return {"status": "ok"}

async def handle_scrape_completed(data: dict):
    job_id = data["job_id"]
    print(f"Scrape {job_id} completed with {data['results_count']} results")

async def handle_search_completed(data: dict):
    query = data["query"]
    print(f"Search for '{query}' returned {data['results_count']} results")

async def handle_deepdive_completed(data: dict):
    # Minimal placeholder handler
    print(f"Deep dive completed: {data}")
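As the number of event types grows, the if/elif chain can be replaced with a lookup table. The following is a minimal sketch of that idea, independent of any web framework; the `on` decorator, `HANDLERS` registry and `dispatch` function are hypothetical names, and the handlers return strings only so the behaviour is easy to check.

```python
import asyncio
from typing import Awaitable, Callable

Handler = Callable[[dict], Awaitable[str]]
HANDLERS: dict[str, Handler] = {}


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one event type."""
    def register(func: Handler) -> Handler:
        HANDLERS[event_type] = func
        return func
    return register


@on("scrape.completed")
async def handle_scrape_completed(data: dict) -> str:
    return f"scrape {data['job_id']} done"


@on("search.completed")
async def handle_search_completed(data: dict) -> str:
    return f"search '{data['query']}' done"


async def dispatch(event: dict) -> str:
    """Look up the handler by event_type; unknown types are acknowledged, not failed."""
    handler = HANDLERS.get(event.get("event_type", ""))
    if handler is None:
        return "ignored"
    return await handler(event["data"])


result = asyncio.run(
    dispatch({"event_type": "scrape.completed", "data": {"job_id": "job_xyz789"}})
)
```

Adding a new event type then means adding one decorated function, with no change to the endpoint itself.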
How do I handle duplicate webhook deliveries?
Webhooks can be delivered more than once. Always deduplicate using the event_id:
import sqlite3
from typing import Callable

class WebhookProcessor:
    def __init__(self, db_path="webhooks.db"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS processed_events (event_id TEXT PRIMARY KEY, processed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)"
        )

    def process(self, event_id: str, handler: Callable[[], None]):
        # Check if already processed
        existing = self.conn.execute(
            "SELECT 1 FROM processed_events WHERE event_id=?", (event_id,)
        ).fetchone()
        if existing:
            return {"status": "duplicate"}

        # Process the event, then record it so redeliveries are skipped
        handler()
        self.conn.execute(
            "INSERT INTO processed_events (event_id) VALUES (?)", (event_id,)
        )
        self.conn.commit()
        return {"status": "processed"}
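A check followed by a separate insert leaves a small window in which two concurrent deliveries of the same event can both pass the check. One way to narrow that window, sketched below under the assumption that the handler's work is safe to run inside a database transaction, is to insert first and let the primary key reject the duplicate. `process_once` is a hypothetical helper, and the in-memory database is only for demonstration.

```python
import sqlite3
from typing import Callable


def process_once(conn: sqlite3.Connection, event_id: str, handler: Callable[[], None]) -> str:
    """Run handler at most once per event_id, using the primary key as the guard."""
    try:
        # `with conn` commits on success and rolls back if anything raises,
        # so a failed handler leaves no row behind and a later retry can run.
        with conn:
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)", (event_id,)
            )
            handler()
    except sqlite3.IntegrityError:
        # The event_id already exists: this delivery is a duplicate.
        return "duplicate"
    return "processed"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")

calls = []
first = process_once(conn, "evt_abc123", lambda: calls.append("handled"))
second = process_once(conn, "evt_abc123", lambda: calls.append("handled"))
```

Here `first` is "processed", `second` is "duplicate", and the handler ran once. If the handler raises, the insert is rolled back and the exception propagates, so the sender's retry gets another chance.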
What HTTP status codes should my webhook endpoint return?
- 200 OK -- Successfully received and processed
- 202 Accepted -- Received but will process asynchronously
- 4xx -- Something wrong with the request (bad signature, malformed payload) -- do NOT retry
- 5xx -- Your server had an error -- the sender SHOULD retry
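Return a 2xx response (200, or 202 if you process asynchronously) as soon as you have received and verified the event, even if you haven't fully processed it yet. That confirms delivery to the sender; your downstream processing can then happen asynchronously. If downstream processing fails, use your own retry/queue system rather than relying on the sender to redeliver.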
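Seen from the sender's side, the rules above reduce to a small classification function. This is a sketch of the simple rule stated in this FAQ; `classify_response` is a hypothetical name, and some senders choose to treat certain 4xx codes (such as 408 or 429) as retryable, which this version does not.

```python
def classify_response(status_code: int) -> str:
    """Map a receiver's status code to 'delivered', 'drop' or 'retry'."""
    if 200 <= status_code < 300:
        return "delivered"  # 200 OK, 202 Accepted, ...
    if 400 <= status_code < 500:
        return "drop"       # the request itself is wrong; retrying will not help
    return "retry"          # 5xx and anything else unexpected


outcomes = {code: classify_response(code) for code in (200, 202, 401, 500, 503)}
```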
How do I set up webhooks with SearchHive?
SearchHive supports custom webhooks on paid plans (Builder and above). Configure them in your dashboard:
# When a scrape job completes, SearchHive sends:
# POST https://yourapp.com/webhooks/searchhive
# Headers: X-Webhook-Signature: sha256=...
# Body: {"event_id": "...", "event_type": "scrape.completed", "data": {...}}
This is useful for long-running jobs. Instead of polling the API to check if a scrape is done, configure a webhook and get notified instantly when results are ready.
How do I test webhooks during development?
Use ngrok or a similar tunneling service to expose your local server to the internet:
# Start your webhook receiver
uvicorn webhook_server:app --port 8000
# In another terminal, expose it
ngrok http 8000
# ngrok gives you a public URL like https://abc123.ngrok.io
Register the ngrok URL as your webhook endpoint. Test deliveries will hit your local machine in real time.
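Before involving a real sender, it can help to fire a signed test event at your receiver yourself. The sketch below uses only the standard library; `build_test_delivery` and `send_test_delivery` are hypothetical helpers, and the header name and `sha256=` prefix follow the signing scheme used elsewhere in this FAQ.

```python
import hashlib
import hmac
import json
import urllib.request


def build_test_delivery(payload: dict, secret: str) -> tuple[bytes, dict]:
    """Return the request body and headers for a signed test webhook."""
    body = json.dumps(payload).encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    headers = {
        "Content-Type": "application/json",
        "X-Webhook-Signature": f"sha256={digest}",
    }
    return body, headers


def send_test_delivery(url: str, payload: dict, secret: str) -> int:
    """POST a signed test event and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    body, headers = build_test_delivery(payload, secret)
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


test_event = {
    "event_id": "evt_test_1",
    "event_type": "scrape.completed",
    "data": {"job_id": "job_test", "results_count": 0},
}
body, headers = build_test_delivery(test_event, "your-webhook-secret")
```

With the receiver running locally, calling `send_test_delivery("http://localhost:8000/webhooks", test_event, "your-webhook-secret")` should return 200 if the secret matches, and sending the same event twice is a quick way to exercise your deduplication.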
What about webhook ordering and timing?
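Do not rely on delivery order. Some senders deliver in order within a single event type, but ordering is generally not guaranteed across different types, and a retried delivery can arrive after newer events. If you need events handled strictly one at a time, in the order they arrive, process them from a queue: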
import asyncio
from collections import deque

class EventQueue:
    def __init__(self):
        self.queue = deque()
        self.processing = False
        self._worker = None  # keep a reference so the task isn't garbage-collected

    async def enqueue(self, event: dict):
        self.queue.append(event)
        if not self.processing:
            # Set the flag before scheduling so a second enqueue can't start another worker
            self.processing = True
            self._worker = asyncio.create_task(self.process_all())

    async def process_all(self):
        try:
            while self.queue:
                event = self.queue.popleft()
                await self.handle(event)
        finally:
            # Reset even if a handler raises, so later events can restart processing
            self.processing = False

    async def handle(self, event: dict):
        event_type = event["event_type"]
        print(f"Processing {event_type}: {event['event_id']}")
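A FIFO queue preserves arrival order, but arrival order is not always event order, for example when an older event is retried after a newer one was delivered. One complementary approach, sketched here as an assumption rather than a prescribed method, is to compare the payload's timestamp against the newest one already applied for the same resource and skip anything older. `LatestEventFilter` is a hypothetical class, and keying by `job_id` is an illustrative choice.

```python
from datetime import datetime


class LatestEventFilter:
    """Tracks the newest timestamp seen per key and rejects older events."""

    def __init__(self) -> None:
        self._latest: dict[str, datetime] = {}

    def should_apply(self, key: str, timestamp: str) -> bool:
        # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
        # so normalize it to an explicit UTC offset first.
        seen_at = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        previous = self._latest.get(key)
        if previous is not None and seen_at <= previous:
            return False  # stale or repeated: a newer event was already applied
        self._latest[key] = seen_at
        return True


events = LatestEventFilter()
in_order = events.should_apply("job_xyz789", "2026-04-20T10:30:00Z")
late_arrival = events.should_apply("job_xyz789", "2026-04-20T10:29:00Z")
```

Here `in_order` is True and `late_arrival` is False. This only works when a later event fully supersedes an earlier one for the same key; if every event must be applied, buffer and sort instead of dropping.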
Summary
Webhooks are straightforward to design but require attention to security, reliability, and idempotency. The key principles:
- Sign every payload with HMAC
- Deduplicate by event_id -- deliveries aren't guaranteed to be unique
- Return 200 quickly -- process asynchronously
- Self-contained payloads -- receivers shouldn't need callbacks
- Retry with backoff -- but stop retrying on 4xx errors
For search and scraping webhooks specifically, SearchHive's webhook system on the Builder plan ($49/month) sends real-time notifications when jobs complete, so your pipeline can react instantly without polling. Start with the free tier (500 credits, no card) and upgrade when you need webhooks and higher limits.