Building a Python SDK for a web API sounds straightforward until you hit the real questions. How do you handle auth? What about retries? Should you go async? How do you test against a live API without burning credits? This FAQ covers the decisions that matter when you're shipping an SDK that developers actually want to use.
Key Takeaways
- A good Python SDK wraps HTTP complexity behind a clean, typed interface with sensible defaults
- Async support is expected in 2025 — design for it from day one, even if you ship sync first
- Retry logic, rate limit handling, and clear error types separate production SDKs from weekend projects
- Pydantic models for request/response validation catch bugs before they hit the wire
- Publishing to PyPI with CI/CD is table stakes — automate it from the start
What project structure should a Python SDK use?
Keep it flat and discoverable. Developers importing your package should never have to guess where things live:
searchhive-python/
├── src/
│   └── searchhive/
│       ├── __init__.py   # Public API surface
│       ├── client.py     # Main client class
│       ├── models.py     # Pydantic request/response models
│       ├── errors.py     # Custom exception hierarchy
│       ├── config.py     # Client configuration
│       └── _types.py     # Type aliases, protocols
├── tests/
│   ├── conftest.py
│   ├── test_client.py
│   └── test_models.py
├── pyproject.toml
└── README.md
The src/ layout prevents accidental imports from the working directory during development. Pydantic models in models.py give you validation and serialization for free — don't skip them.
How should authentication work?
Support multiple auth methods with a single entry point. Most APIs use one of: API keys (header or query param), Bearer tokens, or OAuth2. Your client should accept the most common method as a simple kwarg and expose the rest through an auth object:
from searchhive import SearchHiveClient
# Simple — API key as string
client = SearchHiveClient(api_key="sh_live_abc123")
# Explicit — auth object for more control
from searchhive import BearerAuth
client = SearchHiveClient(auth=BearerAuth(token="sh_live_abc123"))
Never hardcode credentials. Read from environment variables as a fallback — SEARCHHIVE_API_KEY is the convention — but let the explicit kwarg override it. This makes testing easy and lets developers manage secrets their way.
Should the SDK be async or sync?
Both. Use httpx instead of requests — it supports sync and async with the same interface. Ship an AsyncSearchHiveClient alongside SearchHiveClient:
# Sync
from searchhive import SearchHiveClient

with SearchHiveClient(api_key="...") as client:
    results = client.search("web scraping tools")

# Async
from searchhive import AsyncSearchHiveClient

async with AsyncSearchHiveClient(api_key="...") as client:
    results = await client.search("web scraping tools")
If you're short on time, ship sync first. But structure your internals so adding async later doesn't require a rewrite. The httpx approach means your transport layer is identical — only the calling convention changes.
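One way to keep that rewrite off the table, sketched with hypothetical names: put request construction in a shared base class as pure functions, so the sync and async clients differ only in how they send. The placeholder `return request` stands in for the actual httpx call:

```python
class _BaseClient:
    """Shared internals; request building does no I/O, so both clients reuse it."""

    BASE_URL = "https://api.searchhive.dev/v1"  # assumed base URL

    def __init__(self, api_key: str):
        self._api_key = api_key

    def _build_request(self, method: str, path: str, **params) -> dict:
        # Pure data in, pure data out: identical for sync and async paths.
        return {
            "method": method,
            "url": f"{self.BASE_URL}{path}",
            "headers": {"Authorization": f"Bearer {self._api_key}"},
            "params": params,
        }


class SearchHiveClient(_BaseClient):
    def search(self, query: str) -> dict:
        request = self._build_request("GET", "/search", q=query)
        # Real code: with httpx.Client() as c: return c.request(**request).json()
        return request


class AsyncSearchHiveClient(_BaseClient):
    async def search(self, query: str) -> dict:
        request = self._build_request("GET", "/search", q=query)
        # Real code: async with httpx.AsyncClient() as c: ...
        return request
```

Everything above the transport is sync/async-agnostic, so adding the second client later is mechanical.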
How do you handle rate limiting properly?
Three things: respect the headers, implement backoff, and expose the state.
Most APIs return rate limit info in response headers (X-RateLimit-Remaining, X-RateLimit-Reset). Parse those. When you're close to the limit, either wait proactively or queue the request:
import time
import httpx
def _handle_rate_limit(response: httpx.Response) -> None:
    remaining = response.headers.get("X-RateLimit-Remaining")
    reset_at = response.headers.get("X-RateLimit-Reset")
    if remaining and int(remaining) <= 1:
        wait_seconds = max(0, float(reset_at) - time.time()) if reset_at else 60
        time.sleep(wait_seconds)
For a more robust approach, use a token bucket algorithm. The aiolimiter package handles this well for async code. The key insight: don't just react to 429 errors — avoid them by throttling proactively.
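To make the token bucket idea concrete, here is a minimal synchronous sketch. This is our own illustration, not aiolimiter's API (aiolimiter provides the async equivalent):

```python
import time

class TokenBucket:
    """Proactive throttle: refuse requests once the bucket is empty."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def try_acquire(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(
            self.capacity, self.tokens + (now - self.updated) * self.rate
        )
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller that gets `False` can sleep until the next refill instead of firing a request that would come back as a 429.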
What's the right way to do retries?
Exponential backoff with jitter, and only retry on transient errors. Never retry a 400 or 401 — those are your problem, not the network's:
import httpx
from tenacity import retry, stop_after_attempt, wait_exponential_jitter, retry_if_exception

def is_retryable(exc: Exception) -> bool:
    if isinstance(exc, httpx.HTTPStatusError):
        return exc.response.status_code in (429, 500, 502, 503, 504)
    return isinstance(exc, (httpx.ConnectError, httpx.ReadTimeout))

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential_jitter(initial=0.5, max=10),
    retry=retry_if_exception(is_retryable),
)
def _request(self, method: str, path: str, **kwargs):
    response = self._client.request(method, path, **kwargs)
    response.raise_for_status()
    return response.json()
The tenacity library handles the retry logic cleanly. Jitter prevents the thundering herd problem when multiple clients retry simultaneously after a server blip.
How should errors be structured?
A hierarchy of exceptions that lets callers catch at the right level of specificity:
SearchHiveError              # Base — catch everything
├── AuthenticationError      # 401, 403
├── RateLimitError           # 429 (includes retry_after)
├── ValidationError          # 400, 422
├── NotFoundError            # 404
└── ServerError              # 5xx
Always include the HTTP status code, the original response body, and any request ID the API returns. When a developer files a bug, they should be able to paste the exception and give you everything you need:
class RateLimitError(SearchHiveError):
    def __init__(self, message: str, retry_after: int, request_id: str):
        super().__init__(message)
        self.retry_after = retry_after
        self.request_id = request_id

    def __str__(self):
        return f"{self.args[0]} (retry_after={self.retry_after}s, request_id={self.request_id})"
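Routing HTTP statuses into that hierarchy can be a single lookup. A sketch, assuming the exception names from the tree above and simplified constructors:

```python
class SearchHiveError(Exception):
    """Base class for all SDK errors."""

class AuthenticationError(SearchHiveError): pass
class RateLimitError(SearchHiveError): pass
class ValidationError(SearchHiveError): pass
class NotFoundError(SearchHiveError): pass
class ServerError(SearchHiveError): pass

# Most specific exception class for each well-known status code.
_STATUS_TO_ERROR = {
    400: ValidationError,
    401: AuthenticationError,
    403: AuthenticationError,
    404: NotFoundError,
    422: ValidationError,
    429: RateLimitError,
}

def error_for_status(status: int, message: str = "") -> SearchHiveError:
    """Map an HTTP status to the narrowest matching exception."""
    if status >= 500:
        return ServerError(message)
    return _STATUS_TO_ERROR.get(status, SearchHiveError)(message)
```

Because every class inherits from SearchHiveError, callers can catch broadly or narrowly, and unknown 4xx codes still surface as the base type rather than a raw httpx error.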
How do you test an SDK without hitting the live API?
Use respx (for httpx) or responses (for requests) to mock at the HTTP layer. Test your client logic, not the API's behavior:
import httpx
import respx

from searchhive import SearchHiveClient

@respx.mock
def test_search_returns_results():
    respx.get("https://api.searchhive.dev/v1/search").mock(
        return_value=httpx.Response(200, json={"results": [{"title": "Test"}]})
    )
    client = SearchHiveClient(api_key="test_key")
    results = client.search("test query")
    assert len(results) == 1
Record real API responses with vcrpy or pytest-recording for integration tests, but keep your unit test suite fast and offline. CI shouldn't depend on external services.
Should I use Pydantic for request/response models?
Yes. Pydantic gives you type validation, serialization, and IDE autocompletion with zero boilerplate. Define your models once, use them everywhere:
from pydantic import BaseModel, Field
class SearchRequest(BaseModel):
    query: str = Field(..., min_length=1, max_length=500)
    num_results: int = Field(default=10, ge=1, le=100)
    language: str | None = None

class SearchResult(BaseModel):
    title: str
    url: str
    snippet: str
    published_date: str | None = None

class SearchResponse(BaseModel):
    results: list[SearchResult]
    total: int
    page: int
This catches invalid inputs before the HTTP call and gives consumers typed access to responses. The Field constraints act as a first line of validation — no need to write manual checks.
How do you version the SDK alongside the API?
Use semantic versioning for the SDK package independently of the API version. The API version goes in the URL path (/v1/search); the SDK version reflects changes to the wrapper itself:
- Patch (1.0.1): Bug fixes, no API changes
- Minor (1.1.0): New endpoints, new optional parameters, backward compatible
- Major (2.0.0): Breaking changes to the Python API
Pin the API version in your base URL and only bump it when the API team ships a new major version. Expose an api_version parameter on the client for advanced users who want to opt into preview endpoints.
What about pagination?
Provide a clean iterator interface. Don't make callers manage page tokens manually:
# Instead of manual pagination
page = client.search("test", page=1)
while page.has_more:
    page = client.search("test", page=page.next_page)

# Provide an iterator
for result in client.search_iter("test", limit=100):
    process(result)
Implement __iter__ and __aiter__ on a dedicated paginator class. This handles cursor-based, offset-based, and token-based pagination under the hood while giving callers a simple loop. SearchHive's Python SDK uses this pattern for SwiftSearch results.
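A cursor-based version of such a paginator might look like the sketch below; fetch_page stands in for the SDK's real HTTP call, and its signature is an assumption for illustration:

```python
from typing import Callable, Iterator, Optional

# fetch_page(cursor) returns (items, next_cursor); next_cursor is None
# on the last page.
FetchPage = Callable[[Optional[str]], tuple[list, Optional[str]]]

class Paginator:
    def __init__(self, fetch_page: FetchPage, limit: Optional[int] = None):
        self._fetch_page = fetch_page
        self._limit = limit  # overall cap across pages, not per page

    def __iter__(self) -> Iterator:
        cursor = None
        yielded = 0
        while True:
            items, cursor = self._fetch_page(cursor)
            for item in items:
                if self._limit is not None and yielded >= self._limit:
                    return
                yield item
                yielded += 1
            if cursor is None:  # last page reached
                return
```

Callers just write `for result in paginator:`; an async twin implements __aiter__ the same way, awaiting fetch_page instead of calling it.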
How do you publish to PyPI properly?
Automate everything through GitHub Actions:
- Tag a release (git tag v1.2.0)
- CI builds the package, runs tests, publishes to PyPI
- Trusted publishing (OIDC) — no API tokens to manage
# pyproject.toml
[project]
name = "searchhive-python"
version = "1.2.0"
requires-python = ">=3.10"
dependencies = ["httpx>=0.25", "pydantic>=2.0"]
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
Use hatchling or setuptools — both work. The important thing is the CI pipeline: tests pass → tag → publish. No manual steps.
Building a Python SDK isn't just wrapping HTTP calls in a class. It's about making the API feel like it was written natively in Python — typed, documented, forgiving on transient failures, and strict on input validation. The patterns above are what separate an SDK developers reach for from one they tolerate.
SearchHive's Python SDK implements these patterns out of the box — async/sync clients, Pydantic models, rate limit awareness, and clean error types. Install it with pip install searchhive-python and check the docs to get started in under five minutes.