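Every error response is a JSON object containing at least one of two fields: error or detail. Some responses add further fields, such as valid, message, limit, or retry_after_seconds, as the examples below show.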
Error response shape
{
  "error": "Human-readable error message",
  "detail": "Additional context (FastAPI endpoints)"
}
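Because the message can arrive under different keys, a client may want a single helper that pulls out whatever text is available. The following is a minimal sketch, not part of any official SDK: the function name `error_message`, the key order, and the fallback string are assumptions made for illustration. It also skips non-string values, on the assumption that some endpoints may return structured data under detail.

```python
from typing import Any, Mapping


def error_message(body: Mapping[str, Any], fallback: str = "Request failed") -> str:
    """Return the first human-readable message found in an error body.

    The key order (detail, error, message) is an illustrative choice,
    not something the API guarantees.
    """
    for key in ("detail", "error", "message"):
        value = body.get(key)
        # Skip missing keys, empty strings, and non-string payloads.
        if isinstance(value, str) and value:
            return value
    return fallback
```

Calling `error_message(resp.json())` on a failed response then gives one string to log or display, whichever field the endpoint used.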
HTTP status codes
| Code | Meaning | Common causes |
|---|---|---|
| 400 | Bad Request | Missing required fields, invalid JSON, schema validation failure |
| 401 | Unauthorized | Missing Authorization header, invalid or revoked API key |
| 403 | Forbidden | Attempting to access another user’s resource; plan does not permit API keys |
| 404 | Not Found | Deployment ID does not exist for your account |
| 409 | Conflict | A deployment with that ID already exists |
| 429 | Too Many Requests | Rate limit exceeded for your plan tier |
| 500 | Internal Server Error | Unexpected server-side error |
| 503 | Service Unavailable | Model container is not yet active or is temporarily unreachable |
Error examples
401 — Invalid API key
{
  "valid": false,
  "error": "Invalid API key"
}
The key is missing, malformed, or has been revoked. Double-check the Authorization header.
403 — Plan upgrade required
{
  "error": "Upgrade required",
  "message": "API keys are available on Pro plans and above. Upgrade at /billing."
}
404 — Deployment not found
{
  "detail": "No deployment 'my-model' found for user"
}
The deployment_id either does not exist or belongs to a different user.
409 — Duplicate deployment
{
  "detail": "Deployment 'my-model' already exists"
}
429 — Rate limit exceeded
{
  "valid": false,
  "error": "Rate limit exceeded",
  "limit": 120,
  "retry_after_seconds": 34
}
Wait for the number of seconds given in retry_after_seconds before retrying. See also the Retry-After response header.
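Since the wait time can come from either the body or the header, a client has to pick one. The sketch below prefers retry_after_seconds from the body and falls back to the Retry-After header. The function name `rate_limit_wait`, its parameters, and the 5-second default are assumptions for illustration. Handling an HTTP-date in Retry-After follows the general HTTP definition of that header; the source does not say whether this API ever sends one.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Any, Mapping, Optional


def rate_limit_wait(
    body: Mapping[str, Any],
    headers: Mapping[str, str],
    default: float = 5.0,
    now: Optional[datetime] = None,
) -> float:
    """Return how many seconds to wait after a 429 response."""
    # Prefer the explicit field from the JSON body when it is usable.
    seconds = body.get("retry_after_seconds")
    if isinstance(seconds, (int, float)) and not isinstance(seconds, bool) and seconds >= 0:
        return float(seconds)

    # Otherwise fall back to the Retry-After header.
    raw = headers.get("Retry-After")
    if raw is None:
        return default
    raw = raw.strip()
    if raw.isdigit():
        return float(raw)

    # Retry-After may also be an HTTP-date; convert it to a delay from now.
    try:
        when = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())
```

With the requests library this could be called as `rate_limit_wait(resp.json(), resp.headers)`; its header mapping is case-insensitive, whereas a plain dict needs the exact key `Retry-After`.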
503 — Deployment not active
{
  "detail": "Deployment 'my-model' is not active (status: BUILDING)"
}
The model container is still starting up. Check the deployment status and retry once it is ACTIVE.
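The two 503 responses differ only in their detail text, so a client that wants to tell "still building" apart from "temporarily unreachable" has to inspect the message. The sketch below extracts the status name from the parenthesised suffix. It assumes the message keeps the "(status: NAME)" form shown in the example above, which is not a documented contract, and the function name `deployment_status_from_detail` is hypothetical.

```python
import re
from typing import Optional

# Assumes the "(status: NAME)" suffix seen in the example message;
# the API may change this wording without notice.
_STATUS_RE = re.compile(r"\(status:\s*([A-Z_]+)\)")


def deployment_status_from_detail(detail: str) -> Optional[str]:
    """Return the status name embedded in a 503 detail message, if any."""
    match = _STATUS_RE.search(detail)
    return match.group(1) if match else None
```

A return value of None means the message carried no status, as in the "not reachable" variant, where a short backoff is usually the better response than waiting for a build to finish.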
503 — Container unreachable
{
  "detail": "Deployment 'my-model' is not reachable. Retry shortly"
}
The container is registered as ACTIVE but did not respond. This usually resolves within seconds; retry with exponential backoff.
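To make the backoff concrete, the sketch below computes the delay schedule separately from the request loop, which makes it easy to inspect. The 1-second base, the 30-second cap, and the optional "full jitter" (a random delay between zero and the exponential ceiling) are illustrative choices that the source does not prescribe. The function name `backoff_delays` is hypothetical.

```python
import random
from typing import List, Optional


def backoff_delays(
    max_retries: int = 3,
    base: float = 1.0,
    cap: float = 30.0,
    jitter: bool = True,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one delay in seconds per retry attempt."""
    if rng is None:
        rng = random.Random()
    delays = []
    for attempt in range(max_retries):
        # The ceiling doubles with each attempt but never exceeds the cap.
        ceiling = min(cap, base * 2 ** attempt)
        # Full jitter spreads clients out so they do not retry in lockstep.
        delays.append(rng.uniform(0, ceiling) if jitter else ceiling)
    return delays
```

With `jitter=False` and the defaults, the ceilings are 1, 2, and 4 seconds. Jitter matters mainly when many clients hit the same deployment at once.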
Handling errors — recommended pattern
# Python (requires the requests package)
import os
import time

import requests


def infer_with_retry(deployment_id: str, inputs: dict, max_retries: int = 3) -> dict:
    url = "https://inference.impulselabs.ai/infer"
    headers = {
        "Authorization": f"Bearer {os.environ['IMPULSE_API_KEY']}",
        "Content-Type": "application/json",
    }
    payload = {"deployment_id": deployment_id, "inputs": inputs}

    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload)

        if resp.status_code == 200:
            return resp.json()

        if resp.status_code == 429:
            # Rate limited: wait as long as the Retry-After header asks (default 5s).
            retry_after = int(resp.headers.get("Retry-After", 5))
            print(f"Rate limited: waiting {retry_after}s")
            time.sleep(retry_after)
            continue

        if resp.status_code in (502, 503):
            # Transient failure: back off exponentially (1s, 2s, 4s, ...).
            wait = 2 ** attempt
            print(f"Service unavailable: retrying in {wait}s")
            time.sleep(wait)
            continue

        # Non-retryable error: raise requests.HTTPError for any other 4xx/5xx.
        resp.raise_for_status()

    raise RuntimeError(f"Failed after {max_retries} attempts")
// JavaScript (Node.js 18+ or any runtime with a global fetch)
async function inferWithRetry(deploymentId, inputs, maxRetries = 3) {
  const url = "https://inference.impulselabs.ai/infer";

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const resp = await fetch(url, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.IMPULSE_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ deployment_id: deploymentId, inputs }),
    });

    if (resp.ok) return resp.json();

    if (resp.status === 429) {
      // Rate limited: wait as long as the Retry-After header asks (default 5s).
      const retryAfter = parseInt(resp.headers.get("Retry-After") ?? "5", 10);
      await new Promise((r) => setTimeout(r, retryAfter * 1000));
      continue;
    }

    if (resp.status === 503 || resp.status === 502) {
      // Transient failure: back off exponentially (1s, 2s, 4s, ...).
      await new Promise((r) => setTimeout(r, 2 ** attempt * 1000));
      continue;
    }

    // Non-retryable error: surface the server's message and the status code.
    const body = await resp.json().catch(() => ({}));
    throw Object.assign(new Error(body.detail ?? body.error ?? "Request failed"), { status: resp.status });
  }

  throw new Error(`Failed after ${maxRetries} attempts`);
}