All error responses return JSON with at least an error or detail field.

Error response shape

{
  "error": "Human-readable error message",
  "detail": "Additional context (FastAPI endpoints)"
}
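
Because either field may be present, client code usually needs one place that picks a message out of the body. The helper below is a minimal sketch of that idea, not part of any official SDK: the name `error_message`, the lookup order (`detail` before `error`), and the default text are assumptions made for illustration.

```python
import json
from typing import Any, Mapping


def error_message(body: Mapping[str, Any], default: str = "Request failed") -> str:
    """Return a human-readable message from a parsed error body.

    Hypothetical helper: the field names come from the response shape above,
    but the lookup order is an assumption of this sketch.
    """
    for key in ("detail", "error"):
        value = body.get(key)
        if value is None or value == "":
            continue
        # A field may hold structured data rather than a plain string;
        # serialize it so the caller always gets text back.
        return value if isinstance(value, str) else json.dumps(value)
    return default


print(error_message({"valid": False, "error": "Invalid API key"}))
```

Passing an empty dict returns the default string, which keeps the caller's error path simple when the body could not be parsed as JSON.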

HTTP status codes

Code  Meaning                Common causes
400   Bad Request            Missing required fields, invalid JSON, schema validation failure
401   Unauthorized           Missing Authorization header, invalid or revoked API key
403   Forbidden              Attempting to access another user’s resource; plan does not permit API keys
404   Not Found              Deployment ID does not exist for your account
409   Conflict               A deployment with that ID already exists
429   Too Many Requests      Rate limit exceeded for your plan tier
500   Internal Server Error  Unexpected server-side error
503   Service Unavailable    Model container is not yet active or is temporarily unreachable
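
For a client, these codes fall into roughly three groups: wait for the rate limit to reset, retry with backoff, or give up and surface the error. The sketch below encodes one possible grouping. The function name and the returned labels are hypothetical, and treating 502 as retryable follows the retry examples later on this page rather than the table, which does not list it.

```python
def retry_strategy(status_code: int) -> str:
    """Classify an error status code into a client-side action.

    Hypothetical helper; the labels "wait", "backoff" and "fail" are
    assumptions of this sketch, not values returned by the API.
    """
    if status_code == 429:
        # Rate limited: wait for the server-provided delay, then retry.
        return "wait"
    if status_code in (502, 503):
        # Container not ready or unreachable: retry with exponential backoff.
        return "backoff"
    # Everything else (400, 401, 403, 404, 409, 500, ...) is not retried here.
    return "fail"


for code in (401, 429, 503):
    print(code, retry_strategy(code))
```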

Error examples

401 — Invalid API key

{
  "valid": false,
  "error": "Invalid API key"
}
The key is missing, malformed, or has been revoked. Double-check the Authorization header.

403 — Plan upgrade required

{
  "error": "Upgrade required",
  "message": "API keys are available on Pro plans and above. Upgrade at /billing."
}

404 — Deployment not found

{
  "detail": "No deployment 'my-model' found for user"
}
The deployment_id either does not exist or belongs to a different user.

409 — Duplicate deployment

{
  "detail": "Deployment 'my-model' already exists"
}

429 — Rate limit exceeded

{
  "valid": false,
  "error": "Rate limit exceeded",
  "limit": 120,
  "retry_after_seconds": 34
}
Wait for the number of seconds given in retry_after_seconds before retrying. See also the Retry-After response header.
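
The delay can come from either the JSON body or the header, so a client may want to check both. The following is a minimal sketch under stated assumptions: `rate_limit_delay` is a hypothetical name, the 5-second default is arbitrary, and only the integer-seconds form of Retry-After is handled (the HTTP-date form falls through to the default).

```python
from typing import Any, Mapping


def rate_limit_delay(
    body: Mapping[str, Any],
    headers: Mapping[str, str],
    default: float = 5.0,
) -> float:
    """Return how many seconds to wait after a 429 response."""
    # Prefer the retry_after_seconds field shown in the example body above.
    value = body.get("retry_after_seconds")
    if isinstance(value, (int, float)) and not isinstance(value, bool) and value >= 0:
        return float(value)

    # Fall back to the Retry-After header. A plain dict lookup is
    # case-sensitive; HTTP client header objects are usually case-insensitive.
    header = headers.get("Retry-After", "").strip()
    if header.isdecimal():
        return float(header)

    # Neither source was usable (missing, or an HTTP-date header).
    return default


print(rate_limit_delay({"error": "Rate limit exceeded", "retry_after_seconds": 34}, {}))
```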

503 — Deployment not active

{
  "detail": "Deployment 'my-model' is not active (status: BUILDING)"
}
The model container is still starting up. Check the deployment status and retry once it is ACTIVE.
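
Waiting for a deployment to become ACTIVE can be sketched as a polling loop. This page does not describe how to fetch the status, so the example takes a caller-supplied `get_status` function; that function, the helper name `wait_until_active`, and the timeout and interval defaults are all assumptions made for illustration.

```python
import time
from typing import Callable


def wait_until_active(
    get_status: Callable[[], str],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it returns "ACTIVE" or the timeout elapses.

    get_status is a hypothetical callable supplied by the caller; how the
    status is actually fetched is outside the scope of this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == "ACTIVE":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Simulated status sequence, with sleeping disabled for the demonstration.
statuses = iter(["BUILDING", "BUILDING", "ACTIVE"])
print(wait_until_active(lambda: next(statuses), sleep=lambda seconds: None))
```

The `sleep` parameter exists only so the loop can be exercised without real delays; production code can leave it at the default.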

503 — Container unreachable

{
  "detail": "Deployment 'my-model' is not reachable. Retry shortly"
}
The container is registered as ACTIVE but did not respond. This usually resolves within seconds; retry with exponential backoff.
Python
import os
import time

import requests

def infer_with_retry(deployment_id: str, inputs: dict, max_retries: int = 3) -> dict:
    url = "https://inference.impulselabs.ai/infer"
    headers = {
        "Authorization": f"Bearer {os.environ['IMPULSE_API_KEY']}",
        "Content-Type": "application/json",
    }
    payload = {"deployment_id": deployment_id, "inputs": inputs}

    for attempt in range(max_retries):
        # requests has no default timeout; set one so a hung connection cannot block forever.
        resp = requests.post(url, headers=headers, json=payload, timeout=30)

        if resp.status_code == 200:
            return resp.json()

        if resp.status_code == 429:
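            # Retry-After can also be an HTTP-date; fall back to 5s unless it is a plain integer.
            header = resp.headers.get("Retry-After", "")
            retry_after = int(header) if header.isdigit() else 5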
            print(f"Rate limited — waiting {retry_after}s")
            time.sleep(retry_after)
            continue

        if resp.status_code in (502, 503):
            wait = 2 ** attempt
            print(f"Service unavailable — retrying in {wait}s")
            time.sleep(wait)
            continue

        # Non-retryable error
        resp.raise_for_status()

    raise RuntimeError(f"Failed after {max_retries} attempts")
Node.js
async function inferWithRetry(deploymentId, inputs, maxRetries = 3) {
  const url = "https://inference.impulselabs.ai/infer";

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const resp = await fetch(url, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.IMPULSE_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ deployment_id: deploymentId, inputs }),
    });

    if (resp.ok) return resp.json();

    if (resp.status === 429) {
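      // Retry-After may be an HTTP-date, which parseInt turns into NaN; fall back to 5s.
      const parsed = parseInt(resp.headers.get("Retry-After") ?? "5", 10);
      const retryAfter = Number.isNaN(parsed) ? 5 : parsed;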
      await new Promise((r) => setTimeout(r, retryAfter * 1000));
      continue;
    }

    if (resp.status === 503 || resp.status === 502) {
      await new Promise((r) => setTimeout(r, 2 ** attempt * 1000));
      continue;
    }

    const body = await resp.json().catch(() => ({}));
    throw Object.assign(new Error(body.detail ?? body.error ?? "Request failed"), { status: resp.status });
  }

  throw new Error(`Failed after ${maxRetries} attempts`);
}
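
The retry loops above wait exactly 2 ** attempt seconds, which grows without bound and makes many clients retry at the same instant. One common variation, shown here as a sketch rather than a recommendation from this API, caps the delay and adds random jitter. The function name `backoff_delay` and the `base` and `cap` defaults are assumptions.

```python
import random
from typing import Callable


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Return a jittered delay in seconds for a zero-based attempt number.

    The delay is drawn uniformly from [0, min(cap, base * 2**attempt)).
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


# Show the upper bound of the delay for the first few attempts.
for attempt in range(6):
    print(attempt, min(30.0, 1.0 * 2 ** attempt))
```

In the Python example above, `wait = 2 ** attempt` could be swapped for `wait = backoff_delay(attempt)` to try this behavior.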