When a request to Wafer fails, the response body carries a structured error envelope you can branch on.
The same envelope is returned from pass.wafer.ai (OpenAI- and Anthropic-compatible
inference endpoints) and from api.wafer.ai (account, billing, and key
management). Anthropic-compatible responses wrap the same fields under
{"type":"error","error":{...}}.
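Both shapes can be unwrapped the same way. The following is a minimal sketch, assuming the OpenAI-compatible body also nests the fields under a top-level "error" key and that the fields are named type, code, message, param, and request_id as this page refers to them. WaferError and parse_error are hypothetical names, not part of any Wafer SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class WaferError:
    """Hypothetical container for the envelope fields this page refers to."""
    type: Optional[str]
    code: Optional[str]
    message: Optional[str]
    param: Optional[str]
    request_id: Optional[str]


def parse_error(body: str) -> WaferError:
    """Unwrap an error envelope from a raw JSON response body.

    Assumes the fields live under a top-level "error" object, which covers
    the Anthropic-compatible {"type": "error", "error": {...}} wrapper and
    (by assumption) the OpenAI-compatible shape as well.
    """
    payload = json.loads(body)
    err = payload.get("error", payload)  # fall back to a flat body
    return WaferError(
        type=err.get("type"),
        code=err.get("code"),
        message=err.get("message"),
        param=err.get("param"),
        request_id=err.get("request_id"),
    )
```

Fields that are absent from the body come back as None, so callers can test for them without catching KeyError.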
Every response — success or failure — also carries the x-request-id
header. Include that ID when contacting support.
When you include an
x-request-id header on your request, Wafer echoes
it back instead of generating one, so you can correlate with your
application logs. To keep server- and client-supplied IDs separate in
our logs, Wafer prefixes client values with client_. Server-generated
IDs match req_<12 hex> exactly; anything else you send is prefixed.
How to use this page
type is the coarse bucket; your SDK probably already maps it to a
typed exception class (RateLimitError, BadRequestError, etc.).
code is the specific reason within that bucket and the right thing
to branch on programmatically. Each anchor below documents one code.
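One way to act on this split is to look up a handler by code first and fall back to the coarser type. This is a sketch only: dispatch and the handler tables are hypothetical names, and the error argument is assumed to be the already-unwrapped error object as a dict.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


def dispatch(
    error: dict,
    by_code: Dict[str, Handler],
    by_type: Dict[str, Handler],
    default: Handler,
) -> str:
    """Prefer a handler registered for the specific code, then one for the
    coarse type, then the default."""
    handler = (
        by_code.get(error.get("code"))
        or by_type.get(error.get("type"))
        or default
    )
    return handler(error)
```

Registering only the codes you care about keeps the table small, while the type-level fallback still catches codes you have not listed.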
Bucket: invalid_request_error (400 / 422)
schema_validation_failed
Status: 422
The request body didn’t match the endpoint’s schema. error.param points at
the offending field path (e.g. messages[0].content, amount_cents).
What to do: read error.param and error.message, fix the field, and
retry.
tool_schema_invalid
Status: 400
A function tool you passed has an invalid parameters block. We pre-validate
that parameters is a JSON Schema object (type: "object" with a
properties dict) before forwarding, because the underlying backends would
otherwise return a 400 with unhelpful Pydantic errors.
What to do: confirm tools[i].function.parameters matches the OpenAI
function-calling spec.
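The same check can be run client-side before sending. This sketch mirrors the rule as described above (type: "object" plus a properties dict). invalid_tool_parameters is a hypothetical helper, and treating a tool with no parameters key as acceptable is an assumption about server behavior.

```python
def invalid_tool_parameters(tools: list) -> list:
    """Return the indexes of function tools whose parameters block is not a
    JSON Schema object with a properties dict."""
    bad = []
    for i, tool in enumerate(tools):
        if tool.get("type") != "function":
            continue  # other tool types are outside this check
        function = tool.get("function", {})
        if "parameters" not in function:
            continue  # assumption: omitting parameters is accepted
        params = function["parameters"]
        ok = (
            isinstance(params, dict)
            and params.get("type") == "object"
            and isinstance(params.get("properties"), dict)
        )
        if not ok:
            bad.append(i)
    return bad
```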
tool_choice_unknown_tool
Status: 400
tool_choice.function.name doesn’t match any tool you declared in tools.
What to do: ensure the tool name in tool_choice is one of the
tools[].function.name values in the same request.
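A small pre-flight check for this condition might look like the sketch below. tool_choice_is_declared is a hypothetical helper; it assumes the OpenAI-style request shape, where tool_choice is either a string such as "auto" or an object naming a function.

```python
def tool_choice_is_declared(tools: list, tool_choice) -> bool:
    """Return True when tool_choice names a tool declared in tools, or when
    it does not name a tool at all."""
    # String values such as "auto" or "none" do not name a tool.
    if not isinstance(tool_choice, dict):
        return True
    wanted = tool_choice.get("function", {}).get("name")
    declared = {
        t.get("function", {}).get("name")
        for t in tools
        if t.get("function", {}).get("name") is not None
    }
    return wanted in declared
```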
orphan_tool_message
Status: 400
A role=tool message references a tool_call_id that no preceding assistant
message issued. Every tool message must follow an assistant message whose
tool_calls[] contains a matching id.
What to do: check that your conversation history is intact — every tool
result must be preceded by the assistant turn that requested it.
missing_tool_call_id
Status: 400
A role=tool message is missing its tool_call_id field. The field is
required so we can pair the result back to the assistant’s request.
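Both tool-message rules (orphan_tool_message and missing_tool_call_id) can be checked against a conversation history before it is sent. This is a sketch, assuming OpenAI-style message dicts. check_tool_messages is a hypothetical helper that reports the code each offending message would likely trigger.

```python
def check_tool_messages(messages: list) -> list:
    """Return (index, code) pairs for tool messages that would be rejected."""
    issued = set()  # tool call ids seen in earlier assistant messages
    problems = []
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role == "assistant":
            for call in msg.get("tool_calls") or []:
                issued.add(call.get("id"))
        elif role == "tool":
            call_id = msg.get("tool_call_id")
            if not call_id:
                problems.append((i, "missing_tool_call_id"))
            elif call_id not in issued:
                problems.append((i, "orphan_tool_message"))
    return problems
```

Running this after any history trimming is useful, since dropping an old assistant turn while keeping its tool result is a common way to produce an orphan.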
unsupported_parameter
Status: 400
A request parameter is not supported on the endpoint or for the selected
model (e.g. previous_response_id, logit_bias on some backends).
What to do: remove the parameter, or check the Models
page for per-model support.
unsupported_tool_type
Status: 400
Only tools[i].type == "function" is supported on the OpenAI-compatible
endpoint. (Anthropic-side tool types are translated upstream.)
duplicate_tool_name
Status: 400
Two entries in tools declare the same function.name. Tool names must be
unique within a request.
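Duplicates are easy to catch before sending. A minimal sketch, assuming OpenAI-style tool dicts; duplicate_tool_names is a hypothetical helper.

```python
from collections import Counter


def duplicate_tool_names(tools: list) -> list:
    """Return the function names that appear more than once, sorted."""
    names = [t.get("function", {}).get("name") for t in tools]
    counts = Counter(name for name in names if name is not None)
    return sorted(name for name, count in counts.items() if count > 1)
```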
context_length_exceeded
Status: 400
The request would exceed the selected model’s context window. The error body
includes structured fields to help you switch models programmatically.
What to do: lower max_tokens, trim the prompt, or retry against one of
suggested_models.
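Retrying against a suggested model could be sketched as follows. The assumptions are that suggested_models is a list of model ID strings on the error object and that the first entry is a reasonable choice. retry_with_suggested_model is a hypothetical helper.

```python
from typing import Optional


def retry_with_suggested_model(request: dict, error: dict) -> Optional[dict]:
    """Return a copy of request that targets the first suggested model, or
    None when the error offers no alternative."""
    if error.get("code") != "context_length_exceeded":
        return None
    suggestions = error.get("suggested_models") or []
    if not suggestions:
        return None
    # Copy rather than mutate so the caller keeps the original request.
    return {**request, "model": suggestions[0]}
```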
model_not_found
Status: 404
The model value in your request doesn’t match any model your key can
access. The error message includes the list of available models for your
key tier.
Bucket: authentication_error (401)
http_401
Status: 401
Missing or invalid API key. Confirm the Authorization: Bearer <key> header
is present and the key hasn’t been rotated. Pass keys and serverless keys
both start with wfr_.
Bucket: permission_error (403)
http_403
Status: 403
The key is valid but doesn’t have access to this endpoint or resource
(e.g. a Pass key trying to hit a dedicated endpoint, or a key from a
different account).
Bucket: rate_limit_error (429)
429 responses include four standard headers that SDKs read for backoff:
Retry-After - seconds until the next retry should be attempted
RateLimit-Limit - the cap for this window
RateLimit-Remaining - 0 when at cap
RateLimit-Reset - seconds until the limit resets
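If your HTTP client exposes response headers as a mapping, these values can be collected in one place. A sketch, assuming every value is a plain number as described above. RateLimitInfo and read_rate_limit_headers are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass
class RateLimitInfo:
    retry_after: Optional[float]  # Retry-After, seconds
    limit: Optional[int]          # RateLimit-Limit
    remaining: Optional[int]      # RateLimit-Remaining
    reset: Optional[float]        # RateLimit-Reset, seconds


def read_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def parse(name: str, cast: Callable):
        raw = lowered.get(name)
        if raw is None:
            return None
        try:
            return cast(raw)
        except ValueError:
            return None  # assumption: treat unparseable values as absent

    return RateLimitInfo(
        retry_after=parse("retry-after", float),
        limit=parse("ratelimit-limit", int),
        remaining=parse("ratelimit-remaining", int),
        reset=parse("ratelimit-reset", float),
    )
```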
concurrency_limit_exceeded
Status: 429
Too many in-flight requests on the account. Wait briefly (1–2s) and retry
with exponential backoff. Retry-After: 1 is set as a hint.
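A backoff delay that respects the Retry-After hint might be computed like this. The 1-second base matches the guidance above; the 30-second cap and the jitter range are assumptions, and backoff_delay is a hypothetical helper.

```python
import random
from typing import Callable, Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 1.0,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retry number attempt (0-based).

    Doubles the delay on each attempt up to cap, applies jitter so the
    result falls between half and all of that value, and never returns
    less than the server's Retry-After hint.
    """
    exponential = min(cap, base * (2 ** attempt))
    jittered = exponential * (0.5 + 0.5 * rng())
    if retry_after is not None:
        return max(retry_after, jittered)
    return jittered
```

Jitter spreads out retries from clients that were throttled at the same moment, so they do not all return together.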
request_quota_exceeded
Status: 429
The account hit its included-request limit for the current window. The body
includes request_limit, window_end, and (when known) plan_tier.
plan_tier is omitted from the body when the account’s plan tier isn’t
known at the edge, so branch on key presence, not on null.
What to do: wait for window_end, upgrade your plan, or enable overage
in your dashboard.
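Handling the optional plan_tier by key presence could be sketched as follows. The example assumes window_end is an ISO 8601 timestamp, which this page does not specify. describe_quota_error is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Optional


def describe_quota_error(error: dict, now: Optional[datetime] = None) -> str:
    """Summarize a request_quota_exceeded error body for logging."""
    now = now or datetime.now(timezone.utc)
    # Assumption: window_end is ISO 8601; map a trailing "Z" to an offset
    # so older Python versions can parse it.
    window_end = datetime.fromisoformat(error["window_end"].replace("Z", "+00:00"))
    if window_end.tzinfo is None:
        window_end = window_end.replace(tzinfo=timezone.utc)  # assume UTC
    wait = max(0.0, (window_end - now).total_seconds())
    message = f"limit {error['request_limit']} reached; window resets in {wait:.0f}s"
    # Test for the key itself: plan_tier is omitted, not null, when unknown.
    if "plan_tier" in error:
        message += f" (plan: {error['plan_tier']})"
    return message
```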
rate_limit_exceeded
Status: 429
A general rate-limit hit (e.g. on a control-plane endpoint). Check
Retry-After and back off.
Bucket: insufficient_credits (402)
insufficient_credits
Status: 402
Your Wafer Serverless prepaid balance is insufficient for the request. The
body includes an estimate of available vs. required credits.
What to do: add credits at topup_url, or enable auto-top-off in your
dashboard.
Bucket: routing_error / server_error (502 / 503 / 504)
no_healthy_backends
Status: 503
All backends for the requested model are currently unavailable. Oncall is
notified automatically when this fires. The error body includes the affected
model.
What to do: retry with exponential backoff. If the issue persists for
more than a minute, check status.wafer.ai or
contact support with your request_id.
backend_timeout
Status: 504
The selected backend accepted the request but didn’t produce a response in
time. The error body includes the model.
What to do: retry. Backends auto-recover; a single 504 is usually
transient.
backend_connect_error
Status: 502
We couldn’t open a connection to the selected backend.
What to do: retry — our router will pick a different backend on the next
attempt.
backend_http_error
Status: 502
The backend returned an HTTP error we couldn’t interpret. The error body includes the model.
What to do: retry. If it persists, send us your request_id.
Bucket: internal_error (500)
internal_error
Status: 500
Something on our side went wrong and the cause didn’t fit any specific
bucket. The request_id is the load-bearing piece — send it with your bug
report and we can find the exact failure in our logs.
What to do: retry once. If it reproduces, file a bug with the
request_id and (if relevant) the request body.
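The retry guidance for the backend and internal error codes can be combined into one loop. This is a sketch under stated assumptions: the attempt limit of 4 and the doubling delays are illustrative, send is a hypothetical callable returning (status, error) with error set to None on success, and call_with_retries is not part of any SDK.

```python
import time
from typing import Callable, Optional, Tuple

# Codes this page says to retry, and the one it says to retry only once.
TRANSIENT = {
    "no_healthy_backends",
    "backend_timeout",
    "backend_connect_error",
    "backend_http_error",
}
RETRY_ONCE = {"internal_error"}


def call_with_retries(
    send: Callable[[], Tuple[int, Optional[dict]]],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Optional[dict]]:
    """Call send until it succeeds or the code's attempt limit is reached."""
    attempt = 0
    while True:
        status, error = send()
        attempt += 1
        if error is None:
            return status, None
        code = error.get("code")
        if code in RETRY_ONCE:
            limit = 2  # the first call plus a single retry
        elif code in TRANSIENT:
            limit = max_attempts
        else:
            limit = 1  # not described as retryable here
        if attempt >= limit:
            return status, error
        sleep(min(30.0, 2.0 ** (attempt - 1)))  # 1s, 2s, 4s, ...
```

When the loop gives up, the returned error still carries the request_id from the last attempt, which is the one to include in a bug report.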
SDK quickstart
Python (openai)
Python (anthropic)
When you contact support, include the request_id (from the
x-request-id header or the error.request_id field). With that, we can
look up the exact request in our logs.