Files API - Wafer

The Files API lets you upload large media to Wafer once and reference it by file_id in subsequent inference requests. This is the recommended path for any prompt that includes an image, document, or video — uploading once and referencing many times avoids hitting the 50 MB per-request body cap and lets you re-use the same asset across turns without re-encoding it. Files are scoped to the Wafer user who owns the key — uploads made with one of your keys are usable by every other key on the same user account, matching the OpenAI and Anthropic Files API contract.

Endpoints

Method	Path	Host	Auth
`POST`	`/v1/files`	`pass.wafer.ai`	`Authorization: Bearer <wfr_…>` Serverless key
`GET`	`/v1/files`	`api.wafer.ai`	Dashboard session (signed in at app.wafer.ai)
`GET`	`/v1/files/{file_id}`	`api.wafer.ai`	Dashboard session
`DELETE`	`/v1/files/{file_id}`	`api.wafer.ai`	Dashboard session

Upload runs on the inference edge, so a wfr_… key is enough. Listing, inspecting, and deleting files are management actions; their endpoints currently live on api.wafer.ai behind your dashboard login session (signed in at app.wafer.ai). For most workflows you upload from code and manage in the UI.

Upload

The upload request body is the raw file bytes — not multipart. File metadata travels in headers:

curl -sS "https://pass.wafer.ai/v1/files" \
  -X POST \
  -H "Authorization: Bearer <YOUR_WAFER_API_KEY>" \
  -H "Content-Type: image/png" \
  -H "X-Wafer-Purpose: vision" \
  -H "X-Wafer-Filename: screenshot.png" \
  --data-binary @./screenshot.png
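
For callers who are not shelling out to curl, the same raw-bytes request can be sketched in Python with only the standard library. The helper name `build_upload_request` is hypothetical, and the sketch only builds the request; it does not assume anything about the response body.

```python
import urllib.request

UPLOAD_URL = "https://pass.wafer.ai/v1/files"


def build_upload_request(api_key, data, mime_type, purpose, filename=None):
    """Build (but do not send) a raw-bytes upload request.

    `build_upload_request` is an illustrative helper, not part of any Wafer SDK.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": mime_type,
        "X-Wafer-Purpose": purpose,
    }
    if filename is not None:
        headers["X-Wafer-Filename"] = filename
    # The body is the file bytes themselves, not a multipart form.
    return urllib.request.Request(
        UPLOAD_URL, data=data, headers=headers, method="POST"
    )


# Sending is a separate step, for example:
# with open("screenshot.png", "rb") as f:
#     req = build_upload_request(
#         "<YOUR_WAFER_API_KEY>", f.read(), "image/png", "vision", "screenshot.png"
#     )
# with urllib.request.urlopen(req) as resp:
#     body = resp.read()
```

Reading the whole file into memory keeps the sketch short; for uploads near the size limit a streaming HTTP client is likely the better choice.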

Required headers:

Authorization: Bearer <key> — your Serverless API key.
Content-Type: <mime> — the file’s MIME type. Must be one of the allowed types for the declared purpose (see below).
X-Wafer-Purpose: <purpose> — one of vision, document, or video.

Optional headers:

X-Wafer-Filename: <label> — a human-readable label. Shown in the dashboard’s file list.
X-Wafer-Extraction-Fps: <fps> — video only (rejected on other purposes). Must be strictly positive. Defaults to 1.0 frame per second. There’s no upper cap, but the worker hard-stops at 3600 extracted frames, so a high fps against a long clip simply truncates the tail.
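
To see how the 3600-frame hard stop interacts with the extraction rate, the arithmetic can be sketched as follows. This assumes frames are sampled evenly from the start of the clip and ignores boundary frames; both function names are hypothetical.

```python
MAX_EXTRACTED_FRAMES = 3600  # hard stop described above


def covered_seconds(extraction_fps):
    """Approximate seconds of video covered before extraction stops."""
    if extraction_fps <= 0:
        raise ValueError("extraction fps must be strictly positive")
    return MAX_EXTRACTED_FRAMES / extraction_fps


def max_fps_without_truncation(duration_sec):
    """Highest extraction rate that still reaches the end of the clip."""
    if duration_sec <= 0:
        raise ValueError("duration must be positive")
    return MAX_EXTRACTED_FRAMES / duration_sec
```

Under these assumptions the default 1.0 fps covers roughly one hour of video, 10 fps covers about six minutes, and a two-hour clip needs 0.5 fps or lower to avoid losing its tail.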

Purposes and Allowed MIME Types

Purpose	Allowed MIME types
`vision`	`image/jpeg`, `image/png`, `image/gif`, `image/webp`
`document`	`application/pdf`, `text/plain`
`video`	`video/mp4`, `video/quicktime`, `video/webm`, `video/x-matroska`

Max file size is 512 MB regardless of purpose. Uploads larger than that return 413 file_too_large (see Errors (Upload)).
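
A client can catch most of these rejections before sending any bytes. The check below is a minimal sketch of such a pre-flight; it assumes "512 MB" means 512 × 2^20 bytes (the page does not say), and `preflight` and the label `mime_not_allowed_for_purpose` are local, hypothetical names rather than server codes.

```python
ALLOWED_MIME_TYPES = {
    "vision": {"image/jpeg", "image/png", "image/gif", "image/webp"},
    "document": {"application/pdf", "text/plain"},
    "video": {"video/mp4", "video/quicktime", "video/webm", "video/x-matroska"},
}

# Assumption: MB is interpreted as 2**20 bytes.
MAX_FILE_BYTES = 512 * 1024 * 1024


def preflight(purpose, mime_type, size_bytes):
    """Return a list of problems found locally; an empty list means OK."""
    problems = []
    allowed = ALLOWED_MIME_TYPES.get(purpose)
    if allowed is None:
        problems.append("invalid_purpose")
    elif not mime_type:
        problems.append("missing_content_type")
    elif mime_type not in allowed:
        # Local label only; the server reports this case via its own codes.
        problems.append("mime_not_allowed_for_purpose")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file_too_large")
    return problems
```

The server remains the source of truth; a local check like this only saves a round trip for obvious mistakes.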

File Object

{
  "id": "file_8a3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e",
  "purpose": "vision",
  "filename": "screenshot.png",
  "mime_type": "image/png",
  "bytes": 184302,
  "status": "ready",
  "created_at": "2026-06-04T15:55:19+00:00",
  "expires_at": "2026-07-04T15:55:19+00:00"
}

id always starts with file_ followed by 32 hex characters (a UUID with the hyphens stripped). Use this string verbatim when referencing the file in inference requests.
status is ready for images and documents as soon as the upload completes. Video uploads return pending / processing while server-side frame extraction runs. Only ready files are usable in inference.
expires_at is set on every file. Re-upload before that timestamp if you need to keep the file around longer.
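
The three fields above are enough to decide locally whether a file can be referenced. A minimal sketch, assuming the hex characters in id are lowercase (as in the example object) and that expires_at is always an ISO 8601 timestamp with an explicit offset; `is_valid_file_id` and `is_usable` are hypothetical helper names.

```python
import re
from datetime import datetime, timezone

# "file_" followed by 32 hex characters; lowercase is an assumption
# based on the example object.
FILE_ID_RE = re.compile(r"file_[0-9a-f]{32}")


def is_valid_file_id(file_id):
    return FILE_ID_RE.fullmatch(file_id) is not None


def is_usable(file_obj, now=None):
    """True when the file is ready and has not yet expired."""
    if now is None:
        now = datetime.now(timezone.utc)
    expires_at = datetime.fromisoformat(file_obj["expires_at"])
    return file_obj["status"] == "ready" and now < expires_at
```

Passing `now` explicitly keeps the check deterministic in tests; in normal use it defaults to the current UTC time.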

Reference a File in a Chat Completion

Once a file is ready, pass its file_id in a content block on POST /v1/chat/completions (or on POST /v1/messages for the Anthropic Messages shape). Wafer resolves the file_id server-side and substitutes the file contents into the request before forwarding it to the backend.

Vision (image)

OpenAI-compatible shape on /v1/chat/completions:

curl -sS "https://pass.wafer.ai/v1/chat/completions" \
  -H "Authorization: Bearer <YOUR_WAFER_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.5-397B-A17B",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this screenshot?"},
        {"type": "image_url", "image_url": {"file_id": "file_8a3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e"}}
      ]
    }],
    "max_tokens": 256
  }'

Anthropic Messages shape on /v1/messages:

{
  "type": "image",
  "source": {"type": "file", "file_id": "file_8a3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e"}
}
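
The two image shapes above differ only in their wrapper, so a client can generate either from the same file_id. The sketch below builds the blocks and the chat completion payload as plain dictionaries; the function names are hypothetical and nothing is sent over the network.

```python
def openai_image_block(file_id):
    """Content block for the OpenAI-compatible /v1/chat/completions shape."""
    return {"type": "image_url", "image_url": {"file_id": file_id}}


def anthropic_image_block(file_id):
    """Content block for the Anthropic Messages /v1/messages shape."""
    return {"type": "image", "source": {"type": "file", "file_id": file_id}}


def chat_completion_payload(model, prompt, file_id, max_tokens=256):
    """JSON-serializable body mirroring the curl example above."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                openai_image_block(file_id),
            ],
        }],
        "max_tokens": max_tokens,
    }
```

Serializing the result with `json.dumps` yields a body equivalent to the one passed to curl with `-d`.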

Video

Video files are expanded server-side into a sequence of image frames before the inference call. Each frame is annotated with frame_index and timestamp_sec so the model can reason about temporal order.

{
  "type": "video",
  "source": {
    "type": "file",
    "file_id": "file_…",
    "fps": 1.0,
    "max_frames": 60
  }
}

fps (optional): subsample the pre-extracted frames at this rate. Must be > 0. Defaults to the extraction rate the file was uploaded with.
max_frames (optional): cap the number of frames forwarded to the model. Use this to keep long videos under the model’s context window.

The full expanded frame set counts against the model’s context window — pick fps and max_frames accordingly. If you exceed the context window, the API returns context_length_exceeded with the limit and suggested_models.
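
When choosing fps and max_frames it can help to build the block programmatically and estimate the frame count first. The sketch below is a rough illustration: `video_block` and `estimated_frames` are hypothetical helpers, the `max_frames >= 1` check is an assumption, and the estimate assumes evenly spaced frames, so the server's actual count may differ.

```python
def video_block(file_id, fps=None, max_frames=None):
    """Build a video content block, omitting options left unset."""
    source = {"type": "file", "file_id": file_id}
    if fps is not None:
        if fps <= 0:
            raise ValueError("fps must be > 0")
        source["fps"] = fps
    if max_frames is not None:
        if max_frames < 1:  # assumption: a cap below 1 is never useful
            raise ValueError("max_frames must be at least 1")
        source["max_frames"] = max_frames
    return {"type": "video", "source": source}


def estimated_frames(duration_sec, fps, max_frames=None):
    """Rough count of frames forwarded to the model (an estimate only)."""
    frames = int(duration_sec * fps)
    if max_frames is not None:
        frames = min(frames, max_frames)
    return frames
```

For a ten-minute clip at 1.0 fps with max_frames set to 60, this estimate gives 60 frames rather than 600, which is the kind of reduction that keeps long videos inside a context window.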

List, Inspect, Delete (dashboard)

File management lives in the dashboard at app.wafer.ai. Use the UI to see your uploaded files, inspect their status, and delete files you no longer need. The underlying endpoints (GET /v1/files, GET /v1/files/{file_id}, DELETE /v1/files/{file_id}) live on api.wafer.ai and are guarded by your dashboard login session, so they are not intended to be called directly from CI or production code paths. There is currently no key-authenticated way to delete a file; if you need to drop one, delete it from the dashboard. Upload is the only operation exposed with a single API key because it has to happen from the same place inference happens; management is rarer and stays in the UI.

Errors (Upload)

The upload endpoint uses its own focused error envelope. Codes that customers actually encounter:

400 invalid_purpose — X-Wafer-Purpose was missing or not one of vision, document, video.
400 missing_content_type — Content-Type header was empty.
400 invalid_content_length — Content-Length was missing, non-numeric, or negative.
400 invalid_extraction_fps — X-Wafer-Extraction-Fps was sent on a non-video purpose, was non-numeric, or was ≤ 0.
400 init_rejected — wafer-api’s pre-flight rejected the upload (e.g. MIME doesn’t match the declared purpose). The message field carries the upstream reason.
403 files_require_user_key — the inference key isn’t user-scoped. Older non-user keys can serve inference but cannot own files; mint a new key from the dashboard.
413 file_too_large — upload exceeded the 512 MB per-file limit (either via Content-Length or counted at stream time).
502 init_unavailable / storage_unavailable / storage_upload_rejected / complete_unavailable — transient failures in the upload pipeline. Retry.
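
Since only the 502 codes are described as transient, a client can retry those and surface everything else immediately. The sketch below assumes the caller has already extracted the HTTP status and error code from the response (the envelope's exact shape is not shown here); the exponential backoff, the `send` callable, and both function names are illustrative choices, not documented behavior.

```python
import time

TRANSIENT_UPLOAD_CODES = {
    "init_unavailable",
    "storage_unavailable",
    "storage_upload_rejected",
    "complete_unavailable",
}


def is_retryable(status, code):
    """Only the 502 pipeline failures listed above are treated as transient."""
    return status == 502 and code in TRANSIENT_UPLOAD_CODES


def upload_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or attempts run out.

    send() is a caller-supplied function returning (status, code, body),
    where code is None on success.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, code, body = send()
        last_attempt = attempt == max_attempts - 1
        if not is_retryable(status, code) or last_attempt:
            return status, code, body
        # Exponential backoff: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** attempt)
```

Injecting `sleep` makes the backoff testable without real delays; 4xx codes such as invalid_purpose or file_too_large are returned on the first attempt because retrying the same request would not change the outcome.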

Inference-time file_id resolution surfaces its own codes (invalid_file_id, file_not_ready, file_expired, file_unavailable) on the chat completion response when the referenced file isn’t usable.

Notes

File contents are inaccessible to other users — Wafer scopes every read to the owning user.
Files are scoped to the region and hostname where they are uploaded. Wafer’s public Serverless docs currently describe the default pass.wafer.ai endpoint only.
Inline image_url.url data URIs and remote HTTPS URLs are still accepted on chat completions for backwards compatibility, but file_id is the recommended path for anything over a few hundred KB.