Wafer Pass Setup

Features, model availability, rate limits, and pricing may change as we iterate. Questions? Email emilio@wafer.ai.

Set up with: Claude Code · Conductor · Codex · OpenClaw · Hermes Agent · Cline · Roo Code · Kilo Code · OpenHands · LibreChat Wafer builds AI that optimizes AI. We take open models and make them dramatically faster. Wafer Pass gives you Qwen3.5-397B-A17B and GLM-5.1, served at multiples of the speed of generic inference providers. More models land on the same subscription — no price increase. Wafer Pass is built for Claude Code, Codex, Conductor, OpenClaw, Hermes Agent, Cline, Roo Code, Kilo Code, OpenHands, and other agent harnesses. Wafer exposes both an OpenAI-compatible endpoint and an Anthropic-compatible Messages endpoint, so tools like Claude Code work out of the box. Get a Wafer Pass for fast open-source models through a standard API endpoint. Plans start at $10/week.

Get your Wafer Pass: https://www.wafer.ai/pass

Connection Details

Use your Wafer Pass API key with these values:


OpenAI-compatible endpoint	`https://pass.wafer.ai/v1`
Anthropic-compatible endpoint	`https://pass.wafer.ai/v1/messages`
Send your API key as	`Authorization: Bearer <key>` (`wfr_…` Pass keys work here)
Request-scoped ZDR	Add `Wafer-ZDR: required` on direct API calls

See Models below for the model strings to pass on the OpenAI-compatible endpoint.

Claude Code uses the Anthropic Messages endpoint. Set ANTHROPIC_BASE_URL=https://pass.wafer.ai and ANTHROPIC_API_KEY to your Wafer key — Claude Code will hit /v1/messages automatically. To make sure Claude Code talks to a Wafer model, set ANTHROPIC_DEFAULT_OPUS_MODEL, ANTHROPIC_DEFAULT_SONNET_MODEL, ANTHROPIC_DEFAULT_HAIKU_MODEL, and CLAUDE_CODE_SUBAGENT_MODEL to one of the Wafer model IDs in Models (e.g. GLM-5.1). All other harnesses (OpenClaw, Cline, Roo Code, etc.) use the OpenAI-compatible endpoint at https://pass.wafer.ai/v1.

What’s Included

With an active Wafer Pass subscription you get:

Qwen3.5-397B-A17B and GLM-5.1 requests included in your plan’s 5-hour window limit at zero per-token cost; overage is billed only for models with enabled per-token rates
Access through a standard OpenAI-compatible API and an Anthropic-compatible Messages API using your Wafer API key
Works with Claude Code, Codex, Conductor, OpenClaw, Hermes Agent, Cline, Roo Code, Kilo Code, OpenHands, and other agent harnesses
New fast models as we release them — same subscription, no price increase

Models

`model` string	Family	Max context tokens	ZDR support	Notes
`Qwen3.5-397B-A17B`	Qwen3.5, 397B MoE	`262144`	Yes	Multiples faster than base SGLang on Wafer’s stack
`GLM-5.1`	Z.AI flagship	`202752`	Yes

The Max context tokens value is the hard cap enforced by the backend — requests where prompt tokens exceed this value return a 400. If your harness has a Context Window Size setting (Cline, Roo Code, Droid, etc.), set it to the exact integer above. Leave ~2–4k of headroom for the model’s response when filling context. Pass any model string above to any OpenAI-compatible harness configured against https://pass.wafer.ai/v1. Model names are case-insensitive — GLM-5.1 and glm-5.1 both work. If you pass a model name that doesn’t match any available model, the API returns a 404 with the list of available models. For Claude Code and other Anthropic-compatible harnesses, set ANTHROPIC_DEFAULT_*_MODEL to one of the model IDs above. See Set Up Claude Code.

API Capabilities

Wafer Pass supports account-aware privacy enforcement and advanced completion controls:

Pricing

Pay weekly, monthly, or save 20% off the weekly rate with yearly billing.

Weekly

Plan	For	Price	Requests / 5hr window	Includes
Starter	Solo devs, daily agents	$10/wk	1,000	Access to every model Wafer hosts
Privacy	Production agents, private workloads	$25/wk	2,000	Zero Data Retention

Monthly

Plan	For	Price	Requests / 5hr window	Includes
Starter	Solo devs, daily agents	$40/mo	1,000	Access to every model Wafer hosts
Privacy	Production agents, private workloads	$100/mo	2,000	Zero Data Retention

Yearly (20% off the weekly rate)

Plan	Price	Effective weekly	Requests / 5hr window
Starter	$416/yr	$8/wk	1,000
Privacy	$1040/yr	$20/wk	2,000

Overage Pricing

Requests beyond your plan’s included 5-hour window limit are billed at per-model API rates for models with overage enabled. All users pay the same overage rate regardless of plan tier.

Model	Input	Output	Cached Input
`Qwen3.5-397B-A17B`	$0.60/M tokens	$3.60/M tokens	$0.06/M tokens
`GLM-5.1`	$1.50/M tokens	$4.50/M tokens	$0.15/M tokens

Overage charges are calculated per 5-hour window and added to your next invoice.

Getting Started

Pick a plan

Go to wafer.ai/pass and choose your plan and billing interval. Checkout is self-serve and instant.

Get your API key

Your API key is shown on the success page right after checkout. We also email a backup copy to the address you used.

Start coding

Use the key in Claude Code, Codex, Conductor, OpenClaw, Cline, Roo Code, Kilo Code, Hermes Agent, OpenHands, or any other supported harness.

Set Up Claude Code

Wafer exposes an Anthropic-compatible Messages endpoint at https://pass.wafer.ai/v1/messages, so Claude Code can connect directly — no proxy needed. For Claude Code, set ANTHROPIC_BASE_URL to https://pass.wafer.ai, not https://pass.wafer.ai/v1.

Install Claude Code

npm install -g @anthropic-ai/claude-code

Configure Wafer as the endpoint

Set these environment variables in your shell profile (~/.zshrc, ~/.bashrc, etc.):

export ANTHROPIC_BASE_URL=https://pass.wafer.ai
export ANTHROPIC_API_KEY=YOUR_WAFER_API_KEY

Or add them to ~/.claude/settings.json for a persistent, per-user config:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://pass.wafer.ai",
    "ANTHROPIC_API_KEY": "YOUR_WAFER_API_KEY"
  }
}

Replace YOUR_WAFER_API_KEY with your Wafer Pass API key.

Do not share your API key or commit it to version control.

Pin a Wafer model

Claude Code sends Anthropic model strings (claude-opus-…, claude-sonnet-…) by default — those don’t match a Wafer model. Pin Claude Code to a Wafer model with these env vars:For Qwen3.5-397B-A17B:

export ANTHROPIC_DEFAULT_OPUS_MODEL="Qwen3.5-397B-A17B"
export ANTHROPIC_DEFAULT_SONNET_MODEL="Qwen3.5-397B-A17B"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="Qwen3.5-397B-A17B"
export CLAUDE_CODE_SUBAGENT_MODEL="Qwen3.5-397B-A17B"

For GLM-5.1:

export ANTHROPIC_DEFAULT_OPUS_MODEL="GLM-5.1"
export ANTHROPIC_DEFAULT_SONNET_MODEL="GLM-5.1"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="GLM-5.1"
export CLAUDE_CODE_SUBAGENT_MODEL="GLM-5.1"

Or in ~/.claude/settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://pass.wafer.ai",
    "ANTHROPIC_API_KEY": "YOUR_WAFER_API_KEY",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "GLM-5.1",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "GLM-5.1",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "GLM-5.1",
    "CLAUDE_CODE_SUBAGENT_MODEL": "GLM-5.1"
  }
}

These env vars follow the same pattern as OpenRouter’s Claude Code integration. They override the model Claude Code sends for opus, sonnet, haiku, and subagent calls.

Start Claude Code

claude

Claude Code now routes requests through the Wafer endpoint.

Set Up Conductor

Conductor runs a team of parallel Claude Code agents in isolated Git worktrees on macOS. Because Conductor uses Claude Code under the hood, it picks up Wafer the same way Claude Code does — pin the model to any Wafer Pass ID from Models.

Install Conductor

Download the macOS app from conductor.build and launch it.

Configure Wafer as the endpoint

In Conductor’s Settings → Environment (or your shell profile, e.g. ~/.zshrc), set:

export ANTHROPIC_BASE_URL=https://pass.wafer.ai
export ANTHROPIC_API_KEY=YOUR_WAFER_API_KEY

Replace YOUR_WAFER_API_KEY with your Wafer Pass API key.

Do not share your API key or commit it to version control.

Pin a Wafer model

Conductor spawns Claude Code, which sends Anthropic model strings by default — those don’t match a Wafer model. Pin Claude Code (and therefore Conductor) to a Wafer model with these env vars in the same Environment section:For Qwen3.5-397B-A17B:

export ANTHROPIC_DEFAULT_OPUS_MODEL="Qwen3.5-397B-A17B"
export ANTHROPIC_DEFAULT_SONNET_MODEL="Qwen3.5-397B-A17B"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="Qwen3.5-397B-A17B"
export CLAUDE_CODE_SUBAGENT_MODEL="Qwen3.5-397B-A17B"

For GLM-5.1:

export ANTHROPIC_DEFAULT_OPUS_MODEL="GLM-5.1"
export ANTHROPIC_DEFAULT_SONNET_MODEL="GLM-5.1"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="GLM-5.1"
export CLAUDE_CODE_SUBAGENT_MODEL="GLM-5.1"

This follows the pattern OpenRouter documents in its Claude Code integration guide. Because Conductor spawns Claude Code subprocesses, the same env vars override opus, sonnet, haiku, and subagent calls.

Start a run

Create a workspace in Conductor, pick a repo, and kick off an agent. Requests now route through Wafer.

Set Up Codex

Codex (the OpenAI Codex CLI) only speaks the OpenAI Responses API (/v1/responses). The legacy wire_api = "chat" setting was deprecated in December 2025 and removed in February 2026, so Codex can no longer talk to a Chat Completions endpoint directly. Wafer Pass is OpenAI Chat Completions–compatible (/v1/chat/completions), so to use it from Codex you run a tiny local LiteLLM proxy that translates Responses API requests into Chat Completions on the way to Wafer. This is the path OpenAI’s own Codex deprecation notice points users at.

Once the proxy is running, Codex sends /v1/responses requests to LiteLLM on localhost:4000, LiteLLM rewrites them as /v1/chat/completions against https://pass.wafer.ai/v1, and the response stream is translated back into Responses-API SSE events. You only set this up once.

Install Codex

npm install -g @openai/codex

Verify Codex picks up the Responses API (wire_api = "responses" is the only supported value as of 0.92.0+):

codex --version

Create a LiteLLM proxy config

Create litellm_config.yaml in a directory of your choice (e.g. ~/.codex/litellm_config.yaml):

model_list:
  - model_name: Qwen3.5-397B-A17B
    litellm_params:
      model: openai/Qwen3.5-397B-A17B
      api_base: https://pass.wafer.ai/v1
      api_key: os.environ/WAFER_API_KEY
  - model_name: GLM-5.1
    litellm_params:
      model: openai/GLM-5.1
      api_base: https://pass.wafer.ai/v1
      api_key: os.environ/WAFER_API_KEY
litellm_settings:
  drop_params: true

The openai/ prefix tells LiteLLM to call Wafer over the OpenAI Chat Completions wire format. drop_params: true lets LiteLLM silently drop Responses-only fields that don’t have a Chat Completions equivalent.

Do not share your API key or commit it to version control. The config above reads it from the WAFER_API_KEY env var.

Start the LiteLLM proxy

Docker (recommended)
pip

export WAFER_API_KEY=YOUR_WAFER_API_KEY

docker run -d --name litellm-wafer \
  -p 127.0.0.1:4000:4000 \
  -v "$HOME/.codex/litellm_config.yaml:/app/config.yaml" \
  -e WAFER_API_KEY \
  ghcr.io/berriai/litellm:main-stable \
  --config /app/config.yaml --host 0.0.0.0 --port 4000

The host-side 127.0.0.1: prefix on -p keeps the proxy reachable only from your machine. The container itself listens on 0.0.0.0 inside the container so Docker’s port forward works.

pip install 'litellm[proxy]'
export WAFER_API_KEY=YOUR_WAFER_API_KEY
litellm --config ~/.codex/litellm_config.yaml --host 127.0.0.1 --port 4000

The local proxy intentionally has no auth configured (it’s only meant to be reached by Codex on the same machine). Always bind it to 127.0.0.1 as shown above. If you need the proxy reachable over a network, set a general_settings.master_key in litellm_config.yaml and require it on every inbound request — see LiteLLM virtual keys.

Confirm it’s translating correctly:

curl -sS http://127.0.0.1:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_WAFER_API_KEY>" \
  -d '{
    "model": "GLM-5.1",
    "input": "Reply with the single word: ready."
  }'

A 200 response with an output array means the bridge is healthy.

Point Codex at the local proxy

Add a Wafer model provider to ~/.codex/config.toml:

model = "GLM-5.1"
model_provider = "wafer"

[model_providers.wafer]
name = "Wafer (via LiteLLM)"
base_url = "http://127.0.0.1:4000/v1"
env_key = "WAFER_API_KEY"
wire_api = "responses"

env_key tells Codex which env var to read for the bearer token it sends to the local proxy. LiteLLM accepts that token (no proxy-side auth is configured here) and uses its own upstream api_key: os.environ/WAFER_API_KEY from litellm_config.yaml to call Wafer. Both ultimately read the same WAFER_API_KEY env var, so the call to Wafer is authenticated with your Wafer Pass key.

Switch models by changing the top-level model = "..." to any model_name from your litellm_config.yaml (e.g. GLM-5.1 or Qwen3.5-397B-A17B).

Run Codex

export WAFER_API_KEY=YOUR_WAFER_API_KEY
codex

Codex now sends Responses-API traffic to LiteLLM, which forwards it to Wafer. The footer in the TUI should show the Wafer model id (e.g. GLM-5.1).

Why this isn’t just two env vars (yet): Codex’s wire_api = "chat" removal means every third-party OpenAI-compatible provider that doesn’t natively expose /v1/responses needs a translation layer right now. We track Wafer adding a native Responses endpoint on the Wafer Pass roadmap; when it ships, you’ll be able to drop the LiteLLM step and point Codex straight at https://pass.wafer.ai/v1.

Set Up OpenClaw

Model string: the examples below use GLM-5.1. Swap in any Models ID (Qwen3.5-397B-A17B or GLM-5.1). Applies to every OpenAI-compatible setup section that follows (OpenClaw, Hermes Agent, Cline, Roo Code, Kilo Code, OpenHands, and the generic section at the bottom).

Install OpenClaw

macOS / Linux
Windows (PowerShell)

curl -fsSL https://openclaw.ai/install.sh | bash

iwr -useb https://openclaw.ai/install.ps1 | iex

Run setup

openclaw setup

Add Wafer as a provider

Replace YOUR_WAFER_API_KEY with your Wafer Pass API key:

openclaw config set models.providers.wafer "$(cat <<'EOF'
{
  "baseUrl": "https://pass.wafer.ai/v1",
  "api": "openai-completions",
  "auth": "api-key",
  "apiKey": "YOUR_WAFER_API_KEY",
  "models": [
    { "id": "Qwen3.5-397B-A17B", "name": "Qwen 3.5 397B" },
    { "id": "GLM-5.1", "name": "GLM 5.1" }
  ]
}
EOF
)"
openclaw models set wafer/GLM-5.1

Do not share your API key or commit it to version control.

Test it

openclaw agent --local --session-id wafer-test --message "Hello"

Set Up Hermes Agent

Install Hermes Agent

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.bashrc   # or source ~/.zshrc

Point Hermes at Wafer

Replace YOUR_WAFER_API_KEY with your Wafer Pass API key:

hermes config set OPENAI_BASE_URL https://pass.wafer.ai/v1
hermes config set OPENAI_API_KEY YOUR_WAFER_API_KEY
hermes config set model Qwen3.5-397B-A17B

Start a session

hermes

Hermes now uses Qwen3.5-397B-A17B through the Wafer endpoint by default.

Set Up Cline

Install Cline

Install the Cline extension from the VS Code marketplace, or search “Cline” in VS Code Extensions.

Configure Wafer as a provider

Open VS Code and click the Cline icon in the sidebar
Click the settings gear icon in the Cline panel
In the API Provider dropdown, select OpenAI Compatible
Fill in these fields:

Base URL: https://pass.wafer.ai/v1
API Key: your Wafer API key
Model ID: Qwen3.5-397B-A17B

Do not include /chat/completions in the Base URL — Cline appends that automatically.

Set model info (recommended)

Expand Model Configuration and set:

Context Window Size: 262144
Max Output Tokens: 32768
Supports Images: unchecked

Verify the connection

Send a message in the Cline panel. If Cline responds, you’re connected.

Set Up Roo Code

Install Roo Code

Install the Roo Code extension from the VS Code marketplace, or search “Roo Code” in VS Code Extensions.

Configure Wafer as a provider

Open VS Code and click the Roo Code icon in the sidebar
Click the settings gear icon in the Roo Code panel
In the API Provider dropdown, select OpenAI Compatible
Fill in these fields:

Base URL: https://pass.wafer.ai/v1
API Key: your Wafer API key
Model ID: Qwen3.5-397B-A17B

Set model info (recommended)

Optionally configure:

Context Window Size: 262144
Max Output Tokens: 32768

Start coding

Send a message in the Roo Code panel to confirm the connection.

Set Up Kilo Code

Install Kilo Code

Install the Kilo Code extension from the VS Code marketplace, or search “Kilo Code” in VS Code Extensions.

Configure Wafer as a provider

Open Kilo Code and click the settings gear icon
Go to the Providers tab
Click Custom provider at the bottom
Fill in the dialog:

Provider ID: wafer
Display Name: Wafer
Base URL: https://pass.wafer.ai/v1
API Key: your Wafer API key
Model: Qwen3.5-397B-A17B

Click Save

If you’re on an older version of Kilo Code without the Providers tab, select OpenAI Compatible from the API Provider dropdown and enter the same Base URL, API key, and Model ID.

Start coding

Send a message in the Kilo Code panel to confirm the connection.

Set Up OpenHands

Install OpenHands

Follow the OpenHands installation guide. The quickest way:

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.44-nikolaik
docker run -it --rm \
  -p 3000:3000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  docker.all-hands.dev/all-hands-ai/openhands:0.44

Configure Wafer as the LLM (UI)

Open the OpenHands UI (usually at http://localhost:3000)
Click the settings gear icon
Click Advanced to expand advanced options
Set these fields:

Custom Model: openai/GLM-5.1
Base URL: https://pass.wafer.ai/v1
API Key: your Wafer API key

The openai/ prefix is required. OpenHands uses litellm under the hood, and this prefix tells it to use the OpenAI-compatible completion path.

Alternative: config.toml

If you prefer file-based config, create or edit config.toml in the project root:

[llm]
model = "openai/GLM-5.1"
api_key = "YOUR_WAFER_API_KEY"
base_url = "https://pass.wafer.ai/v1"

Start coding

Open a conversation in the OpenHands UI to confirm the connection.

Set Up LibreChat

Install LibreChat

Follow the LibreChat installation guide. The quickest way is Docker:

git clone https://github.com/danny-avila/LibreChat.git
cd LibreChat
cp .env.example .env
docker compose up -d

Add Wafer as a provider

Edit librechat.yaml in the project root and add a Wafer endpoint:

endpoints:
  custom:
    - name: "Wafer"
      baseURL: "https://pass.wafer.ai/v1"
      apiKey: "${WAFER_API_KEY}"
      iconURL: "https://avatars.githubusercontent.com/u/213847495?s=200&v=4"
      models:
        default:
          - "Qwen3.5-397B-A17B"
          - "GLM-5.1"
        fetch: false
      titleConvo: true
      modelDisplayLabel: "Wafer"

Set your API key in your .env file:

WAFER_API_KEY=YOUR_WAFER_API_KEY

Replace YOUR_WAFER_API_KEY with your Wafer Pass API key.

Do not share your API key or commit it to version control.

The default list above exposes all Wafer models in LibreChat’s model picker. Drop any IDs you don’t want to surface.

Restart and verify

docker compose restart

Open LibreChat in your browser, select Wafer from the endpoint dropdown, and send a message.

Use Wafer with Other Harnesses

Most agent harnesses only need these settings: OpenAI-compatible harnesses (Cline, Roo Code, Kilo Code, OpenClaw, OpenHands, etc.):

Base URL: https://pass.wafer.ai/v1
Model: any ID from Models, e.g. GLM-5.1 or Qwen3.5-397B-A17B
Authentication: your Wafer Pass key (same token in Authorization: Bearer … headers or in the client’s API-key field — keys look like wfr_…)
Compatibility mode: OpenAI-compatible / OpenAI API

{
  "baseUrl": "https://pass.wafer.ai/v1",
  "apiKey": "YOUR_WAFER_API_KEY",
  "model": "GLM-5.1"
}

Pick any model string from Models. Anthropic-compatible harnesses (Claude Code, Conductor, or any tool using the Anthropic Messages API):

Base URL: https://pass.wafer.ai (the tool appends /v1/messages automatically)
Authentication: your Wafer API key via ANTHROPIC_API_KEY
Model: pin a Wafer model via ANTHROPIC_DEFAULT_*_MODEL (the harness’s default Anthropic model strings don’t match a Wafer model)

export ANTHROPIC_BASE_URL=https://pass.wafer.ai
export ANTHROPIC_API_KEY=YOUR_WAFER_API_KEY
export ANTHROPIC_DEFAULT_OPUS_MODEL="GLM-5.1"
export ANTHROPIC_DEFAULT_SONNET_MODEL="GLM-5.1"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="GLM-5.1"
export CLAUDE_CODE_SUBAGENT_MODEL="GLM-5.1"

If the harness asks for a provider name, you can label it Wafer. Pass your Wafer Pass key wherever the harness expects an Anthropic/API key (ANTHROPIC_API_KEY).

Terms of Use

Wafer Pass is intended for personal agentic coding use only. By purchasing Wafer Pass you agree to the following:

Allowed: Personal development, experimentation, and coding with agentic harnesses (Claude Code, OpenCode, Cline, Kilo Code, OpenClaw, LangChain Deep Agents, and similar tools).
Prohibited: Production workloads, team or shared usage (one account per person, max 3 concurrent in-flight requests), reselling or pooling access, and any use that violates the Wafer Terms of Service.

Violations of these terms may result in pass revocation without refund. See the full Wafer Terms of Service for details.

FAQ

What models do I get?

Wafer Pass includes GLM-5.1 and Qwen3.5-397B-A17B. More models ship on the same subscription as we add them.

Can I use Wafer Pass with any model?

Use the Models IDs with https://pass.wafer.ai/v1.

Can I share my subscription?

How do I get access?

Sign up at wafer.ai/pass. Checkout is self-serve and your API key is shown immediately on the success page (and emailed as backup).

Do I need a special model ID?

For OpenAI-compatible harnesses, use a model string from Models with https://pass.wafer.ai/v1. Model names are case-insensitive for the IDs above. For Claude Code (Anthropic-compatible), pin a Wafer model via ANTHROPIC_DEFAULT_OPUS_MODEL / ANTHROPIC_DEFAULT_SONNET_MODEL / ANTHROPIC_DEFAULT_HAIKU_MODEL / CLAUDE_CODE_SUBAGENT_MODEL.

Why does Codex need a LiteLLM proxy?

Codex CLI removed support for the OpenAI Chat Completions wire format in February 2026 (wire_api = "chat" is no longer accepted) and now requires the OpenAI Responses API (/v1/responses). Wafer Pass exposes Chat Completions (/v1/chat/completions), so the local LiteLLM proxy translates between the two. See Set Up Codex for the config. The proxy is only needed for Codex — every other harness on this page (Claude Code, Conductor, OpenClaw, Cline, Roo Code, Kilo Code, Hermes Agent, OpenHands, LibreChat) talks to Wafer directly.

Will more models be added?

Yes. We’re optimizing the best coding models and adding them to the plan. Price stays the same.

Wafer Pass

Serverless

Dedicated Endpoints

Wafer Pass Setup

Connection Details

What’s Included

Models

API Capabilities

Pricing

Weekly

Monthly

Yearly (20% off the weekly rate)

Overage Pricing

Getting Started

Set Up Claude Code

Set Up Conductor

Set Up Codex

Set Up OpenClaw

Set Up Hermes Agent

Set Up Cline

Set Up Roo Code

Set Up Kilo Code

Set Up OpenHands

Set Up LibreChat

Use Wafer with Other Harnesses

Terms of Use

FAQ

Wafer Pass

Serverless

Dedicated Endpoints

Documentation Index

​Connection Details

​What’s Included

​Models

​API Capabilities

​Pricing

​Weekly

​Monthly

​Yearly (20% off the weekly rate)

​Overage Pricing

​Getting Started

​Set Up Claude Code

​Set Up Conductor

​Set Up Codex

​Set Up OpenClaw

​Set Up Hermes Agent

​Set Up Cline

​Set Up Roo Code

​Set Up Kilo Code

​Set Up OpenHands

​Set Up LibreChat

​Use Wafer with Other Harnesses

​Terms of Use

​FAQ

Connection Details

What’s Included

Models

API Capabilities

Pricing

Weekly

Monthly

Yearly (20% off the weekly rate)

Overage Pricing

Getting Started

Set Up Claude Code

Set Up Conductor

Set Up Codex

Set Up OpenClaw

Set Up Hermes Agent

Set Up Cline

Set Up Roo Code

Set Up Kilo Code

Set Up OpenHands

Set Up LibreChat

Use Wafer with Other Harnesses

Terms of Use

FAQ