Welcome to Wafer

Wafer is a complete toolkit for GPU kernel development: a VS Code/Cursor extension, a CLI, and an AI assistant that understands GPU programming. Profile kernels, analyze traces, evaluate correctness, and optimize performance without leaving your workflow.

Why Wafer?

GPU performance workflows are fragmented:
  • You profile in one tool and get counters you’re not sure how to prioritize
  • You inspect PTX/SASS somewhere else, with little context on what matters
  • You bounce between docs, blog posts, and guesses
  • Testing kernel correctness requires manual setup
  • If you’re developing remotely, you waste time (and money) keeping a GPU attached while you’re just editing code
Wafer pulls the loop into your editor and CLI, makes it repeatable, and adds AI assistance to help you understand what to optimize.

What You Get

AI-Powered Assistance

Ask questions about GPU programming, analyze traces, and get optimization suggestions:
wafer agent "How do I reduce bank conflicts in shared memory?"
wafer agent -t trace-analyze --args trace=./profile.ncu-rep "What's the bottleneck?"
wafer agent -t optimize-kernel --args kernel=./matmul.cu "Optimize for H100"

Kernel Evaluation

Test your kernels for correctness and benchmark performance:
wafer evaluate gpumode --impl ./kernel.py --reference ./reference.py --benchmark
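
To make concrete what a correctness check plus benchmark involves, here is a minimal pure-Python sketch. It is not Wafer's implementation and does not reflect the gpumode file format; the function names (`reference`, `impl`, `check_close`, `benchmark`) and the tolerances are hypothetical, chosen only for illustration.

```python
import math
import time


def reference(xs):
    """Straightforward baseline: sum of squares with a plain loop."""
    total = 0.0
    for x in xs:
        total += x * x
    return total


def impl(xs):
    """Candidate implementation under test (here, an exactly rounded sum)."""
    return math.fsum(x * x for x in xs)


def check_close(actual, expected, rel_tol=1e-6, abs_tol=1e-9):
    """Floating-point results rarely match bit for bit, so compare with tolerances."""
    return math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol)


def benchmark(fn, xs, warmup=2, iters=10):
    """Return the best wall-clock time in seconds over `iters` runs, after warmup."""
    for _ in range(warmup):
        fn(xs)
    best = float("inf")
    for _ in range(iters):
        start = time.perf_counter()
        fn(xs)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    data = [i * 0.001 for i in range(10_000)]
    ok = check_close(impl(data), reference(data))
    print("correct:", ok)
    if ok:  # only benchmark an implementation that passes the correctness check
        print(f"reference: {benchmark(reference, data):.6f}s")
        print(f"impl:      {benchmark(impl, data):.6f}s")
```

The ordering is the point of the sketch: compare against the reference within a tolerance first, and time the implementation only once it passes, since a fast but wrong kernel is not worth measuring.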

Profiling Tools

Analyze performance with integrated NVIDIA and AMD profiling:
  • NCU Profiler — Nsight Compute reports with structured metrics and recommendations
  • Nsight Systems — System-wide profiling for CPU-GPU interaction
  • TraceLens — Performance reports and trace comparison
  • Perfetto — Visual timeline analysis with SQL queries
  • ROCprofiler — AMD kernel and system profiling
  • ISA Analysis — AMD GPU assembly analysis

Cloud GPU Access

Run evaluations and profiling on cloud GPUs without managing infrastructure:
wafer workspaces create --gpu H100 --name dev
wafer workspaces exec dev "wafer evaluate gpumode --impl kernel.py"

VS Code Integration

Open .ncu-rep reports directly in VS Code and get a structured view of what matters:
  • Kernel duration, compute/memory throughput, occupancy, register pressure
  • A clean “what to look at next” summary instead of a wall of counters
  • View PTX/SASS assembly from your profiled kernels
  • Daily CUDA challenges to practice GPU programming

Requirements

Feature                Requirements
CLI                    Python 3.8+ with pip install wafer-cli
AI Agent               Authentication via wafer auth login
NCU Analysis (Server)  No installation required; upload reports to Wafer’s server
NCU Analysis (Local)   Nsight Compute installed with ncu CLI on PATH
AMD Profiling          ROCm installed with profiling tools
Perfetto Traces        No installation required; Perfetto runs in browser via WASM
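
The local prerequisites above can be checked with a short script. This is a sketch and not part of Wafer: `check_requirements` is a hypothetical helper, and treating `rocprof` as the indicator for "ROCm profiling tools" is an assumption, since the requirements do not name a specific binary.

```python
import shutil
import sys


def check_requirements(which=shutil.which, version=sys.version_info):
    """Return a dict mapping each local prerequisite to True or False.

    `which` and `version` are injectable so the check can be exercised
    without changing the real environment.
    """
    return {
        "python>=3.8": tuple(version[:2]) >= (3, 8),
        "wafer CLI on PATH": which("wafer") is not None,
        "ncu on PATH (local NCU analysis)": which("ncu") is not None,
        # Assumption: `rocprof` stands in for "ROCm profiling tools".
        "rocprof on PATH (AMD profiling)": which("rocprof") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"[{'ok' if ok else 'missing'}] {name}")
```

A missing `ncu` or `rocprof` only matters for the corresponding local feature; server-side NCU analysis and Perfetto traces need no local installation.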