Welcome to Wafer
Wafer is a complete toolkit for GPU kernel development: a VS Code/Cursor extension, a powerful CLI, and an AI assistant that understands GPU programming. Profile kernels, analyze traces, evaluate correctness, and optimize performance without leaving your workflow.
Quickstart
Get up and running in under 5 minutes.
CLI Overview
Install and use the Wafer command-line interface.
AI Agent
Get AI help with kernel optimization and GPU questions.
Kernel Development
Evaluate, profile, and optimize your kernels.
Why Wafer?
GPU performance workflows are fragmented:
- You profile in one tool, then read counters you’re not sure how to prioritize
- You inspect PTX/SASS somewhere else, with little context on what matters
- You bounce between docs, blog posts, and guesses
- Testing kernel correctness requires manual setup
- If you’re developing remotely, you waste time (and money) keeping a GPU attached while you’re just editing code
What You Get
AI-Powered Assistance
Ask questions about GPU programming, analyze traces, and get optimization suggestions.
Kernel Evaluation
Test your kernels for correctness and benchmark performance.
Profiling Tools
Analyze performance with integrated NVIDIA and AMD profiling:
- NCU Profiler: Nsight Compute reports with structured metrics and recommendations
- Nsight Systems: System-wide profiling for CPU-GPU interaction
- TraceLens: Performance reports and trace comparison
- Perfetto: Visual timeline analysis with SQL queries
- ROCprofiler: AMD kernel and system profiling
- ISA Analysis: AMD GPU assembly analysis
Cloud GPU Access
Run evaluations and profiling on cloud GPUs without managing infrastructure.
VS Code Integration
Open .ncu-rep reports directly in VS Code and get a structured view of what matters:
- Kernel duration, compute/memory throughput, occupancy, register pressure
- A clean “what to look at next” summary instead of a wall of counters
- View PTX/SASS assembly from your profiled kernels
- Daily CUDA challenges to practice GPU programming
Explore the Docs
NVIDIA Profiling
NCU, Nsight Systems, TraceLens, and Perfetto.
AMD Profiling
ISA analysis, ROCprofiler SDK, and ROCprofiler Systems.
Infrastructure
Workspaces, targets, and remote GPU access.
Agent Templates
Pre-configured AI workflows for common tasks.
Requirements
| Feature | Requirements |
|---|---|
| CLI | Python 3.8+ with `pip install wafer-cli` |
| AI Agent | Authentication via `wafer auth login` |
| NCU Analysis (Server) | No installation required; upload reports to Wafer’s server |
| NCU Analysis (Local) | Nsight Compute installed with the `ncu` CLI on PATH |
| AMD Profiling | ROCm installed with profiling tools |
| Perfetto Traces | No installation required; Perfetto runs in the browser via WASM |
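
The local requirements in the table can be checked with a short preflight script. The following is a minimal sketch, not part of Wafer itself: it uses only the Python standard library, and the helper names (`check_python`, `find_tools`, `report`) are hypothetical. It assumes the CLI installs an executable named `wafer` (inferred from `wafer auth login`) and that Nsight Compute exposes `ncu`, as the table states.

```python
import shutil
import sys

# Minimum interpreter version, taken from the requirements table.
MIN_PYTHON = (3, 8)


def check_python(version_info=None):
    """Return True if the given (or current) interpreter is at least MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def find_tools(names, which=shutil.which):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


def report(names=("wafer", "ncu")):
    """Build one human-readable status line per requirement.

    The executable names are assumptions: `wafer` is inferred from
    `wafer auth login`, and `ncu` is the Nsight Compute CLI named in the table.
    """
    lines = [f"python >= 3.8: {'ok' if check_python() else 'too old'}"]
    for name, path in find_tools(names).items():
        lines.append(f"{name}: {path or 'not found on PATH'}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Running the script prints one line per requirement. A missing `ncu` only matters for local NCU analysis, since the server-side option requires no installation.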