Welcome to Wafer
Wafer is a VS Code / Cursor extension that brings GPU kernel development into your editor. Profile with Nsight Compute, inspect PTX/SASS assembly, jump to the right docs, and iterate faster, all without leaving your IDE.
NCU Profiler
Analyze Nsight Compute reports directly in VS Code. View kernel metrics, bottlenecks, and optimization recommendations.
Compiler Explorer
Compile CUDA files and inspect PTX/SASS assembly output. Understand what your code compiles to.
GPU Docs
Chat with CUDA, PTX, CuTe DSL, and CUTLASS documentation. Get answers with citations.
Quickstart
Get up and running in under 5 minutes.
Why Wafer?
GPU performance workflows are fragmented:
- You profile in one tool, read counters you’re not sure how to prioritize
- You inspect PTX/SASS somewhere else, with little context on what matters
- You bounce between docs, blog posts, and guesses
- If you’re developing remotely, you waste time (and money) keeping a GPU attached while you’re just editing code
What You Get
Nsight Compute Report Analysis
Open .ncu-rep reports directly in VS Code and get a structured view of what matters:
- Kernel duration, compute/memory throughput, occupancy, register pressure signals
- A clean “what to look at next” summary instead of a wall of counters
- Exportable text reports for issues, PRs, or other tools
PTX / SASS Viewer
See what your kernel compiled into without leaving your editor:
- Jump from kernel code to generated PTX/SASS
- Spot common issues: memory access patterns, control flow, register pressure hints
- Keep low-level output tied to the source that produced it
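For comparison, roughly the same PTX and SASS can be produced by hand with the CUDA Toolkit. The sketch below only builds the command lines and does not run them. The file name kernel.cu, the sm_80 architecture, and the helper name are assumptions for illustration, and this is not a description of how Wafer invokes the compiler internally.

```python
from pathlib import Path


def ptx_sass_commands(source, arch="sm_80"):
    """Build nvcc/nvdisasm command lines for one CUDA source file.

    `source` and `arch` are illustrative; use the architecture of your GPU.
    """
    src = Path(source)
    ptx = src.with_suffix(".ptx")
    cubin = src.with_suffix(".cubin")
    return {
        # -ptx stops after PTX generation; -lineinfo keeps source line info.
        "ptx": ["nvcc", f"-arch={arch}", "-lineinfo", "-ptx", str(src), "-o", str(ptx)],
        # -cubin produces a device binary for the chosen architecture.
        "cubin": ["nvcc", f"-arch={arch}", "-lineinfo", "-cubin", str(src), "-o", str(cubin)],
        # nvdisasm prints the SASS contained in that cubin.
        "sass": ["nvdisasm", str(cubin)],
    }


if __name__ == "__main__":
    for step, cmd in ptx_sass_commands("kernel.cu").items():
        print(f"{step}: {' '.join(cmd)}")
```

Running the printed commands in order (ptx, cubin, sass) leaves kernel.ptx and kernel.cubin next to the source and writes the SASS listing to standard output.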
GPU Docs Agent
A docs assistant for when you’re stuck or unsure what a metric or instruction means:
- CUTLASS and CuTe DSL concepts, layouts, and tensor core paths
- PTX ISA navigation including modern MMA paths
- Multi-turn Q&A with citations so you can verify claims
Requirements
| Feature | Requirements |
|---|---|
| NCU Analysis | Nsight Compute installed with ncu CLI on PATH |
| Compiler Explorer | CUDA Toolkit with nvcc on PATH |
| SASS Output | nvdisasm on PATH (included with CUDA Toolkit) |
| GPU Docs | Works out of the box—no local requirements |
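One quick way to check these prerequisites is to look up each tool on PATH. This is a minimal sketch using only the Python standard library. The tool names come from the table above; the helper name find_tools is made up for illustration and is not part of Wafer.

```python
import shutil

# Command-line tools named in the requirements table, keyed by feature.
REQUIRED_TOOLS = {
    "NCU Analysis": "ncu",
    "Compiler Explorer": "nvcc",
    "SASS Output": "nvdisasm",
}


def find_tools(tools=REQUIRED_TOOLS):
    """Map each feature to the resolved tool path, or None if it is not on PATH."""
    return {feature: shutil.which(exe) for feature, exe in tools.items()}


if __name__ == "__main__":
    for feature, path in find_tools().items():
        status = path if path else "not found on PATH"
        print(f"{feature}: {status}")
```

If a tool is reported as missing, the usual cause is that the CUDA Toolkit or Nsight Compute bin directory has not been added to PATH for the shell that launches the editor.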