Kernel Evaluation
The `wafer evaluate` command tests your GPU kernel implementations for correctness and can optionally benchmark their performance. It supports two kernel formats: GPUMode and KernelBench.
Quick Start
Formats
GPUMode Format
GPUMode kernels define `custom_kernel` and `ref_kernel` functions:
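As a rough sketch of the file layout: real GPUMode kernels typically operate on torch tensors on the GPU, but this pure-Python stand-in shows the two required function names and the contract between them (the exact signatures are an assumption here).

```python
# Illustrative GPUMode-format file (assumed signatures).
# Real kernels would operate on torch tensors; plain lists are used
# here only to show the structure the evaluator looks for.

def ref_kernel(data):
    """Reference implementation: the ground truth the harness compares against."""
    a, b = data
    return [x + y for x, y in zip(a, b)]

def custom_kernel(data):
    """Your optimized implementation: must match ref_kernel's output."""
    a, b = data
    return [x + y for x, y in zip(a, b)]
```

The evaluator runs both functions on the same inputs and compares the outputs.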
KernelBench Format
KernelBench uses a `ModelNew` class that replaces `Model`:
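As a rough sketch of the naming convention: in real KernelBench tasks, `Model` is a `torch.nn.Module` and `ModelNew` re-implements its forward pass with a custom kernel. This pure-Python stand-in only illustrates the drop-in-replacement relationship.

```python
# Illustrative KernelBench-format layout (structure assumed, torch omitted).

class Model:
    """Original reference model (normally a torch.nn.Module)."""
    def forward(self, x):
        return [v * 2 for v in x]

class ModelNew:
    """Drop-in replacement: same interface, optimized implementation."""
    def forward(self, x):
        # Must produce the same output as Model.forward
        return [v + v for v in x]
```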
Commands
wafer evaluate gpumode
| Option | Short | Description |
|---|---|---|
| `--impl` | `-i` | Path to implementation kernel file |
| `--reference` | | Path to reference kernel file |
| `--test-cases` | | Path to test cases JSON file |
| `--target` | `-t` | GPU target name |
| `--benchmark` | | Run performance benchmarks |
| `--profile` | | Enable profiling |
| `--defense/--no-defense` | | Run reward hack defense checks (default: enabled) |
| `--gpu-id` | | Override GPU ID |
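A typical invocation might look like the following; the file paths and target name are illustrative, not prescribed.

```shell
# Check correctness against the reference, then benchmark (paths are examples)
wafer evaluate gpumode \
  --impl my_kernel.py \
  --reference ref_kernel.py \
  --target my-gpu-target \
  --benchmark
```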
wafer evaluate kernelbench
Accepts the same options as `wafer evaluate gpumode`, plus format-specific options.
Example:
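The following uses only the options documented above; the file paths are illustrative.

```shell
# Evaluate a KernelBench-format kernel against its reference (paths are examples)
wafer evaluate kernelbench \
  --impl model_new.py \
  --reference model.py \
  --benchmark
```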
wafer evaluate make-template
Generate template files to get started:
Test Cases
Provide custom test cases via JSON:
Defense Checks
By default, evaluation includes defense checks to detect potential reward hacking:
- Verifies implementation doesn't just copy the reference
- Checks for meaningful computation
- Validates output shapes and dtypes
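wafer's actual defense checks are internal to the tool, but the first and last checks above can be sketched roughly as follows (hypothetical helpers, with plain-Python stand-ins for tensors):

```python
import hashlib

def looks_like_copy(impl_src: str, ref_src: str) -> bool:
    """Flag implementations that are byte-identical to the reference source."""
    digest = lambda s: hashlib.sha256(s.encode()).hexdigest()
    return digest(impl_src) == digest(ref_src)

def validate_output(impl_out, ref_out) -> bool:
    """Outputs must agree in shape (here: length) and element type."""
    return (len(impl_out) == len(ref_out)
            and all(type(a) is type(b) for a, b in zip(impl_out, ref_out)))
```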
Use `--no-defense` to disable them if needed for debugging.
Remote Evaluation
Run on remote GPUs using targets:
Output
Successful evaluation shows:
Next Steps
Baseline Discovery
See what kernels PyTorch dispatches to.
Roofline Analysis
Analyze performance against hardware limits.
AI Agent
Get AI help optimizing your kernels.
Profiling
Profile your kernels with NCU and nsys.