Nsight Systems
NVIDIA Nsight Systems (nsys) provides system-wide profiling to understand CPU-GPU interaction, kernel launches, memory transfers, and overall application behavior. Wafer integrates nsys for easy profiling and analysis.
Quick Start
Commands
wafer nvidia nsys check
Verify the nsys installation and version.
wafer nvidia nsys profile
Capture a system-wide profile.
| Option | Description |
|---|---|
| `--output`, `-o` | Output file path (default: profile.nsys-rep) |
| `--trace` | What to trace: cuda, nvtx, osrt, cudnn, etc. |
| `--duration` | Maximum capture duration in seconds |
| `--delay` | Delay before starting capture |
| `--sample` | Enable CPU sampling |
| `--target`, `-t` | Run on remote GPU target |
wafer nvidia nsys analyze
Analyze an nsys trace file.
| Option | Description |
|---|---|
| `--summary` | Show summary statistics |
| `--kernels` | List kernel execution times |
| `--transfers` | Show memory transfer analysis |
| `--json` | Output as JSON |
What Nsys Captures
Nsight Systems traces multiple aspects of your application:
- CUDA API calls: cudaMemcpy, cudaMalloc, kernel launches
- GPU kernels: Execution time, grid/block dimensions
- Memory transfers: H2D, D2H, D2D copies
- CPU activity: Thread scheduling, function calls
- NVTX markers: Custom annotations in your code
- cuDNN/cuBLAS: Library call timings
Adding NVTX Markers
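As a rough illustration, the sketch below wraps phases of a Python program in NVTX ranges. It assumes your code is Python and uses NVIDIA's `nvtx` package (`nvtx.annotate`). The function names `load_batch`, `process`, and `run_step` are hypothetical and stand in for your own code. A no-op fallback keeps the script runnable on machines where `nvtx` is not installed.

```python
import contextlib

try:
    import nvtx  # NVIDIA's NVTX bindings for Python (assumed installed when profiling)
    annotate = nvtx.annotate
except ImportError:
    # Fallback: a no-op context manager so the code still runs without nvtx.
    @contextlib.contextmanager
    def annotate(message=None, **kwargs):
        yield


def load_batch(n):
    # Hypothetical data-loading phase, shown as a "load_batch" range in the trace.
    with annotate("load_batch"):
        return list(range(n))


def process(batch):
    # Hypothetical compute phase, shown as a "process" range in the trace.
    with annotate("process"):
        return sum(x * x for x in batch)


def run_step(n):
    # Outer range; the two ranges above nest inside it.
    with annotate("step"):
        return process(load_batch(n))


if __name__ == "__main__":
    print(run_step(4))
```

Ranges are only recorded when NVTX tracing is enabled for the capture, for example by including nvtx in the `--trace` option.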
Annotate your code with NVTX ranges so that application phases appear as named spans in the trace.
Comparing with NCU
| Aspect | Nsys | NCU |
|---|---|---|
| Scope | System-wide | Single kernel |
| Overhead | Low | High |
| Detail | Timeline, API calls | Hardware counters |
| Use case | Find bottlenecks | Optimize kernels |
Remote Profiling
Profile on a remote GPU target by passing `--target` (`-t`) to wafer nvidia nsys profile.
Troubleshooting
nsys not found
Install NVIDIA Nsight Systems:
- Download from NVIDIA Developer
- Or install with CUDA Toolkit
- Ensure `nsys` is on your PATH
Permission denied
On Linux, you may need elevated privileges to profile. Alternatively, lower the kernel's perf_event_paranoid level so that unprivileged users can collect samples.
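Before changing any system settings, it can help to check the current paranoid level. The following is a minimal sketch that reads the standard Linux file `/proc/sys/kernel/perf_event_paranoid`. The helper names are hypothetical, and the default threshold of 2 is an assumption based on commonly cited Nsight Systems guidance, so confirm the required value for your nsys version.

```python
from pathlib import Path

PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def parse_paranoid_level(text):
    """Parse the contents of perf_event_paranoid; return None if not an integer."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_paranoid_level(path=PARANOID_PATH):
    """Return the current paranoid level, or None if the file cannot be read."""
    try:
        return parse_paranoid_level(Path(path).read_text())
    except OSError:
        return None


def sampling_likely_allowed(level, max_level=2):
    # ASSUMPTION: max_level=2 is a placeholder threshold, not taken from this page.
    return level is not None and level <= max_level


if __name__ == "__main__":
    level = read_paranoid_level()
    if level is None:
        print("Could not read perf_event_paranoid (not Linux, or access restricted).")
    elif sampling_likely_allowed(level):
        print(f"perf_event_paranoid={level}: CPU sampling is likely permitted.")
    else:
        print(f"perf_event_paranoid={level}: consider lowering it or running with elevated privileges.")
```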
Trace file too large
Limit the capture duration with `--duration`, or narrow the scope by tracing fewer APIs with `--trace`.
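One way to apply such limits from a script is to assemble the command line programmatically. The sketch below uses only the options listed for wafer nvidia nsys profile. Two details are assumptions not confirmed by this page: that `--trace` takes a comma-separated list, and that the program to profile is passed as trailing arguments. The helper name `build_profile_command` and the example target `python train.py` are hypothetical. Check the Wafer CLI help output before relying on this.

```python
import shlex


def build_profile_command(target_cmd, duration_s=None, trace=None, output=None):
    """Assemble an argv list for `wafer nvidia nsys profile` with capture limits."""
    argv = ["wafer", "nvidia", "nsys", "profile"]
    if duration_s is not None:
        # Cap the capture length (seconds) to keep the trace small.
        argv += ["--duration", str(duration_s)]
    if trace:
        # ASSUMPTION: multiple trace sources are joined with commas.
        argv += ["--trace", ",".join(trace)]
    if output:
        argv += ["--output", output]
    # ASSUMPTION: the target program follows the options as positional arguments.
    argv += list(target_cmd)
    return argv


if __name__ == "__main__":
    cmd = build_profile_command(
        ["python", "train.py"],  # hypothetical program to profile
        duration_s=30,
        trace=["cuda", "nvtx"],
    )
    # Print the command for review instead of executing it.
    print(shlex.join(cmd))
```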