ROCprofiler Compute
View and analyze rocprof-compute profiling results in VS Code or Cursor. See hardware metrics, roofline analysis, and kernel performance data without leaving your editor.
What is ROCprofiler Compute?
AMD’s rocprof-compute (formerly Omniperf) profiles HIP kernels on AMD GPUs. It collects:
- Hardware counters — Per-block metrics (SQ, TCC, TA, TD, TCP, etc.)
- Roofline data — Arithmetic intensity vs. achieved performance
- Kernel timing — Execution duration and dispatch counts
- System info — GPU architecture, compute units, memory, clocks
Features
Architecture Diagram
Visual overview of GPU hardware blocks with real metrics from your profiling run.
Roofline Analysis
Plot kernels against L1, L2, HBM, and compute ceilings to identify bottlenecks.
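As a rough illustration of what a roofline plot encodes, the sketch below computes a kernel's arithmetic intensity and the performance ceiling a single memory level allows. It is not the tool's implementation: the function names and all numbers are hypothetical, and real ceilings come from your profiling run.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Roofline ceiling: the lower of the compute peak and AI * bandwidth."""
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


def classify(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Label a kernel by which side of the ridge point it falls on."""
    ridge_point = peak_gflops / bandwidth_gbs  # FLOP/byte where the two ceilings meet
    return "memory-bound" if arithmetic_intensity < ridge_point else "compute-bound"


# Hypothetical kernel and hardware numbers, for illustration only.
flops = 2.0e9          # floating-point operations executed
bytes_moved = 4.0e9    # bytes transferred to/from HBM
ai = flops / bytes_moved  # arithmetic intensity in FLOP/byte

peak = 1000.0    # assumed compute ceiling, GFLOP/s
hbm_bw = 100.0   # assumed HBM bandwidth ceiling, GB/s

print(ai, attainable_gflops(ai, peak, hbm_bw), classify(ai, peak, hbm_bw))
```

The same calculation applies to the L1 and L2 ceilings by substituting the bandwidth of that cache level. A kernel that sits well below its ceiling has headroom; one near the sloped part of the roof is limited by memory traffic.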
Kernel Statistics
Top kernels by duration, dispatch lists, and per-kernel breakdowns.
System Info
Full hardware details: GPU model, CUs, cache sizes, clock speeds, ROCm version.
Local Viewer
The GUI runs locally, so you can view results profiled on remote AMD machines from any platform.
Filtering
Filter by kernel name, dispatch ID, GCD, or normalization mode.
Requirements
| Requirement | Details |
|---|---|
| GUI Viewer | No installation required—works on Mac, Windows, Linux |
| Profiling | ROCm 7.0+ with rocprof-compute 3.2+ |
| Supported GPUs | gfx908, gfx90a, gfx940, gfx941, gfx942, gfx950 |
The GUI only reads CSV files from your profiling run. Profile on a remote AMD machine, copy the workload folder, and view results locally.
Quick Start
- Select ROCprofiler Compute from the Wafer tools menu
- Browse to your workload folder (must contain sysinfo.csv)
- Click Launch GUI
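Before launching the GUI, it can help to confirm that the folder you copied over actually contains sysinfo.csv. The following is a minimal sketch of such a check; `check_workload_folder` and the example path are hypothetical and not part of rocprof-compute or Wafer.

```python
from pathlib import Path


def check_workload_folder(folder):
    """Return (ok, csv_names); ok is True only when sysinfo.csv is present."""
    path = Path(folder)
    if not path.is_dir():
        return False, []
    # Only top-level CSV files are listed; subfolders are not searched.
    csv_names = sorted(p.name for p in path.glob("*.csv"))
    return "sysinfo.csv" in csv_names, csv_names


# Hypothetical path: replace with the workload folder you copied locally.
ok, csv_names = check_workload_folder("workloads/my_app")
if ok:
    print("Ready to open:", ", ".join(csv_names))
else:
    print("sysinfo.csv not found; check that you selected the workload folder itself.")
```

If the check fails, the likely cause is selecting a parent directory rather than the folder that directly holds the CSV files.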
Analyzing Results
Learn how to use the analysis GUI →