ROCprofiler Compute

View and analyze rocprof-compute profiling results in VS Code or Cursor. See hardware metrics, roofline analysis, and kernel performance data without leaving your editor.

What is ROCprofiler Compute?

AMD’s rocprof-compute (formerly Omniperf) profiles HIP kernels on AMD GPUs. It collects:
  • Hardware counters — Per-block metrics (SQ, TCC, TA, TD, TCP, etc.)
  • Roofline data — Arithmetic intensity vs. achieved performance
  • Kernel timing — Execution duration and dispatch counts
  • System info — GPU architecture, compute units, memory, clocks
Wafer’s ROCprofiler Compute tool provides an interactive GUI for exploring this data.

Features

Architecture Diagram

Visual overview of GPU hardware blocks with real metrics from your profiling run.

Roofline Analysis

Plot kernels against L1, L2, HBM, and compute ceilings to identify bottlenecks.
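
As a rough illustration of the roofline model behind this view, the sketch below computes a kernel's arithmetic intensity, its achieved throughput, and the ceiling it runs against for a single memory level. The function names, the input values, and the peak and bandwidth figures are all hypothetical. They are not taken from rocprof-compute output or from any real GPU specification.

```python
def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline ceiling: the lower of the compute peak and intensity * bandwidth."""
    return min(peak_gflops, intensity * bandwidth_gbs)


def roofline_point(flops, bytes_moved, duration_s, peak_gflops, bandwidth_gbs):
    """Place one kernel on a single-ceiling roofline (all inputs are assumed values)."""
    intensity = flops / bytes_moved           # FLOP per byte
    achieved = flops / duration_s / 1e9       # GFLOP/s actually delivered
    ceiling = attainable_gflops(intensity, peak_gflops, bandwidth_gbs)
    ridge = peak_gflops / bandwidth_gbs       # intensity where the two ceilings meet
    return {
        "intensity": intensity,
        "achieved_gflops": achieved,
        "ceiling_gflops": ceiling,
        "bound": "memory" if intensity < ridge else "compute",
    }


# Hypothetical kernel: 2 GFLOP of work, 1 GB of traffic, 1 ms runtime,
# on a made-up device with a 10 TFLOP/s peak and 1 TB/s of bandwidth.
point = roofline_point(2e9, 1e9, 1e-3, peak_gflops=10_000, bandwidth_gbs=1_000)
print(point["bound"], point["intensity"])
```

A kernel whose intensity falls left of the ridge point is limited by the memory ceiling. Repeating the calculation with the L1, L2, and HBM bandwidths in turn shows which level of the hierarchy is the tightest constraint.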

Kernel Statistics

Top kernels by duration, dispatch lists, and per-kernel breakdowns.
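
To make "top kernels by duration" concrete, here is a minimal sketch that groups per-dispatch records by kernel name and ranks the kernels by total time. The record layout (a name and a duration in nanoseconds) and the sample values are assumptions made for illustration. They do not reflect the actual CSV schema that rocprof-compute writes.

```python
from collections import defaultdict


def top_kernels(dispatches, n=3):
    """Rank kernels by total duration.

    `dispatches` is an iterable of (kernel_name, duration_ns) pairs,
    a hypothetical layout chosen for this example.
    """
    totals = defaultdict(lambda: {"total_ns": 0, "count": 0})
    for name, duration_ns in dispatches:
        totals[name]["total_ns"] += duration_ns
        totals[name]["count"] += 1
    ranked = sorted(totals.items(), key=lambda kv: kv[1]["total_ns"], reverse=True)
    return [(name, stats["total_ns"], stats["count"]) for name, stats in ranked[:n]]


# Made-up dispatch list: two gemm launches, one reduce, one copy.
sample = [("gemm", 500), ("reduce", 100), ("gemm", 300), ("copy", 200)]
for name, total_ns, count in top_kernels(sample, n=2):
    print(f"{name}: {total_ns} ns over {count} dispatches")
```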

System Info

Full hardware details: GPU model, CUs, cache sizes, clock speeds, ROCm version.

Local Viewer

The GUI runs locally, so you can view results profiled on remote AMD machines from any platform.

Filtering

Filter by kernel name, dispatch ID, GCD, or normalization mode.

Requirements

Requirement | Details
GUI Viewer | No installation required; works on Mac, Windows, and Linux
Profiling | ROCm 7.0+ with rocprof-compute 3.2+
Supported GPUs | gfx908, gfx90a, gfx940, gfx941, gfx942, gfx950

The GUI only reads CSV files from your profiling run. Profile on a remote AMD machine, copy the workload folder, and view results locally.

Quick Start

  1. Select ROCprofiler Compute from the Wafer tools menu
  2. Browse to your workload folder (must contain sysinfo.csv)
  3. Click Launch GUI
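
Before browsing to a folder in step 2, it can help to confirm that the copy from the remote machine is complete. The sketch below checks for sysinfo.csv and lists the other CSV files at the top level of the folder. The helper name and the example path are hypothetical, and the only requirement taken from this page is the presence of sysinfo.csv.

```python
from pathlib import Path


def check_workload_folder(folder):
    """Return (has_sysinfo, csv_names) for the top level of a workload folder."""
    root = Path(folder)
    if not root.is_dir():
        return False, []
    csv_names = sorted(p.name for p in root.glob("*.csv"))
    return "sysinfo.csv" in csv_names, csv_names


if __name__ == "__main__":
    # Hypothetical path; substitute the folder copied from the profiling machine.
    ok, found = check_workload_folder("workloads/my_app")
    print("ready to open" if ok else "sysinfo.csv not found", found)
```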

Analyzing Results

Learn how to use the analysis GUI →