GPU Specs

GPU specs define hardware characteristics used for roofline analysis, performance modeling, and comparison. Wafer includes built-in specs for common GPUs and supports custom configurations.

Quick Start

# List available specs
wafer config specs list

# Show spec details
wafer config specs show H100

# Set default spec
wafer config specs default H100

Commands

wafer config specs list

List all available GPU specifications:

wafer config specs list

Output:

GPU Specifications:

NVIDIA:
  Name       Memory    BW (TB/s)   FP16 (TFLOPS)   TDP
  H100       80GB      3.35        989.4           700W
  H200       141GB     4.80        989.4           700W
  A100       80GB      2.04        312.0           400W
  B200       192GB     8.00        2250.0          1000W
  RTX4090    24GB      1.01        165.2           450W

AMD:
  Name       Memory    BW (TB/s)   FP16 (TFLOPS)   TDP
  MI300X     192GB     5.30        1307.4          750W
  MI250X     128GB     3.28        383.0           560W
  MI210      64GB      1.64        181.0           300W

* = default

wafer config specs show

Show detailed specifications for a GPU:

wafer config specs show <gpu-name>

Example:

wafer config specs show H100

Output:

NVIDIA H100 SXM5 (80GB)

Memory:
  Capacity: 80 GB HBM3
  Bandwidth: 3.35 TB/s
  Bus Width: 5120-bit

Compute:
  FP64: 33.5 TFLOPS
  FP32: 66.9 TFLOPS
  FP16: 989.4 TFLOPS (Tensor Core)
  BF16: 989.4 TFLOPS (Tensor Core)
  INT8: 1978.9 TOPS (Tensor Core)
  FP8: 1978.9 TFLOPS (Tensor Core)

Architecture:
  SMs: 132
  CUDA Cores: 16896
  Tensor Cores: 528 (4th gen)
  L2 Cache: 50 MB
  Registers/SM: 65536

Roofline:
  Ridge Point (FP16): 295.3 FLOP/byte
  Ridge Point (FP32): 20.0 FLOP/byte

Power:
  TDP: 700W
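
The ridge points in the Roofline section follow from the peak compute and bandwidth figures in the same output. A minimal Python sketch of that arithmetic, assuming the ridge point is simply peak throughput divided by memory bandwidth (the function name `ridge_point` is illustrative and not part of Wafer):

```python
def ridge_point(peak_tflops: float, bandwidth_tb_s: float) -> float:
    """Arithmetic intensity (FLOP/byte) where the compute and memory roofs meet."""
    if bandwidth_tb_s <= 0:
        raise ValueError("bandwidth must be positive")
    # TFLOP/s divided by TB/s: the tera prefixes cancel, leaving FLOP/byte.
    return peak_tflops / bandwidth_tb_s


# Values copied from the H100 output above.
H100_BANDWIDTH_TB_S = 3.35
print(f"FP16: {ridge_point(989.4, H100_BANDWIDTH_TB_S):.1f} FLOP/byte")  # 295.3
print(f"FP32: {ridge_point(66.9, H100_BANDWIDTH_TB_S):.1f} FLOP/byte")   # 20.0
```

Kernels whose arithmetic intensity falls below the ridge point are limited by memory bandwidth; kernels above it are limited by peak compute.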

wafer config specs add

Add a custom GPU specification:

wafer config specs add <name> [OPTIONS]

Options:

Option	Description
`--memory`	Memory capacity (e.g., “80GB”)
`--bandwidth`	Memory bandwidth (e.g., “3.35TB/s”)
`--fp16`	FP16 peak TFLOPS
`--fp32`	FP32 peak TFLOPS
`--tdp`	Thermal design power (watts)
`--from-file`	Load the spec from a YAML file (see Custom Spec File)

Example:

wafer config specs add my-custom-gpu \
  --memory 48GB \
  --bandwidth 2.0TB/s \
  --fp16 400 \
  --fp32 200 \
  --tdp 350

wafer config specs remove

Remove a custom specification:

wafer config specs remove <name>

Only custom specs can be removed; built-in GPU specs cannot.

wafer config specs default

Set the default GPU for analysis:

wafer config specs default <name>

The default spec is used whenever a command is run without the --gpu flag:

# Set default
wafer config specs default H100

# Now roofline uses H100 automatically
wafer roofline --bytes 1e9 --flops 1e12 --time-ms 0.5

Using Specs

Roofline Analysis

# Use specific GPU
wafer roofline --gpu H100 --bytes 1e9 --flops 1e12 --time-ms 0.5

# Use default GPU
wafer roofline --bytes 1e9 --flops 1e12 --time-ms 0.5
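
To show how a GPU spec enters the calculation, here is a minimal Python sketch of a two-roof roofline model. It is not Wafer's implementation; the function `roofline`, the `RooflineResult` type, and the workload numbers are all hypothetical, and only the H100 FP16 peak and bandwidth are taken from the spec shown earlier.

```python
from dataclasses import dataclass


@dataclass
class RooflineResult:
    intensity: float        # arithmetic intensity, FLOP/byte
    achieved_tflops: float  # measured throughput
    ceiling_tflops: float   # roofline limit at this intensity
    bound: str              # "memory" or "compute"
    efficiency: float       # achieved / ceiling


def roofline(flops, bytes_moved, time_ms, peak_tflops, bandwidth_tb_s):
    """Place one measured kernel on a two-roof roofline."""
    if min(flops, bytes_moved, time_ms, peak_tflops, bandwidth_tb_s) <= 0:
        raise ValueError("all inputs must be positive")
    intensity = flops / bytes_moved
    achieved_tflops = flops / (time_ms * 1e-3) / 1e12
    # FLOP/byte * TB/s = TFLOP/s: the sloped, bandwidth-limited roof.
    memory_roof = intensity * bandwidth_tb_s
    ceiling = min(peak_tflops, memory_roof)
    bound = "memory" if memory_roof < peak_tflops else "compute"
    return RooflineResult(intensity, achieved_tflops, ceiling, bound,
                          achieved_tflops / ceiling)


# Hypothetical kernel measured on an H100 (FP16 peak and bandwidth from the spec).
result = roofline(flops=4e12, bytes_moved=2e9, time_ms=10.0,
                  peak_tflops=989.4, bandwidth_tb_s=3.35)
print(f"{result.intensity:.0f} FLOP/byte, {result.bound}-bound, "
      f"{result.achieved_tflops:.0f} of {result.ceiling_tflops:.1f} TFLOPS "
      f"({result.efficiency:.0%})")
```

The hypothetical kernel has an intensity of 2000 FLOP/byte, well above the H100 FP16 ridge point of 295.3 FLOP/byte, so the sketch classifies it as compute-bound.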

Baseline Discovery

wafer baseline run "torch.matmul(A, B)" \
  --shape A=1024,1024 \
  --shape B=1024,1024 \
  --hardware H100

Performance Comparison

Compare the same workload across GPUs:

# Analyze for H100
wafer roofline --gpu H100 --bytes 2e9 --flops 4e12 --time-ms 1.0

# Analyze for A100
wafer roofline --gpu A100 --bytes 2e9 --flops 4e12 --time-ms 1.5
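
The comparison can also be reasoned about before any measurement. The Python sketch below computes the FP16 roofline ceiling of several GPUs at a given arithmetic intensity, using the FP16 and bandwidth columns from the `wafer config specs list` output. The names `SPECS`, `ceiling_tflops`, and `compare` are illustrative, not Wafer APIs, and the two-roof model is a simplifying assumption.

```python
# (FP16 peak in TFLOPS, memory bandwidth in TB/s), copied from the list output.
SPECS = {
    "H100": (989.4, 3.35),
    "A100": (312.0, 2.04),
    "MI300X": (1307.4, 5.30),
}


def ceiling_tflops(intensity, peak_tflops, bandwidth_tb_s):
    """Best attainable TFLOPS at a given arithmetic intensity (FLOP/byte)."""
    return min(peak_tflops, intensity * bandwidth_tb_s)


def compare(intensity, specs=SPECS):
    """Roofline ceiling per GPU for one workload intensity."""
    return {name: ceiling_tflops(intensity, peak, bw)
            for name, (peak, bw) in specs.items()}


# 100 FLOP/byte sits below every ridge point here (memory-bound);
# 2000 FLOP/byte sits above all of them (compute-bound).
for intensity in (100.0, 2000.0):
    row = ", ".join(f"{gpu}: {tflops:.1f}"
                    for gpu, tflops in compare(intensity).items())
    print(f"{intensity:.0f} FLOP/byte -> {row}")
```

At low intensity the ranking follows memory bandwidth, and at high intensity it follows peak compute, so the fastest GPU for a kernel depends on where the kernel sits relative to each ridge point.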

Built-in Specifications

NVIDIA GPUs

GPU	Generation	Memory	Peak FP16
B200	Blackwell	192GB HBM3e	2250 TFLOPS
H200	Hopper	141GB HBM3e	989 TFLOPS
H100	Hopper	80GB HBM3	989 TFLOPS
A100	Ampere	80GB HBM2e	312 TFLOPS
RTX 4090	Ada	24GB GDDR6X	165 TFLOPS

AMD GPUs

GPU	Generation	Memory	Peak FP16
MI300X	CDNA 3	192GB HBM3	1307 TFLOPS
MI250X	CDNA 2	128GB HBM2e	383 TFLOPS
MI210	CDNA 2	64GB HBM2e	181 TFLOPS

Custom Spec File

Define a custom spec in a YAML file, then register it with --from-file:

# my-gpu.yaml
name: Custom GPU
vendor: NVIDIA
memory:
  capacity_gb: 80
  bandwidth_tb_s: 3.0
compute:
  fp16_tflops: 500
  fp32_tflops: 250
  fp64_tflops: 125
power:
  tdp_watts: 400

wafer config specs add --from-file my-gpu.yaml
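
Before registering a spec file, it can help to sanity-check its values. The following Python sketch validates a dictionary laid out like the YAML above. It treats every field in the example as required and every numeric field as strictly positive, which is an assumption: the checks Wafer itself performs are not documented here, and `validate_spec` is a hypothetical helper. The spec is written as a dict literal to keep the example dependency-free; in practice it would be loaded with a YAML parser.

```python
# Sections and numeric fields, mirroring the example YAML layout.
REQUIRED = {
    "memory": ("capacity_gb", "bandwidth_tb_s"),
    "compute": ("fp16_tflops", "fp32_tflops", "fp64_tflops"),
    "power": ("tdp_watts",),
}


def validate_spec(spec: dict) -> list:
    """Return a list of problems; an empty list means the spec looks sane."""
    errors = []
    for key in ("name", "vendor"):
        if not isinstance(spec.get(key), str) or not spec[key].strip():
            errors.append(f"{key}: expected a non-empty string")
    for section, fields in REQUIRED.items():
        block = spec.get(section)
        if not isinstance(block, dict):
            errors.append(f"{section}: missing section")
            continue
        for field in fields:
            value = block.get(field)
            # bool is a subclass of int, so reject it explicitly.
            if (isinstance(value, bool)
                    or not isinstance(value, (int, float))
                    or value <= 0):
                errors.append(f"{section}.{field}: expected a positive number")
    return errors


spec = {
    "name": "Custom GPU",
    "vendor": "NVIDIA",
    "memory": {"capacity_gb": 80, "bandwidth_tb_s": 3.0},
    "compute": {"fp16_tflops": 500, "fp32_tflops": 250, "fp64_tflops": 125},
    "power": {"tdp_watts": 400},
}
print(validate_spec(spec) or "spec looks valid")
```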

Next Steps

Roofline Analysis

Use specs for roofline analysis.

Baseline Discovery

Use specs with baseline discovery.

Targets

Configure GPU targets.

Workspaces

Access cloud GPUs.
