Creating Traces for Perfetto

Perfetto can analyze traces from many sources. Here’s how to create traces that work with Wafer’s Perfetto viewer.

Supported Formats

Wafer’s Perfetto viewer supports:

| Format | Extension | Description |
| --- | --- | --- |
| Chrome JSON | `.json` | Standard Chrome tracing format |
| Gzip Compressed | `.json.gz`, `.gz` | Compressed Chrome JSON traces |
| Perfetto Native | `.perfetto-trace`, `.pftrace` | Native Perfetto format |

Chrome JSON format is the most common and widely supported. Most profiling tools can export to this format.
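If you generate traces from scripts, a quick extension check can catch unsupported files before upload. The sketch below is only an illustration built from the table above; `detect_trace_format` and `SUPPORTED_SUFFIXES` are hypothetical names, not part of Wafer or Perfetto.

```python
from pathlib import Path

# Extensions copied from the supported-formats table; names are illustrative.
SUPPORTED_SUFFIXES = {
    ".json": "Chrome JSON",
    ".gz": "Gzip Compressed",  # also matches ".json.gz"
    ".perfetto-trace": "Perfetto Native",
    ".pftrace": "Perfetto Native",
}


def detect_trace_format(path):
    """Return the format name for a supported trace file, or None."""
    name = Path(path).name.lower()
    for suffix, format_name in SUPPORTED_SUFFIXES.items():
        if name.endswith(suffix):
            return format_name
    return None


for candidate in ["trace.json", "trace.json.gz", "run.pftrace", "notes.txt"]:
    print(candidate, "->", detect_trace_format(candidate))
```

The check looks only at the file name, so it cannot tell whether the contents are valid.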

Creating Chrome JSON Traces

From PyTorch

Use PyTorch’s built-in profiler to generate traces:

import torch
from torch.profiler import profile, ProfilerActivity

with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    record_shapes=True,
    with_stack=True,
) as prof:
    # Your model code here
    model(input)

# Export to Chrome JSON format
prof.export_chrome_trace("trace.json")

From TensorFlow

TensorFlow's profiler records trace data into a log directory:

import tensorflow as tf

# Enable tracing
tf.profiler.experimental.start('logdir')

# Your model code here
model(input)

# Stop and save
tf.profiler.experimental.stop()

The profiler writes its results under `logdir` in TensorBoard's format. Convert them to Chrome JSON with TensorBoard before loading them in the viewer.

From Chrome DevTools

For web-based GPU work (WebGL, WebGPU):

1. Open Chrome DevTools (F12)
2. Go to the Performance tab
3. Click Record (⚫)
4. Perform your operations
5. Click Stop
6. Click Save profile to export as JSON

From NVIDIA Tools

Using Nsight Systems

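Nsight Systems can export a report to JSON. The export may follow Nsight Systems' own event schema rather than the Chrome trace layout, so if the viewer shows no data, see Troubleshooting below: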

# Profile and generate a .nsys-rep file
nsys profile -o output ./my_app

# Export to JSON (requires nsys-exporter)
nsys export --type=json --output=trace.json output.nsys-rep

Using PyTorch + CUDA

PyTorch's profiler records CUDA kernels alongside CPU operators when `ProfilerActivity.CUDA` is enabled:

from torch.profiler import profile, ProfilerActivity

with profile(
    activities=[
        ProfilerActivity.CPU,
        ProfilerActivity.CUDA,
    ],
    with_stack=True,
    profile_memory=True,
) as prof:
    model(input)

prof.export_chrome_trace("cuda_trace.json")

From Custom Instrumentation

You can create Chrome JSON traces manually. The format is simple:

{
  "traceEvents": [
    {
      "name": "MyFunction",
      "cat": "kernel",
      "ph": "X",
      "ts": 1000,
      "dur": 500,
      "pid": 1,
      "tid": 1
    }
  ]
}

| Field | Description |
| --- | --- |
| `name` | Event name |
| `cat` | Category (for coloring/filtering) |
| `ph` | Phase: `X` for complete events, `B`/`E` for begin/end pairs |
| `ts` | Timestamp in microseconds |
| `dur` | Duration in microseconds (for `X` events) |
| `pid` | Process ID |
| `tid` | Thread ID |

Compressing Traces

Large traces can be compressed to save space:

# Compress with gzip (replaces trace.json with trace.json.gz)
gzip trace.json

# To keep the original file as well, use: gzip -k trace.json
# Perfetto can read trace.json.gz directly

Traces can get very large (100MB+). Always compress traces before storing or sharing them. Wafer’s Perfetto viewer handles compressed files natively.
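If the trace is produced by a Python script, the same compression can happen in-process with the standard library. This is a small sketch under that assumption; `compress_trace` is a hypothetical helper, and the tiny example file exists only so the snippet runs on its own.

```python
import gzip
import json
import shutil


def compress_trace(src_path):
    """Write a gzip-compressed copy next to src_path and return its path."""
    dst_path = src_path + ".gz"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams in chunks, so large files are fine
    return dst_path


# Tiny placeholder trace so the sketch is self-contained.
with open("example_trace.json", "w") as f:
    json.dump({"traceEvents": []}, f)

compressed_path = compress_trace("example_trace.json")
print(compressed_path)
```

Unlike plain `gzip trace.json`, this keeps the uncompressed file in place.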

Best Practices

Keep traces focused

Profile only the code you care about. A trace of your entire application startup will be harder to analyze than a trace of just the hot loop.

Include metadata

Add git commit hashes, configuration, and environment info to trace filenames or metadata. This makes it easier to reproduce and compare results.
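One way to follow this advice is to build the filename from a label, the current commit, and a timestamp. The sketch below is one possible naming scheme, not a Wafer convention; `current_commit` and `trace_filename` are hypothetical helpers, and the commit lookup falls back to `"unknown"` when git is unavailable.

```python
import subprocess
from datetime import datetime, timezone


def current_commit():
    """Return the short git commit hash, or "unknown" outside a repository."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def trace_filename(label, commit=None, when=None):
    """Build a trace filename of the form <label>_<commit>_<UTC timestamp>.json."""
    commit = commit or current_commit()
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    return f"{label}_{commit}_{stamp}.json"


print(trace_filename("resnet50-bs32"))
```

The result can be passed wherever a trace path is expected, for example as the argument to `prof.export_chrome_trace(...)`.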

Compress before uploading

Large traces take longer to upload and process. Compress with gzip first—Perfetto handles decompression automatically.

Profile realistic workloads

Use representative input sizes and data patterns. Profiling toy inputs may not reveal real-world bottlenecks.

Troubleshooting

Trace file is too large

Try reducing the profiling duration or scope so that fewer events are recorded. Compressing with gzip also shrinks the file on disk, and Perfetto reads `.json.gz` files directly.
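When re-profiling is not practical, an existing Chrome JSON trace can also be narrowed after the fact. The sketch below keeps only events whose timestamp falls inside a chosen window; `trim_trace` is a hypothetical helper, and it assumes mostly complete (`X`) events, since a window boundary can separate a `B` event from its matching `E`.

```python
import json


def trim_trace(events, start_us, end_us):
    """Keep events with start_us <= ts < end_us, plus metadata events."""
    kept = []
    for event in events:
        if event.get("ph") == "M" or "ts" not in event:
            kept.append(event)  # metadata carries process and thread names
        elif start_us <= event["ts"] < end_us:
            kept.append(event)
    return kept


events = [
    {"name": "process_name", "ph": "M", "pid": 1, "args": {"name": "trainer"}},
    {"name": "warmup", "ph": "X", "ts": 100, "dur": 50, "pid": 1, "tid": 1},
    {"name": "hot_loop", "ph": "X", "ts": 1000, "dur": 500, "pid": 1, "tid": 1},
    {"name": "teardown", "ph": "X", "ts": 5000, "dur": 20, "pid": 1, "tid": 1},
]

trimmed = trim_trace(events, start_us=900, end_us=2000)
with open("trimmed_trace.json", "w") as f:
    json.dump({"traceEvents": trimmed}, f)
```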

No CUDA events in trace

Make sure `ProfilerActivity.CUDA` is included in your profiler configuration. Also call `torch.cuda.synchronize()` before the profiler context ends, so that asynchronously launched kernels finish while profiling is still active.
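To confirm whether GPU events made it into a trace, it can help to count events per category. This is a rough sketch: `count_categories` is a hypothetical helper, and the category names in the sample (`cpu_op`, `kernel`) are assumptions that may differ between profiler versions, so inspect the printed counts rather than relying on a fixed name.

```python
from collections import Counter


def count_categories(trace):
    """Count events per "cat" value in a loaded Chrome JSON trace."""
    events = trace["traceEvents"] if isinstance(trace, dict) else trace
    return Counter(event.get("cat", "<none>") for event in events)


# Inline sample standing in for json.load() of an exported trace.
sample = {
    "traceEvents": [
        {"name": "aten::linear", "cat": "cpu_op", "ph": "X", "ts": 10, "dur": 40, "pid": 1, "tid": 1},
        {"name": "aten::relu", "cat": "cpu_op", "ph": "X", "ts": 60, "dur": 5, "pid": 1, "tid": 1},
        {"name": "gemm_kernel", "cat": "kernel", "ph": "X", "ts": 20, "dur": 25, "pid": 0, "tid": 7},
    ]
}

counts = count_categories(sample)
for category, count in counts.most_common():
    print(f"{category}: {count}")
```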

Trace loads but shows no data

Check that the file contains a non-empty `traceEvents` array and that its events carry the fields described under From Custom Instrumentation. Some tools export wrapper formats that need conversion.
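A short structural check can narrow this down before you open the viewer. The sketch below accepts the two top-level layouts of the Chrome trace event format (an object with `traceEvents`, or a bare array of events) and reports obvious gaps. `extract_events` and `find_problems` are hypothetical helpers, and the checks are deliberately shallow rather than a full validator.

```python
def extract_events(data):
    """Return the event list from either accepted top-level layout."""
    if isinstance(data, dict):
        return data.get("traceEvents", [])
    if isinstance(data, list):
        return data
    return []


def find_problems(data):
    """Return a list of human-readable problems found in parsed trace data."""
    events = extract_events(data)
    if not events:
        return ["no events: expected a non-empty traceEvents array"]
    problems = []
    for index, event in enumerate(events):
        if not isinstance(event, dict) or "ph" not in event:
            problems.append(f"event {index}: missing 'ph'")
        elif event["ph"] == "X" and not {"ts", "dur"} <= event.keys():
            problems.append(f"event {index}: 'X' event needs 'ts' and 'dur'")
    return problems


good = {"traceEvents": [
    {"name": "MyFunction", "cat": "kernel", "ph": "X", "ts": 1000, "dur": 500, "pid": 1, "tid": 1},
]}
wrapped = {"payload": good}  # hypothetical wrapper format from another tool

print(find_problems(good))
print(find_problems(wrapped))
```

For a file on disk, parse it with `json.load` first (using `gzip.open` for `.gz` files) and pass the result to `find_problems`.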

Next Steps

Analyzing Traces

Learn how to analyze your traces in Perfetto.
