Daily Kernel
The Daily Kernel is a daily CUDA programming challenge that helps you practice GPU programming skills. Each day features a new problem, and difficulty varies from day to day.
What is it?
Similar to daily coding challenges you might find on other platforms, the Daily Kernel presents GPU-specific problems:
- Kernel optimization: Make a kernel faster
- Algorithm implementation: Implement GPU-friendly algorithms
- Memory patterns: Work with different memory types
- Parallel primitives: Reductions, scans, and more
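To give a feel for the "parallel primitives" category, here is a minimal sketch of a sum reduction in CUDA. It is not taken from an actual Daily Kernel problem: the names `block_sum`, `sum_on_gpu`, and `kBlockSize` are made up for illustration, the block size of 256 is an assumption, and error checking is left out to keep the sketch short.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int kBlockSize = 256;  // assumed block size; must be a power of two

// Each block reduces up to kBlockSize elements in shared memory, then adds
// its partial sum to *out with a single atomicAdd.
__global__ void block_sum(const float* in, float* out, int n) {
    __shared__ float tile[kBlockSize];
    int tid = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tid;

    // One element per thread; out-of-range threads contribute 0.
    tile[tid] = (idx < n) ? in[idx] : 0.0f;
    __syncthreads();

    // Tree reduction: halve the number of active threads at each step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) tile[tid] += tile[tid + stride];
        __syncthreads();
    }

    if (tid == 0) atomicAdd(out, tile[0]);
}

// Hypothetical host wrapper: copies the input to the device, launches the
// kernel, and returns the total.
float sum_on_gpu(const std::vector<float>& host_in) {
    int n = static_cast<int>(host_in.size());
    if (n == 0) return 0.0f;

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, host_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, sizeof(float));

    int blocks = (n + kBlockSize - 1) / kBlockSize;
    block_sum<<<blocks, kBlockSize>>>(d_in, d_out, n);

    float result = 0.0f;
    // This copy runs on the default stream, so it waits for the kernel.
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

The sketch touches two of the categories above: it stages data in shared memory (a memory pattern) and combines it with a tree reduction (a parallel primitive). A tuned version would likely use warp shuffles and process several elements per thread.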
Accessing Daily Kernel
- Click the puzzle icon (⚡) in the Wafer top bar, or
- Select Daily Kernel from the tool dropdown
Challenge Structure
Each challenge includes the following parts.
Problem Statement
A description of what you need to implement or optimize, including:
- Input/output specifications
- Performance requirements
- Constraints
Examples
Concrete examples showing:
- Sample inputs
- Expected outputs
- Explanations of the expected behavior
Framework Selection
Choose your preferred implementation framework:
- CuTe DSL: Modern C++ DSL for tensor operations
- CUDA: Standard CUDA C++
Different frameworks may have different starter code and hints tailored to that approach.
Starter Code
Template code to get you started:
- Function signatures
- Memory setup
- Basic structure
Kernel Signature
For kernels, you’ll see:
- Input tensors (names, types, shapes)
- Output tensors
- Any scalar parameters
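As a rough illustration of how those three pieces can map onto CUDA C++, the sketch below uses a made-up problem, `out = alpha * x + y`. The kernel name `scale_add`, the wrapper `run_scale_add`, and the parameter names are hypothetical; actual Daily Kernel signatures may look different.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical signature: two input tensors, one output tensor, two scalars.
__global__ void scale_add(const float* __restrict__ x,  // input tensor, shape [n]
                          const float* __restrict__ y,  // input tensor, shape [n]
                          float* __restrict__ out,      // output tensor, shape [n]
                          float alpha,                  // scalar parameter
                          int n) {                      // scalar parameter: length
    // Grid-stride loop, so any grid size covers all n elements.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        out[i] = alpha * x[i] + y[i];
    }
}

// Hypothetical host wrapper; assumes x and y have the same length.
// Error checking is omitted for brevity.
std::vector<float> run_scale_add(const std::vector<float>& x,
                                 const std::vector<float>& y, float alpha) {
    int n = static_cast<int>(x.size());
    std::vector<float> out(n);
    if (n == 0) return out;

    size_t bytes = n * sizeof(float);
    float *d_x = nullptr, *d_y = nullptr, *d_out = nullptr;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_y, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale_add<<<blocks, threads>>>(d_x, d_y, d_out, alpha, n);

    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
    cudaFree(d_out);
    return out;
}
```

Marking inputs `const` and keeping scalars as by-value arguments makes the read-only and write-only roles of each parameter visible in the signature itself.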
Constraints
Problem constraints to keep in mind:
- Input sizes
- Performance targets
- Memory limits
Hints
Collapsible hints if you get stuck:
- Algorithmic approaches
- Framework-specific tips
- Common pitfalls to avoid
Starting a Challenge
1. Read the Problem
   Understand what you need to implement. Pay attention to input/output specs and constraints.
2. Choose a Framework
   Select CuTe DSL or CUDA based on your preference and the problem type.
3. Review Starter Code
   Look at the provided template to understand the expected structure.
4. Click Start Challenge
   This creates a new file in your workspace with the starter code.
5. Implement Your Solution
   Write your kernel implementation in the created file.
Difficulty Levels
| Level | Description |
|---|---|
| Easy | Straightforward implementations, good for learning basics |
| Medium | Requires optimization or non-trivial algorithms |
| Hard | Complex problems requiring advanced techniques |
Tips for Success
Start Simple
Get a correct solution first, then optimize. Don’t try to write the fastest solution immediately.
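One way to follow this advice is to write a plain CPU reference alongside a deliberately naive kernel and compare the two before optimizing anything. The sketch below does that for a matrix transpose, chosen only as an example. All names (`transpose_cpu`, `transpose_naive`, `transpose_gpu`, `naive_matches_reference`) are hypothetical, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// CPU reference for a row-major rows x cols matrix: out[c][r] = in[r][c].
std::vector<float> transpose_cpu(const std::vector<float>& in, int rows, int cols) {
    std::vector<float> out(in.size());
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            out[c * rows + r] = in[r * cols + c];
    return out;
}

// Naive kernel: one thread per element. Reads are contiguous across a warp,
// writes are strided; that is left unoptimized on purpose.
__global__ void transpose_naive(const float* in, float* out, int rows, int cols) {
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    int r = blockIdx.y * blockDim.y + threadIdx.y;
    if (r < rows && c < cols) out[c * rows + r] = in[r * cols + c];
}

// Host wrapper; assumes rows > 0 and cols > 0.
std::vector<float> transpose_gpu(const std::vector<float>& in, int rows, int cols) {
    size_t bytes = in.size() * sizeof(float);
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((cols + block.x - 1) / block.x, (rows + block.y - 1) / block.y);
    transpose_naive<<<grid, block>>>(d_in, d_out, rows, cols);

    std::vector<float> out(in.size());
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}

// Fills a matrix with distinct values and checks the GPU result against the
// CPU reference. Exact comparison is fine here because a transpose only
// copies values; numeric kernels would need a tolerance instead.
bool naive_matches_reference(int rows, int cols) {
    std::vector<float> in(static_cast<size_t>(rows) * cols);
    for (size_t i = 0; i < in.size(); ++i) in[i] = static_cast<float>(i);
    return transpose_gpu(in, rows, cols) == transpose_cpu(in, rows, cols);
}
```

Testing with sizes that are not multiples of the block dimensions, such as 37 x 65, exercises the bounds check. Once the naive kernel passes, the same comparison can guard each optimization step, for example a shared-memory tiled version.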
Use the Compiler Explorer
Check the PTX/SASS output of your kernel to understand what’s happening at the instruction level.
Profile with NCU
Use the NCU Profiler to identify bottlenecks in your solution.
Ask GPU Docs
If you’re stuck on a concept, ask the GPU Docs assistant for help.