Summary
Define explicit benchmark suites with kernel-specific shape grids, including a long suite for large-N performance.
Motivation / Use Case
We want deterministic, comparable benchmark runs across kernels and GPUs.
Proposed Solution
- Add `short` and `long` suites in `bench/suites.yaml`.
- Each kernel declares its own shape list:
  - `reduce_sum` and `softmax_online` use the dim=-1 shapes from Issues 1 and 2.
  - `copy_transpose` uses the shapes from Issue 3.
- Document usage in README/DEVELOPMENT.
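
As a rough sketch of how the suite layout could be consumed, the snippet below mirrors a possible `bench/suites.yaml` structure as a plain Python dict and expands one suite into an ordered list of benchmark cases. The schema (`dim`, `shapes` keys), the `iter_cases` helper, and every shape value are assumptions for illustration only; the real grids are the ones defined in Issues 1-3.

```python
# Hypothetical in-memory mirror of bench/suites.yaml.
# The schema and all shape values are placeholders, NOT the grids from Issues 1-3.
SUITES = {
    "short": {
        "reduce_sum": {"dim": -1, "shapes": [[128, 256]]},
        "softmax_online": {"dim": -1, "shapes": [[128, 256]]},
        "copy_transpose": {"shapes": [[256, 256]]},
    },
    "long": {
        # The long suite adds large-N shapes on top of the short ones.
        "reduce_sum": {"dim": -1, "shapes": [[128, 256], [64, 65536]]},
        "softmax_online": {"dim": -1, "shapes": [[128, 256], [64, 65536]]},
        "copy_transpose": {"shapes": [[256, 256], [4096, 4096]]},
    },
}


def iter_cases(suites, suite_name):
    """Yield (kernel, shape, dim) tuples for one suite in a fixed order.

    Kernels are visited in sorted order so that two runs of the same suite
    enumerate cases identically, which keeps results comparable.
    """
    for kernel in sorted(suites[suite_name]):
        spec = suites[suite_name][kernel]
        for shape in spec["shapes"]:
            # Kernels without a reduction axis (e.g. copy_transpose) get dim=None.
            yield kernel, tuple(shape), spec.get("dim")


if __name__ == "__main__":
    for kernel, shape, dim in iter_cases(SUITES, "short"):
        print(kernel, shape, dim)
```

Keeping the shape lists per kernel, rather than one global grid, lets each kernel be benchmarked only on shapes that are meaningful for it, while the fixed iteration order supports the goal of deterministic runs.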
Scope Alignment
v0.1 scope (Weeks 0-2)
Alternatives Considered
Keep only a single smoke suite.
Additional Context
Long suite is optional and benchmark-only.