Summary
Close v0.1 by making main opinionated and harness-first, with Week 0–2 ops present (implemented or explicit ref-backed stubs) and docs aligned with actual support.
Scope
- Week 0: copy/transpose (minimal working kernel)
- Week 1: reduce_sum (ref-backed stub is acceptable)
- Week 2: softmax_online (ref-backed stub is acceptable)
- Harness and docs are the primary deliverable
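A "ref-backed stub" in the sense used above could look like the following minimal sketch. It is written in pure Python for illustration only; the names `reduce_sum_ref`, `reduce_sum`, and `IS_REF_BACKED` are hypothetical and not taken from the repository, and the real ops presumably operate on GPU tensors rather than lists.

```python
import math

# Hypothetical flag a status table or test could inspect to tell
# ref-backed stubs apart from ops that have a real kernel.
IS_REF_BACKED = True


def reduce_sum_ref(rows):
    """Reference implementation: sum each row with math.fsum."""
    return [math.fsum(row) for row in rows]


def reduce_sum(rows):
    """Public entry point.

    Ref-backed stub: delegates to the reference until a kernel exists,
    so harness tests and benchmarks can already call the final API.
    """
    return reduce_sum_ref(rows)
```

The point of the pattern is that callers and tests target `reduce_sum` from day one; swapping in a kernel later changes only the body of that function.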
Out of Scope for v0.1
- FA / KV-cache / FP8 / NCCL / C++ extension builds
- Aggressive perf tuning beyond correctness + benchmark scaffolding
Checklist
- Define “opinionated main” policy in README or DEVELOPMENT
- Harness commands succeed on supported GPU:
    uv run python -m forge_cute_py.env_check
    uv run pytest -q
    uv run python bench/run.py --suite smoke
- Copy/transpose: minimal working kernel + correctness tests
- reduce_sum: explicit ref-backed stub (documented)
- softmax_online: explicit ref-backed stub (documented)
- Update README kernel status + ROADMAP to reflect v0.1 closure
- Update CHANGELOG for v0.1 release notes
- Resolve code organization guideline
- Fix testing command docs
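For the softmax_online item, a correctness test typically compares the single-pass (online) recurrence against a plain two-pass softmax. The sketch below shows that comparison in pure Python, assuming finite inputs; the function names are illustrative and do not reflect the project's actual reference or test code.

```python
import math


def softmax_two_pass(xs):
    """Plain reference: find the max first, then exponentiate and normalize."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_online(xs):
    """Online softmax: running max and running denominator in one pass."""
    m = float("-inf")  # running maximum
    d = 0.0            # running sum of exp(x - m)
    for x in xs:
        m_new = max(m, x)
        # Rescale the old denominator to the new maximum, then add the new term.
        d = d * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    return [math.exp(x - m) / d for x in xs]


def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two sequences."""
    return max(abs(p - q) for p, q in zip(a, b))
```

A harness test would then assert that `max_abs_diff(softmax_online(xs), softmax_two_pass(xs))` stays below a chosen tolerance, including for inputs with large values where a naive `exp` would overflow.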
Sub-Issues
- Testing & Benchmarking
- Documentation & Policy
- Harness & Week 1 Work
Acceptance Criteria
- All checklist items complete
- Docs clearly state what is implemented vs ref-backed
- Harness is the gate for correctness
- No WIP-only code on main