This document logs the progress and knowledge gained while writing CUDA kernels and studying the PMPP (Programming Massively Parallel Processors) book.
Structure taken from a-hamdi's repository.
Summary:
Learned:
- Fill this in
- Fill this in
- Read Chapter 1 and Chapter 2 of the PMPP book.
Future challenges:
- Day 5 - mandatory Flash Attention 2: Forward
- Day 10 - mandatory Flash Attention 2: Backward