Personal notes on the math, algorithms, and systems behind modern ML — worked derivations, algorithm walkthroughs, and engineering plumbing for training and running large models.
Site: yueyiming2009.github.io/ai-learning
| Directory | Contents |
|---|---|
| docs/ | Rendered HTML pages (published via GitHub Pages) |
| triton/ | Triton GPU kernel notebooks |
| rl/ | Reinforcement-learning systems notes |
The site index is organized into three top-level domains.
01 · Math — docs/
Worked derivations from scratch: gradients, probabilistic principles, and the algebra behind common losses, layers, and RL objectives.
| Page | Topic |
|---|---|
| Math notation | Symbol-by-symbol reference for ML papers |
| NLL loss | The maximum-likelihood principle behind CE, MSE, BCE, DPO |
| Cross entropy | Softmax + CE forward/backward; the p − y gradient |
| LayerNorm gradient | Full backward through mean, variance, normalized input |
| RMSNorm gradient | Two-term dx derivation; side-by-side with LayerNorm |
| BatchNorm gradient | Backward across the batch dim; train vs. inference |
| SwiGLU | Gated FFN forward/backward; weight grads for all three projections |
| RoPE gradient | Per-pair rotation forward + symmetric backward |
| RLHF | Bradley–Terry RM, KL-regularized RL, policy gradient, GAE, PPO |
| GRPO | DeepSeek-R1's critic-free PPO with group-relative advantages |
| DPO | KL-constrained RLHF → closed-form policy → logistic loss |
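
As a quick illustration of the kind of result these pages derive, the sketch below checks the softmax + cross-entropy gradient (p − y, with y the one-hot target) against a central finite difference. It is a minimal standalone example, not code from the pages themselves; the function names and the toy logits are assumptions made for illustration.

```python
import math

def softmax(z):
    # Subtract the max before exponentiating for numerical stability.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(z, target):
    # Negative log-likelihood of the target class under softmax(z).
    return -math.log(softmax(z)[target])

def analytic_grad(z, target):
    # dL/dz_i = p_i - y_i, where y is the one-hot target vector.
    p = softmax(z)
    return [p_i - (1.0 if i == target else 0.0) for i, p_i in enumerate(p)]

def numeric_grad(z, target, eps=1e-6):
    # Central finite difference, one logit at a time.
    grads = []
    for i in range(len(z)):
        hi = list(z)
        lo = list(z)
        hi[i] += eps
        lo[i] -= eps
        diff = cross_entropy(hi, target) - cross_entropy(lo, target)
        grads.append(diff / (2 * eps))
    return grads

logits = [2.0, -1.0, 0.5]   # hypothetical toy logits
target = 0                  # hypothetical target class
ga = analytic_grad(logits, target)
gn = numeric_grad(logits, target)
max_err = max(abs(a - b) for a, b in zip(ga, gn))
print("analytic:", ga)
print("numeric: ", gn)
print("max abs error:", max_err)
```

The analytic gradient sums to zero across classes because the probabilities and the one-hot target each sum to one, which is a handy sanity check when implementing the backward pass by hand.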
02 · Algorithms — docs/
Procedures and data structures — how core ML algorithms are actually implemented when correctness and speed both matter.
| Page | Topic |
|---|---|
| Production BPE | Incremental pair counts, reverse index, linked-list + heap encoder |
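
For context, the sketch below shows a deliberately naive BPE trainer that recounts every adjacent pair after each merge. It is not the production approach the page describes; it is the baseline whose repeated full recount the incremental pair counts, reverse index, and heap are meant to avoid. The helper names, the tie-breaking rule, and the toy corpus are assumptions chosen for illustration.

```python
from collections import Counter

def count_pairs(words):
    """Count adjacent symbol pairs; `words` maps a symbol tuple to its frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(words, pair):
    """Rewrite every word, replacing each occurrence of `pair` with one merged symbol."""
    a, b = pair
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == a and symbols[i + 1] == b:
                out.append(a + b)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged

def train_bpe(words, num_merges):
    """Naive training loop: full recount per merge (the cost a production version avoids)."""
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(words)
        if not pairs:
            break
        # Assumed tie-break: highest count first, then lexicographically smallest pair.
        best = min(pairs, key=lambda p: (-pairs[p], p))
        merges.append(best)
        words = merge_pair(words, best)
    return merges, words

# Hypothetical toy corpus: word (as a tuple of characters) -> frequency.
toy = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2, ("n", "e", "w"): 6}
merges, final_words = train_bpe(toy, 2)
print(merges)
print(final_words)
```

Each merge here touches the whole vocabulary, so the cost grows with corpus size times the number of merges. Updating only the counts of pairs adjacent to the merged positions is the kind of change that makes the incremental version worthwhile.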
03 · Systems
Engineering notes: GPU kernels, distributed training topology, and the plumbing that turns the math into a working training run.
| Page | Topic |
|---|---|
| Triton — Introduction | Program IDs, block pointers, masking, the launch grid |
| Fused Softmax kernel | Row-wise online softmax in one fused Triton kernel |
| PPO training walkthrough (verl) | One PPO step on LLaMA-7B across 8 GPUs: every shape, every NCCL call, TP/DP/PP |
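
The online softmax named in the table can be sketched in plain Python before reading the Triton version. The example below keeps a running maximum and a running normalizer in a single pass over the row, rescaling the normalizer whenever the maximum changes. It is a scalar reference under the assumption of finite inputs, not the fused kernel from the notebook, and the function names are hypothetical.

```python
import math

def online_softmax(row):
    """Softmax using one pass for the max and normalizer, then one pass to normalize."""
    m = float("-inf")   # running maximum
    d = 0.0             # running sum of exp(x - m)
    for x in row:
        m_new = max(m, x)
        # Rescale the old sum to the new maximum, then add the new term.
        d = d * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    return [math.exp(x - m) / d for x in row]

def two_pass_softmax(row):
    """Reference: separate passes for the max and for the sum."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

row = [1000.0, 1001.0, 999.0]   # large values that would overflow a naive exp(x)
online = online_softmax(row)
reference = two_pass_softmax(row)
max_diff = max(abs(a - b) for a, b in zip(online, reference))
print(online)
print("max difference vs. two-pass:", max_diff)
```

Merging the max and sum passes saves one read of the row, which matters on a GPU where the row has to be fetched from memory each time it is scanned.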
The HTML pages under docs/ are published via GitHub Pages
(Settings → Pages → main / /docs). Notebooks and markdown files render
directly on GitHub.