ai-learning

Personal notes on the math, algorithms, and systems behind modern ML — worked derivations, algorithm walkthroughs, and engineering plumbing for training and running large models.

Site: yueyiming2009.github.io/ai-learning

01 · Math — docs/

Page	Topic
Math notation	Symbol-by-symbol reference for ML papers
NLL loss	The maximum-likelihood principle behind CE, MSE, BCE, DPO
Cross entropy	Softmax + CE forward/backward; the `p − y` gradient
LayerNorm gradient	Full backward through mean, variance, normalized input
RMSNorm gradient	Two-term `dx` derivation; side-by-side with LayerNorm
BatchNorm gradient	Backward across the batch dim; train vs. inference
SwiGLU	Gated FFN forward/backward; weight grads for all three projections
RoPE gradient	Per-pair rotation forward + symmetric backward
RLHF	Bradley–Terry RM, KL-regularized RL, policy gradient, GAE, PPO
GRPO	Critic-free PPO variant with group-relative advantages, as used to train DeepSeek-R1
DPO	KL-constrained RLHF → closed-form policy → logistic loss
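
As a quick illustration of the `p − y` gradient named in the Cross entropy row, the following is a minimal sketch in plain Python, not code taken from the pages themselves. It compares the analytic gradient of softmax cross entropy with a central finite-difference estimate. The function names, the example logits, and the step size `eps` are all assumptions chosen for illustration.

```python
import math


def softmax(z):
    """Softmax over a list of logits, shifted by the max for stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(z, target):
    """Negative log-likelihood of class `target` under softmax(z)."""
    return -math.log(softmax(z)[target])


def ce_grad(z, target):
    """Analytic gradient dL/dz = p - y, where y is the one-hot target."""
    p = softmax(z)
    return [p_i - (1.0 if i == target else 0.0) for i, p_i in enumerate(p)]


def numeric_grad(z, target, eps=1e-6):
    """Central finite differences, used only to check the analytic result."""
    grad = []
    for i in range(len(z)):
        hi = list(z)
        lo = list(z)
        hi[i] += eps
        lo[i] -= eps
        diff = cross_entropy(hi, target) - cross_entropy(lo, target)
        grad.append(diff / (2 * eps))
    return grad


# Hypothetical example values.
logits = [2.0, -1.0, 0.5]
analytic = ce_grad(logits, target=0)
numeric = numeric_grad(logits, target=0)
max_err = max(abs(a - n) for a, n in zip(analytic, numeric))
print(analytic, max_err)
```

The entries of `p − y` sum to zero because both `p` and `y` sum to one, which makes a cheap sanity check alongside the finite-difference comparison.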

02 · Algorithms — docs/

Procedures and data structures — how core ML algorithms are actually implemented when correctness and speed both matter.

Page	Topic
Production BPE	Incremental pair counts, reverse index, linked-list + heap encoder
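
For orientation, here is a deliberately naive sketch of a single BPE merge step in plain Python. It recounts every pair from scratch, which is the cost that the incremental pair counts and reverse index named above are meant to avoid; it is not the production encoder from the page. The toy corpus, the function names, and the tie-breaking rule (highest count, then the lexicographically largest pair) are assumptions made for illustration.

```python
from collections import Counter


def count_pairs(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    counts = Counter()
    for symbols, freq in words.items():
        for pair in zip(symbols, symbols[1:]):
            counts[pair] += freq
    return counts


def merge_pair(words, pair):
    """Return a new vocabulary with every occurrence of `pair` merged."""
    a, b = pair
    merged = {}
    for symbols, freq in words.items():
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == a and symbols[i + 1] == b:
                out.append(a + b)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


# Hypothetical toy corpus: word (as a tuple of symbols) -> frequency.
words = {
    ("l", "o", "w"): 5,
    ("l", "o", "w", "e", "r"): 2,
    ("n", "e", "w"): 3,
}
counts = count_pairs(words)
# Assumed tie-break: highest count first, then the larger pair.
best = max(counts, key=lambda p: (counts[p], p))
words_after = merge_pair(words, best)
print(best, words_after)
```

Each merge here costs a full pass over the corpus. An incremental scheme would instead update only the counts of pairs adjacent to the merged positions.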

03 · Systems — triton/, rl/

Engineering notes — GPU kernels, distributed training topology, and the plumbing that turns the math into a working training run.

Page	Topic
Triton — Introduction	Program IDs, block pointers, masking, the launch grid
Fused Softmax kernel	Row-wise online softmax in one fused Triton kernel
PPO training walkthrough (verl)	One PPO step for LLaMA-7B on 8 GPUs: every shape, every NCCL call, TP/DP/PP
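
The online softmax recurrence mentioned in the Fused Softmax kernel row can be sketched in plain Python, shown here as a scalar reference and not as the Triton kernel itself. It keeps a running max and a running sum in a single pass over the row, rescaling the sum whenever the max grows. The function names and the example row are assumptions for illustration, and inputs are assumed to be finite.

```python
import math


def online_softmax(row):
    """Softmax with the max and normalizer accumulated in one pass."""
    running_max = float("-inf")
    running_sum = 0.0
    for x in row:
        new_max = max(running_max, x)
        # Rescale the sum accumulated so far to the new max, then add x.
        running_sum = (
            running_sum * math.exp(running_max - new_max)
            + math.exp(x - new_max)
        )
        running_max = new_max
    # A second pass over the row produces the normalized outputs.
    return [math.exp(x - running_max) / running_sum for x in row]


def two_pass_softmax(row):
    """Reference version: find the max first, then sum."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example row.
row = [1.0, 3.0, -2.0, 0.5]
probs = online_softmax(row)
reference = two_pass_softmax(row)
max_diff = max(abs(p - r) for p, r in zip(probs, reference))
print(probs, max_diff)
```

The rescaling factor `exp(running_max - new_max)` is what lets the max and the sum be computed together instead of in separate reductions over the row.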

Publishing

The HTML pages under docs/ are published via GitHub Pages (Settings → Pages → branch main, folder /docs). Notebooks and Markdown files render directly on GitHub.
