yueyiming2009/ai-learning

ai-learning

Personal notes on the math, algorithms, and systems behind modern ML — worked derivations, algorithm walkthroughs, and engineering plumbing for training and running large models.

Site: yueyiming2009.github.io/ai-learning

Contents

docs/        Rendered HTML pages (published via GitHub Pages)
triton/      Triton GPU kernel notebooks
rl/          Reinforcement-learning systems notes

The site index is organized into three top-level domains.

01 · Math — docs/

Worked derivations from scratch: gradients, probabilistic principles, and the algebra behind common losses, layers, and RL objectives.

Page                Topic
Math notation       Symbol-by-symbol reference for ML papers
NLL loss            The maximum-likelihood principle behind CE, MSE, BCE, DPO
Cross entropy       Softmax + CE forward/backward; the p − y gradient
LayerNorm gradient  Full backward through mean, variance, normalized input
RMSNorm gradient    Two-term dx derivation; side-by-side with LayerNorm
BatchNorm gradient  Backward across the batch dim; train vs. inference
SwiGLU              Gated FFN forward/backward; weight grads for all three projections
RoPE gradient       Per-pair rotation forward + symmetric backward
RLHF                Bradley–Terry RM, KL-regularized RL, policy gradient, GAE, PPO
GRPO                DeepSeek-R1's critic-free PPO with group-relative advantages
DPO                 KL-constrained RLHF → closed-form policy → logistic loss
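
As a quick illustration of the "p − y gradient" named in the Cross entropy row, here is a minimal sketch in plain Python. It is not taken from the notes themselves: the function names (`softmax`, `cross_entropy`, `ce_grad`, `numeric_grad`) are hypothetical, and the example assumes a single logit vector with one integer target class. It compares the analytic gradient p − y against a central finite difference.

```python
import math

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(z, target):
    # Loss for one example: -log p[target].
    return -math.log(softmax(z)[target])

def ce_grad(z, target):
    # Analytic gradient w.r.t. the logits: p - y, with y one-hot at `target`.
    p = softmax(z)
    return [p_i - (1.0 if i == target else 0.0) for i, p_i in enumerate(p)]

def numeric_grad(z, target, eps=1e-6):
    # Central finite difference, used only to sanity-check ce_grad.
    grad = []
    for i in range(len(z)):
        hi = list(z)
        lo = list(z)
        hi[i] += eps
        lo[i] -= eps
        grad.append((cross_entropy(hi, target) - cross_entropy(lo, target)) / (2 * eps))
    return grad

logits = [2.0, -1.0, 0.5]
analytic = ce_grad(logits, target=0)
numeric = numeric_grad(logits, target=0)
```

The two gradients should agree to within finite-difference error, and the analytic gradient sums to zero because both p and y sum to one.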

02 · Algorithms — docs/

Procedures and data structures — how core ML algorithms are actually implemented when correctness and speed both matter.

Page            Topic
Production BPE  Incremental pair counts, reverse index, linked-list + heap encoder
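
For context on what the Production BPE page optimizes, here is a sketch of the naive BPE training loop, which recounts every adjacent pair after each merge. This is a baseline written for illustration, not the incremental, heap-based implementation the page describes. The names `count_pairs`, `merge_pair`, and `train_bpe` are hypothetical, and the tie-breaking rule (largest pair in tuple order) is an assumption.

```python
from collections import Counter

def count_pairs(words):
    # words maps a tuple of symbols to its corpus frequency.
    counts = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            counts[(a, b)] += freq
    return counts

def merge_pair(symbols, pair):
    # Replace each left-to-right occurrence of `pair` with the joined symbol.
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)

def train_bpe(words, num_merges):
    words = dict(words)
    merges = []
    for _ in range(num_merges):
        counts = count_pairs(words)  # full recount: the cost incremental counts avoid
        if not counts:
            break
        # Assumed tie-break: highest count, then largest pair in tuple order.
        best = max(counts, key=lambda p: (counts[p], p))
        merges.append(best)
        new_words = {}
        for symbols, freq in words.items():
            key = merge_pair(symbols, best)
            new_words[key] = new_words.get(key, 0) + freq
        words = new_words
    return merges, words

corpus = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2, ("n", "e", "w"): 3}
merges, vocab_words = train_bpe(corpus, num_merges=2)
```

Each merge here costs a pass over the whole corpus. Incremental pair counts with a reverse index touch only the words that contain the merged pair.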

03 · Systems — triton/, rl/

Engineering notes — GPU kernels, distributed training topology, and the plumbing that turns the math into a working training run.

Page                             Topic
Triton — Introduction            Program IDs, block pointers, masking, the launch grid
Fused Softmax kernel             Row-wise online softmax in one fused Triton kernel
PPO training walkthrough (verl)  One PPO step for LLaMA-7B on 8 GPUs: every shape, every NCCL call, TP/DP/PP
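
The online softmax named in the Fused Softmax row can be sketched in plain Python. This is not the Triton kernel from the notebook. It only shows the running-max and running-sum recurrence that lets the normalizer be computed in a single pass over a row. The function names are hypothetical, and the sketch assumes finite inputs.

```python
import math

def online_softmax(row):
    # One pass to build the normalizer: track the running max and a running
    # sum of exponentials, rescaling the sum whenever the max grows.
    running_max = float("-inf")
    running_sum = 0.0
    for x in row:
        new_max = max(running_max, x)
        running_sum = running_sum * math.exp(running_max - new_max) + math.exp(x - new_max)
        running_max = new_max
    # Second loop only writes the outputs; max and sum are already final.
    return [math.exp(x - running_max) / running_sum for x in row]

def two_pass_softmax(row):
    # Reference version: one pass for the max, one for the sum.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

row = [1000.0, 1001.0, 999.5]
online = online_softmax(row)
reference = two_pass_softmax(row)
```

The rescaling factor `exp(running_max - new_max)` keeps every exponent at or below zero, so large logits such as 1000 do not overflow.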

Publishing

The HTML pages under docs/ are published via GitHub Pages (Settings → Pages, with branch main and folder /docs as the source). Notebooks and markdown files render directly on GitHub.
