Multimodal Large Language Models (MLLM)

#	Date	Title	Materials
1	Feb 11	Word Embeddings and Classification & Language Modelling	slides
1	Feb 11	Embeddings & CNN/LSTM LMs with PyTorch	notebook
2	Feb 18	Seq2seq, Attention, and Transformers	slides
2	Feb 18	Transformer from Scratch	notebook
3	Feb 25	Pretraining, SFT, RLHF & PEFT, LoRA	slides
3	Feb 25	Parameter-efficient fine-tuning	notebook
4	Mar 4	Reasoning, RLVF & RAG	slides
4	Mar 4	Tokenization	notebook
5	Mar 11	Efficient Inference: FlashAttention, KV cache, Distillation, Quantization
5	Mar 11
6	Mar 18	Introduction to MLLMs and Image Modality	slides
6	Mar 18	Classification of VLMs: Deep Fusion vs Early Fusion	notebook
7	Mar 25	VLLM and Data Generation	slides
7	Mar 25	Visual Autoregressive Transformer	notebook
8	Apr 1	Video Understanding	slides
8	Apr 1	Video Modality and Any-to-any Models	notebook
9	Apr 8	Action Modality (Robotics)
9	Apr 8	Intro to Vision Language Action Models	notebook
10	Apr 15	Multimodal Agents
10	Apr 15
11	Apr 22	3D Data Modality
11	Apr 22
12	Apr 29	Guest Lecture
12	Apr 29

01-w2v-cls-langmodeling
02-seq2seq
03-llm
04-reason-rag
06-images
07-generation
08-video
09-vla
.gitattributes
FAQ.md
README.md
