emb-ai/mllm-course

Multimodal Large Language Models (MLLM)

Rules & FAQ

Class recordings (in Russian)

YouTube, VKVideo

Class Materials

| # | Date | Title | Materials |
|---|------|-------|-----------|
| 1 | Feb 11 | Word Embeddings and Classification & Language Modelling | slides |
| | | Embeddings & CNN/LSTM LMs with PyTorch | notebook |
| 2 | Feb 18 | Seq2seq, Attention, and Transformers | slides |
| | | Transformer from Scratch | notebook |
| 3 | Feb 25 | Pretraining, SFT, RLHF & PEFT, LoRA | slides |
| | | Parameter-efficient fine-tuning | notebook |
| 4 | Mar 4 | Reasoning, RLVF & RAG | slides |
| | | Tokenization | notebook |
| 5 | Mar 11 | Efficient Inference: FlashAttention, KV cache, Distillation, Quantization | |
| 6 | Mar 18 | Introduction to MLLMs and Image Modality | slides |
| | | Classification of VLMs: Deep Fusion vs Early Fusion | notebook |
| 7 | Mar 25 | VLLM and Data Generation | slides |
| | | Visual Autoregressive Transformer | notebook |
| 8 | Apr 1 | Video Understanding | slides |
| | | Video Modality and Any-to-any Models | notebook |
| 9 | Apr 8 | Action Modality (Robotics) | |
| | | Intro to Vision Language Action Models | notebook |
| 10 | Apr 15 | Multimodal Agents | |
| 11 | Apr 22 | 3D Data Modality | |
| 12 | Apr 29 | Guest Lecture | |

About

Multimodal LLMs course
