My fun collection of notebooks for finetuning AI models
-
medgemma-4b-it-finetune-sft-lite.ipynb
A very lightweight supervised fine-tuning (SFT) example on MedGemma-4B-IT.
Perfect for quick experiments, small-scale training runs, or learning the basics of finetuning medical-oriented LLMs.
Dataset: ReXVQA, a visual question answering dataset built around medical images.
gemma3-270m-dpo-safety.ipynb
Direct Preference Optimization (DPO) fine-tuning example on Gemma3-270M for safety alignment.
Demonstrates how to train models to avoid harmful and sensitive responses using preference-based learning.
Dataset: Anthropic/hh-rlhf, a preference dataset of chosen and rejected conversations.

More notebooks coming soon... 🚧
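The objective behind the DPO notebook is compact enough to write out directly: the loss pushes the policy to widen its log-probability margin between the chosen and rejected response, relative to a frozen reference model. A pure-Python sketch of the per-example loss (libraries like TRL compute this for you from batched log-probs):

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Per-example DPO loss: -log sigmoid(beta * (policy margin - reference margin)).

    Each argument is the summed log-probability of a full response under
    the policy or the frozen reference model; beta controls how strongly
    the policy is pulled away from the reference.
    """
    margin = (policy_chosen_logp - policy_rejected_logp) - (
        ref_chosen_logp - ref_rejected_logp
    )
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))


# When the policy matches the reference, the loss sits at ln 2 (~0.693);
# it shrinks as the policy prefers the chosen response more than the
# reference does.
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)   # == ln 2
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)    # < ln 2
```

The safety-alignment angle comes entirely from the data: in hh-rlhf, the rejected side tends to contain harmful or unsafe replies, so widening the margin teaches the model to avoid them.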
Got a fun notebook? Found a finetuning trick? 👉 PRs are welcome! Let’s make this the Spotify of AI finetunes 🎧