In this repository, we provide a complete implementation of the training and inference pipeline for LLaDA, applied to arithmetic operations and sorting tasks.
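As background, LLaDA trains a mask predictor with a masked-diffusion objective: for each sequence, a random fraction t ~ U(0, 1) of the tokens is replaced by a mask token, and the model is trained to recover the masked tokens with a cross-entropy loss weighted by 1/t. The following is a minimal PyTorch sketch of that objective as described in the paper; `model`, `MASK_ID`, and the tensor shapes are illustrative assumptions, and `src/main.py` contains the repository's actual implementation.

```python
import torch
import torch.nn.functional as F

MASK_ID = 0  # hypothetical mask-token id; the real one depends on the tokenizer


def llada_training_loss(model, x0):
    """One masked-diffusion training step (sketch of the LLaDA objective).

    x0: LongTensor of shape (batch, seq_len) holding clean token ids.
    """
    b, l = x0.shape
    # Sample a masking ratio t ~ U(0, 1) per sequence.
    t = torch.rand(b, device=x0.device).clamp(min=1e-3)
    # Mask each token independently with probability t.
    is_masked = torch.rand(b, l, device=x0.device) < t.unsqueeze(1)
    xt = torch.where(is_masked, torch.full_like(x0, MASK_ID), x0)
    # The mask predictor outputs vocabulary logits at every position.
    logits = model(xt)  # (batch, seq_len, vocab_size)
    # Cross-entropy on masked positions only, weighted by 1/t as in the paper.
    ce = F.cross_entropy(logits.transpose(1, 2), x0, reduction="none")  # (b, l)
    return ((ce * is_masked).sum(dim=1) / (t * l)).mean()
```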
Install the dependencies:

```bash
conda env create -f environment.yaml
conda activate llm_project
```

To train and run LLaDA:
```bash
python src/main.py --method llada --tokenizer group_pad --num_epochs 5 --number_bits 20 --device cpu --data_size 64000 --batch_size 32 --learning_rate 5e-4 --seq_length 21
```
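For intuition about the inference side of the pipeline, here is a minimal sketch of LLaDA-style sampling with low-confidence remasking: generation starts from a fully masked sequence, the model predicts every masked token in parallel, and the least confident predictions are re-masked on a schedule. The names `model` and `mask_id` and the linear unmasking schedule are illustrative assumptions rather than the exact settings used in `src/main.py`.

```python
import torch


@torch.no_grad()
def llada_sample(model, seq_len, steps=8, mask_id=0):
    """Sketch of LLaDA-style generation with low-confidence remasking."""
    x = torch.full((1, seq_len), mask_id, dtype=torch.long)
    for step in range(steps):
        logits = model(x)                   # (1, seq_len, vocab_size)
        probs = logits.softmax(dim=-1)
        conf, pred = probs.max(dim=-1)      # per-position confidence + argmax
        masked = x == mask_id
        x = torch.where(masked, pred, x)    # fill in all masked positions
        # Number of tokens that should remain masked after this step.
        keep_masked = int(seq_len * (1 - (step + 1) / steps))
        if keep_masked > 0:
            # Re-mask the lowest-confidence newly predicted tokens;
            # already-committed positions are protected with +inf confidence.
            conf = conf.masked_fill(~masked, float("inf"))
            idx = conf.topk(keep_masked, largest=False).indices
            x.scatter_(1, idx, mask_id)
    return x
```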
To train the model using Kaggle's GPU, ensure you have a Kaggle account and API key, adapt the kaggle/kernel-metadata.json file to your Kaggle username, and run:

```bash
kaggle kernels push -p kaggle/
```
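For reference, Kaggle's kernel metadata format looks roughly like the sketch below; every value (kernel id, title, code_file) is an illustrative placeholder, so check the repository's kaggle/kernel-metadata.json for the actual fields. Typically only the username part of the `id` needs to change.

```json
{
  "id": "your-kaggle-username/llada-arithmetic",
  "title": "llada-arithmetic",
  "code_file": "kernel.py",
  "language": "python",
  "kernel_type": "script",
  "is_private": true,
  "enable_gpu": true,
  "enable_internet": true
}
```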
We would like to thank:
- The authors of the paper Large Language Diffusion Models, on which this project is based.
This project was carried out by Nicolas Sereyjol-Garros, Tom Ravaud, Christopher Marouani, and Lounès Meddahi.
