Lacico/FeynTune

Installation

A ready-made environment is provided with Docker. On Linux or WSL 2, make sure Docker and Docker Compose are installed.

Commands for installation and environment management have been set up in the Makefile.

Install environment (GPU support included):

make install

Launch the container and start JupyterLab in the environment

To launch the container (CPU only) and start a notebook server in the background, run:

make up

Or, with GPU support:

make up-gpu

Once the container is running, retrieve the notebook link by running make logs (the link can take a few seconds to appear in the container log).
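If the log is noisy, the link can be picked out programmatically. The following is a minimal sketch, assuming the container log contains the usual Jupyter startup lines with a URL ending in `?token=<hex>`; the function name `find_notebook_urls` is hypothetical and not part of this repository.

```python
import re

# Assumed URL shape: http(s)://host:port/...?token=<hex digits>, as Jupyter
# typically prints on startup. Adjust the pattern if your log looks different.
JUPYTER_URL = re.compile(r"https?://\S+?\?token=[0-9a-f]+")


def find_notebook_urls(log_text: str) -> list[str]:
    """Return every tokenised notebook URL found in the log text, in order."""
    return JUPYTER_URL.findall(log_text)
```

Pass it whatever text you captured from the container log; it returns an empty list when no link has been printed yet.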

Development

A .devcontainer is provided, which should let you develop from inside the container in VSCode with full IDE support. To enable it, install the VSCode Remote Containers extension and choose "open project in container".

Libraries

HuggingFace Accelerate

DeepSpeed

BitsAndBytes

PEFT

GitHub: QLoRA

FastChat

SpQR paper implementation

Llama Recipes

GGML (post-training quantization and inference, CPU-focused)

GPTQ (post-training quantization, GPU)

Fine Tuning Language Models with Just Forward Passes (code)

Videos

Understanding 4bit Quantization: QLoRA explained (w/ Colab)

PEFT LoRA Explained in Detail - Fine-Tune your LLM on your local GPU

Boost Fine-Tuning Performance of LLM: Optimal Architecture w/ PEFT LoRA Adapter-Tuning on Your GPU

Blog Posts

Anyscale Blog: Fine-tuning Llama 2

Medium: Easily Finetune Llama 2 for Your Text-to-SQL Applications

GitHub: Modal Finetune SQL Tutorial

GitHub: Ray Project - Finetuning LLMS with Deepspeed

Codehammer: How to Load Llama 13B for Inference on a Single RTX 4090

Storm in the Castle: Alpaca 13B

Papers

QLoRA paper

SpQR - Sparse-Quantized Representation (to be integrated in BitsAndBytes)

Fine Tuning Language Models with Just Forward Passes

Running Code

You will need to export an environment variable, HUGGINGFACE_KEY=<your_key>. You can then use the build, run and lint commands defined in the Makefile, e.g. make build. The Docker image is built in multiple stages so that models from Hugging Face are cached rather than downloaded again on every code change. To add more models to the cache, add a line to the ModelPaths class in the model_paths.py module.
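As an illustration of what "adding a line" could look like, here is a minimal sketch. It assumes ModelPaths is a simple collection of named Hugging Face repository IDs; the real class in model_paths.py may be laid out differently. The member names, the example repository IDs, and the helpers `require_hf_key` and `models_to_cache` are all hypothetical.

```python
import os
from enum import Enum


class ModelPaths(str, Enum):
    """Hypothetical stand-in for the class in model_paths.py."""

    LLAMA_2_7B = "meta-llama/Llama-2-7b-hf"
    # Caching one more model would then be a single extra line, for example:
    LLAMA_2_13B = "meta-llama/Llama-2-13b-hf"


def require_hf_key(env=os.environ) -> str:
    """Return HUGGINGFACE_KEY from the environment, or fail with a clear message."""
    key = env.get("HUGGINGFACE_KEY")
    if not key:
        raise RuntimeError("HUGGINGFACE_KEY is not set; export it before building")
    return key


def models_to_cache() -> list[str]:
    """List the repository IDs declared in ModelPaths, in declaration order."""
    return [model.value for model in ModelPaths]
```

Checking for the key up front, as `require_hf_key` does, would surface a missing variable before any download starts rather than partway through a build.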
