LLM fine-tuning for the procurement domain using LoRA (Low-Rank Adaptation).
- Base Model: Phi-3-mini-4k-instruct
- Method: QLoRA (4-bit quantization + LoRA)
- Framework: Transformers, PEFT, TRL
- Inference: Ollama (GGUF format)
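The core idea behind the LoRA method listed above can be sketched in plain Python: instead of updating the full weight matrix `W`, train two small matrices `A` (r × k) and `B` (d × r) and apply `W_eff = W + (alpha / r) · B @ A`. This is a toy numeric illustration only, not the training code; the actual run uses r = 64 and alpha = 128 on Phi-3's weight matrices.

```python
# Toy illustration of the LoRA weight update: W_eff = W + (alpha / r) * B @ A.
# With r << min(d, k), only B and A are trained, so far fewer parameters move.

def matmul(X, Y):
    # Naive matrix multiply, enough for this illustration.
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_effective_weight(W, A, B, alpha, r):
    # Scale the low-rank update by alpha / r and add it to the frozen W.
    scale = alpha / r
    BA = matmul(B, A)
    return [[W[i][j] + scale * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Tiny example: d = k = 2, rank r = 1 (the real config uses r = 64).
W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[1.0, 2.0]]        # r x k
B = [[0.5], [0.25]]     # d x r
W_eff = lora_effective_weight(W, A, B, alpha=2, r=1)
print(W_eff)  # → [[2.0, 2.0], [0.5, 2.0]]
```

QLoRA applies the same update on top of a 4-bit-quantized base model, which is what keeps the 8GB VRAM requirement feasible.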
```bash
# 1. Setup
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# 2. Prepare data
python generate_training_data.py

# 3. Train
python train.py

# 4. Merge & Convert
python merge_model.py
./convert_gguf.sh

# 5. Deploy
ollama create procurement -f Modelfile
ollama run procurement
```

```
├── sample_data.json            # Input: procurement records
├── generate_training_data.py   # Data → JSONL converter
├── train.py                    # LoRA fine-tuning
├── merge_model.py              # LoRA + base → merged model
├── convert_gguf.sh             # HF → GGUF conversion
├── convert_to_ollama.sh        # Full pipeline script
├── Modelfile                   # Ollama config
└── INSTRUCTIONS.md             # Detailed conversion guide
```
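For the deploy step, the Modelfile referenced above might look roughly like this. The GGUF filename, parameter values, and system prompt are illustrative assumptions, not the repo's actual Modelfile:

```
# Illustrative Modelfile sketch — filename, parameters, and prompt are assumptions.
FROM ./merged-model.gguf

PARAMETER temperature 0.2
PARAMETER num_ctx 4096

SYSTEM "You are a procurement assistant. Answer questions about suppliers, products, and prices."
```

`ollama create procurement -f Modelfile` registers the GGUF file under the name `procurement`, after which `ollama run procurement` serves it locally.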
Input (`sample_data.json`):

```json
{
  "contractor": "Supplier Name",
  "products": [
    {"name": "Product", "unit_price": 100, "quantity": 10}
  ]
}
```

Output (`train_data.jsonl`):

```json
{"instruction": "Find me Product X", "input": "", "output": "Supplier: ..."}
```

| Parameter | Value |
|---|---|
| LoRA rank | 64 |
| LoRA alpha | 128 |
| Epochs | 3 |
| Batch size | 4 |
| Learning rate | 2e-4 |
| Quantization | 4-bit (NF4) |
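The data-preparation step above (`sample_data.json` → `train_data.jsonl`) can be sketched as follows. This is a hypothetical minimal version: field names follow the README's sample record, but the real `generate_training_data.py` may generate more varied instruction/output pairs.

```python
# Hypothetical sketch of the sample_data.json -> train_data.jsonl conversion.
# Field names match the README's example record; prompt wording is illustrative.
import json

def records_to_jsonl(records):
    """Turn {contractor, products[...]} records into instruction-tuning lines."""
    lines = []
    for rec in records:
        for prod in rec["products"]:
            example = {
                "instruction": f"Find me {prod['name']}",
                "input": "",
                "output": (f"Supplier: {rec['contractor']}, "
                           f"unit price: {prod['unit_price']}, "
                           f"quantity: {prod['quantity']}"),
            }
            lines.append(json.dumps(example, ensure_ascii=False))
    return "\n".join(lines)

sample = [{"contractor": "Supplier Name",
           "products": [{"name": "Product", "unit_price": 100, "quantity": 10}]}]
print(records_to_jsonl(sample))
```

Each product becomes one JSONL line, the format TRL's supervised fine-tuning pipeline can consume directly.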
- CUDA GPU (8GB+ VRAM)
- Python 3.10+
- Ollama (for deployment)