This repository was archived by the owner on Dec 23, 2025. It is now read-only.

Quick Start


leafspark edited this page Oct 17, 2024 · 1 revision

Install

Download the latest release for your system from GitHub.
Extract it to a suitable folder, or run the installer (if using the -setup variant).
Download the .env.example file from the AutoGGUF repository.
Rename it to .env and move it to the AutoGGUF root folder (where the AutoGGUF-x64 binary is). Open it in a text editor and configure as necessary.
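
The rename-and-move step can also be scripted. The following is a minimal sketch, not part of AutoGGUF: the function name install_env_file and both paths in the usage note are hypothetical, and it assumes you have already downloaded .env.example.

```python
import shutil
from pathlib import Path


def install_env_file(example_file: str, autogguf_root: str) -> Path:
    """Copy a downloaded .env.example into the AutoGGUF root folder as .env.

    Hypothetical helper for illustration; AutoGGUF does not ship this function.
    """
    source = Path(example_file)
    target = Path(autogguf_root) / ".env"
    if target.exists():
        # Leave an existing configuration untouched rather than overwrite it.
        return target
    shutil.copyfile(source, target)
    return target
```

For example, install_env_file("Downloads/.env.example", "AutoGGUF") would place the file next to the binary, after which it can be opened and edited as usual. Both paths are placeholders for wherever the files live on your system.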

Backend

Click Refresh Releases, then select the version and build type for your system.
Click Download. The release is extracted automatically to the llama_bin directory, ready for use.

HuggingFace -> GGUF (unquantized)

Use Browse to choose the model directory containing the safetensors files.
Choose the output GGUF filename and path.
Select the desired output precision for the unquantized GGUF.
Optionally specify a model name (stored in the GGUF metadata) and a split max size (to split the model automatically, in place, according to the GGUF split specification).
Click Convert HF to GGUF. The converted file is written to the output path.
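
For readers who want to see what these fields correspond to on the command line, here is a rough sketch. It assumes the conversion is comparable to running llama.cpp's convert_hf_to_gguf.py script, which accepts --outfile, --outtype, --model-name and --split-max-size. How AutoGGUF actually invokes the backend is not documented on this page, and the function name and example paths are hypothetical.

```python
import sys
from typing import List, Optional


def build_convert_command(
    model_dir: str,
    outfile: str,
    outtype: str = "f16",
    model_name: Optional[str] = None,
    split_max_size: Optional[str] = None,
    script: str = "convert_hf_to_gguf.py",  # assumed location of the llama.cpp script
) -> List[str]:
    """Build (but do not run) an argv list for an HF -> GGUF conversion."""
    cmd = [sys.executable, script, model_dir, "--outfile", outfile, "--outtype", outtype]
    if model_name:
        # Optional: stored in the GGUF metadata.
        cmd += ["--model-name", model_name]
    if split_max_size:
        # Optional: maximum size per split file, e.g. "50G".
        cmd += ["--split-max-size", split_max_size]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) once the script path points at a real llama.cpp checkout. Building the list separately from running it makes the mapping from GUI fields to arguments easy to inspect.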

GGUF (unquantized) -> GGUF (quantized)

Select the source (unquantized) GGUF model from the model directory, or import one from any path on the system.
Select one or more quantization types.
Set parameters. Allow Requantize is required when the source is already quantized and you want to quantize it further, although this is not recommended.
Click Quantize Model.
The quantized model is written to the Output Path directory.
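
The quantization steps can be pictured as one backend call per selected type. The sketch below assumes the backend tool is llama.cpp's llama-quantize, whose usage is: input file, output file, quantization type, with an optional --allow-requantize flag. The function name and the output naming scheme (source name plus type suffix) are hypothetical and may differ from what AutoGGUF produces.

```python
from pathlib import Path
from typing import List


def build_quantize_commands(
    source_gguf: str,
    output_dir: str,
    quant_types: List[str],
    allow_requantize: bool = False,
    binary: str = "llama-quantize",  # assumed to be found in llama_bin or on PATH
) -> List[List[str]]:
    """Build (but do not run) one llama-quantize argv list per quantization type."""
    source = Path(source_gguf)
    commands = []
    for qtype in quant_types:
        # Hypothetical naming scheme: <source name>-<type>.gguf
        out_path = Path(output_dir) / f"{source.stem}-{qtype}.gguf"
        cmd = [binary]
        if allow_requantize:
            # Only needed when the source file is already quantized.
            cmd.append("--allow-requantize")
        cmd += [str(source), str(out_path), qtype]
        commands.append(cmd)
    return commands
```

Calling build_quantize_commands("models/model-f16.gguf", "quantized", ["Q4_K_M", "Q8_0"]) yields two commands, one per type. Each can be run with subprocess.run(cmd, check=True).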