This repository was archived by the owner on Dec 23, 2025. It is now read-only.
Quick Start
leafspark edited this page Oct 17, 2024 · 1 revision
- Download the latest release for your system from GitHub.
- Extract it to a suitable folder, or run the installer (if using the -setup variant).
- Download the .env.example file from here: file
- Rename it to .env and move it to the AutoGGUF root folder (the folder containing the AutoGGUF-x64 binary). Open it in a text editor of your choice and configure it as necessary.
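
The rename-and-move step can also be scripted. The following is a minimal sketch, not part of AutoGGUF itself: the function name `install_env_file` and both path arguments are hypothetical, and it assumes you have already downloaded `.env.example` somewhere on disk.

```python
import shutil
from pathlib import Path


def install_env_file(example_path, autogguf_root):
    """Copy a downloaded .env.example into the AutoGGUF root folder as .env.

    Both arguments are hypothetical paths supplied by the caller.
    An existing .env is left untouched so manual edits are not lost.
    """
    target = Path(autogguf_root) / ".env"
    if target.exists():
        # Keep whatever configuration is already in place.
        return target
    shutil.copyfile(Path(example_path), target)
    return target
```

After running it, open the resulting `.env` in a text editor and adjust the values as needed.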

- Click Refresh Releases, then select the version and build type for your system.
- Click Download; the release is extracted automatically to the llama_bin directory, ready for use.

- Choose the model directory containing the safetensors files using Browse.
- Choose the output GGUF filename and path.
- Select the desired output precision for the unquantized GGUF.
- Specify the model name (optional, written to the GGUF metadata) and the split max size (optional, splits the model automatically, in place, following the GGUF split specification).
- Click Convert HF to GGUF; the converted model is written to the output path.
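
For reference, the conversion settings above roughly correspond to the options of llama.cpp's `convert_hf_to_gguf.py` script. The sketch below only assembles such a command line; how AutoGGUF actually invokes the converter is an assumption here, and `build_convert_command` with its parameters is a hypothetical helper, not an AutoGGUF API.

```python
import sys


def build_convert_command(model_dir, outfile, outtype="f16",
                          model_name=None, split_max_size=None,
                          script="convert_hf_to_gguf.py"):
    """Assemble an argument list for llama.cpp's convert_hf_to_gguf.py.

    Assumption: the script path is supplied by the caller; this helper
    does not locate llama.cpp and does not run anything.
    """
    cmd = [sys.executable, script, str(model_dir),
           "--outfile", str(outfile),
           "--outtype", outtype]          # output precision, e.g. f16, bf16, f32
    if model_name:
        cmd += ["--model-name", model_name]            # optional GGUF metadata
    if split_max_size:
        cmd += ["--split-max-size", split_max_size]    # optional, e.g. "5G"
    return cmd
```

The returned list could be passed to `subprocess.run(cmd, check=True)` once the script path points at a real llama.cpp checkout.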

- Select the source (unquantized) GGUF model in the model directory, or import one from any path on the system.
- Select one or more quantization types.
- Set the parameters. Allow Requantize is required to quantize an already-quantized GGUF further, although this is not recommended.
- Click Quantize Model.
- The quantized model is written to the Output Path directory.
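
Selecting several quantization types amounts to one quantization run per type. As a rough illustration, the sketch below builds one `llama-quantize` command (the llama.cpp quantization tool) per selected type. The helper name, the output naming scheme `<source>-<type>.gguf`, and the binary location are all assumptions for illustration; AutoGGUF may name files and call the tool differently.

```python
from pathlib import Path


def build_quantize_commands(source_gguf, output_dir, quant_types,
                            allow_requantize=False, binary="llama-quantize"):
    """Return one llama-quantize argument list per requested type.

    Assumptions: the binary path is supplied by the caller and the
    output file naming is hypothetical. Nothing is executed here.
    """
    source = Path(source_gguf)
    commands = []
    for qtype in quant_types:
        output = Path(output_dir) / f"{source.stem}-{qtype}.gguf"
        cmd = [binary]
        if allow_requantize:
            # Needed when the source GGUF is already quantized.
            cmd.append("--allow-requantize")
        cmd += [str(source), str(output), qtype]
        commands.append(cmd)
    return commands
```

Each list can then be run in turn, for example with `subprocess.run(cmd, check=True)`.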