This repository was archived by the owner on Dec 23, 2025. It is now read-only.

Quick Start

leafspark edited this page Oct 17, 2024 · 1 revision

Install

  1. Download the latest release for your system from GitHub.
  2. Extract it to a suitable folder, or run the installer (if you downloaded the -setup variant).
  3. Download the .env.example file from here: file
  4. Rename it to .env and move it to the AutoGGUF root folder (where the AutoGGUF-x64 binary is). Open it in a text editor of your choice and configure it as necessary.
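
Steps 3 and 4 can also be scripted. The following is a minimal sketch, not part of AutoGGUF itself: the function name `install_env_file` and the choice to leave an existing .env untouched are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def install_env_file(example_path, app_root):
    """Copy a downloaded .env.example into app_root as .env.

    Hypothetical helper: an existing .env is never overwritten, so
    settings you have already edited are preserved.
    """
    target = Path(app_root) / ".env"
    if target.exists():
        return target
    shutil.copyfile(example_path, target)
    return target
```

For example, `install_env_file("Downloads/.env.example", "AutoGGUF")` would place the file next to the binary, assuming those two paths match your setup.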

Backend


  1. Click Refresh Releases, then select the version and build type for your system.
  2. Click Download. The release is extracted automatically to the llama_bin directory, ready for use.
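
If you already have a backend archive on disk, the extraction step can be approximated by hand. This is only a sketch of the idea: the helper name `extract_backend` and the one-subfolder-per-archive layout under llama_bin are assumptions, not a description of how AutoGGUF organizes the directory.

```python
import zipfile
from pathlib import Path


def extract_backend(archive_path, dest_root="llama_bin"):
    """Extract a backend .zip archive into a subfolder of dest_root.

    Assumption: each archive gets its own subfolder named after the
    archive file (without the .zip extension).
    """
    archive_path = Path(archive_path)
    target = Path(dest_root) / archive_path.stem
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target)
    return target
```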

HuggingFace -> GGUF (unquantized)


  1. Choose the model directory containing the safetensors files using Browse.
  2. Choose the output GGUF filename and path.
  3. Select the desired output precision for the unquantized GGUF.
  4. Optionally, specify a model name (stored in the GGUF metadata) and a maximum split size (splits the model automatically, in place, using the GGUF split specification).
  5. Click Convert HF to GGUF. The converted model is written to the output path.
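
For reference, the options above map naturally onto a command line. The sketch below assumes the conversion is performed by llama.cpp's `convert_hf_to_gguf.py` script, which this page does not state; the flag names (`--outfile`, `--outtype`, `--model-name`, `--split-max-size`) are taken from that script and should be checked against `--help` for your llama.cpp version. The function `build_convert_command` is hypothetical and only builds the argument list; it does not run anything.

```python
def build_convert_command(model_dir, outfile, outtype="f16",
                          model_name=None, split_max_size=None,
                          script="convert_hf_to_gguf.py"):
    """Build an argument list for an HF -> GGUF conversion.

    Assumption: the script is llama.cpp's convert_hf_to_gguf.py and
    accepts the flags used below.
    """
    cmd = ["python", script, str(model_dir),
           "--outfile", str(outfile),
           "--outtype", outtype]
    # Optional settings from step 4: model name and split size.
    if model_name:
        cmd += ["--model-name", model_name]
    if split_max_size:
        cmd += ["--split-max-size", split_max_size]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at a real model directory and script location.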

GGUF (unquantized) -> GGUF (quantized)


  1. Select the source (unquantized) GGUF model in the model directory, or import one from any path on the system.
  2. Select one or more quantization types.
  3. Set the parameters. Allow Requantize is required to quantize an already-quantized GGUF further, although this is not recommended.
  4. Click Quantize Model.
  5. The quantized model is written to the Output Path directory.
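
Selecting several quantization types amounts to one quantization run per type. The sketch below assumes the backend binary is llama.cpp's `llama-quantize`, invoked as `llama-quantize [--allow-requantize] input.gguf output.gguf TYPE`; the helper name `build_quantize_commands` and the `<source>-<TYPE>.gguf` output naming are assumptions for illustration, not AutoGGUF's actual scheme.

```python
from pathlib import Path


def build_quantize_commands(source_gguf, output_dir, quant_types,
                            allow_requantize=False,
                            binary="llama-quantize"):
    """Build one llama-quantize argument list per quantization type.

    Assumption: output files are named <source stem>-<TYPE>.gguf
    inside output_dir.
    """
    stem = Path(source_gguf).stem
    commands = []
    for qtype in quant_types:
        out = Path(output_dir) / f"{stem}-{qtype}.gguf"
        cmd = [binary]
        # Only needed when the source is itself already quantized.
        if allow_requantize:
            cmd.append("--allow-requantize")
        cmd += [str(source_gguf), str(out), qtype]
        commands.append(cmd)
    return commands
```

Calling it with `["Q4_K_M", "Q8_0"]` yields two command lists, one per type, each of which could be run with `subprocess.run`.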
