gguf-runner is a pure Rust, CPU-first inference runtime for GGUF language models.
The project focuses on:
- straightforward local inference
- readable code structure
- support for multiple model families in one binary
- Build:

      cargo build --release

- Run with a local GGUF file:

      cargo run --release -- \
        --model ./Meta-Llama-3-8B-Instruct-Q4_K_M.gguf \
        --prompt "Explain what this project does."

- Show all options:

      cargo run -- --help

Required flags:

- `--model <path>`
- `--prompt <text>`

Common optional flags (a sampling sketch follows this list showing how the generation parameters typically interact):

- `--system-prompt <text>`
- `--temperature <float>`
- `--top-k <int>`
- `--top-p <float>`
- `--repeat-penalty <float>`
- `--repeat-last-n <int>`
- `--max-tokens <int>`
- `--context-size <int>`
- `--threads <int>`
- `--show-tokens`
- `--show-timings`
- `--profiling`
- `--debug`
- `--url <model-url>` (lazy bootstrap/download path for a missing or invalid local file)
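The sketch below is an illustrative, self-contained Rust example of how these generation flags commonly interact in a sampling pipeline (repetition penalty, then temperature, then top-k and top-p filtering). It is not gguf-runner's actual code: `SamplerConfig`, `sample_next`, and the toy xorshift draw are hypothetical names and stand-ins used only to show the order of operations.

```rust
/// Hypothetical sampling settings mirroring the CLI flags above.
struct SamplerConfig {
    temperature: f32,     // --temperature
    top_k: usize,         // --top-k
    top_p: f32,           // --top-p
    repeat_penalty: f32,  // --repeat-penalty
    repeat_last_n: usize, // --repeat-last-n
}

/// Pick the next token id from raw logits using the common
/// penalty -> temperature -> top-k -> top-p pipeline.
fn sample_next(logits: &[f32], recent: &[usize], cfg: &SamplerConfig, seed: &mut u64) -> usize {
    let mut logits = logits.to_vec();

    // 1. Repetition penalty: dampen tokens seen in the last `repeat_last_n` positions.
    let start = recent.len().saturating_sub(cfg.repeat_last_n);
    for &tok in &recent[start..] {
        let l = &mut logits[tok];
        *l = if *l > 0.0 { *l / cfg.repeat_penalty } else { *l * cfg.repeat_penalty };
    }

    // 2. Temperature: values below 1.0 sharpen the distribution, above 1.0 flatten it.
    for l in logits.iter_mut() {
        *l /= cfg.temperature.max(1e-5);
    }

    // 3. Softmax over the adjusted logits.
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let mut probs: Vec<(usize, f32)> = logits
        .iter()
        .enumerate()
        .map(|(i, &l)| (i, (l - max).exp()))
        .collect();
    let sum: f32 = probs.iter().map(|&(_, p)| p).sum();
    for (_, p) in probs.iter_mut() {
        *p /= sum;
    }

    // 4. Top-k: keep only the k most probable candidates.
    probs.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    probs.truncate(cfg.top_k.max(1));

    // 5. Top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    let mut acc = 0.0;
    let mut cut = probs.len();
    for (i, &(_, p)) in probs.iter().enumerate() {
        acc += p;
        if acc >= cfg.top_p {
            cut = i + 1;
            break;
        }
    }
    probs.truncate(cut);

    // 6. Draw from the surviving candidates (toy xorshift step; a real
    //    runtime would use a proper random source).
    *seed ^= *seed << 13;
    *seed ^= *seed >> 7;
    *seed ^= *seed << 17;
    let total: f32 = probs.iter().map(|&(_, p)| p).sum();
    let mut r = (*seed as f32 / u64::MAX as f32) * total;
    for &(tok, p) in &probs {
        if r <= p {
            return tok;
        }
        r -= p;
    }
    probs.last().map(|&(tok, _)| tok).unwrap_or(0)
}

fn main() {
    let cfg = SamplerConfig {
        temperature: 0.8,
        top_k: 40,
        top_p: 0.95,
        repeat_penalty: 1.1,
        repeat_last_n: 64,
    };
    // Toy logits over a 5-token vocabulary; tokens 1 and 3 were generated recently.
    let logits = [0.1, 2.5, 0.3, 1.8, -0.4];
    let recent = [1usize, 3];
    let mut seed = 0x1234_5678_9abc_def0_u64;
    println!("next token id: {}", sample_next(&logits, &recent, &cfg, &mut seed));
}
```

In broad terms, lower `--temperature` and smaller `--top-k`/`--top-p` make output more deterministic, and the window over which `--repeat-penalty` applies is bounded by `--repeat-last-n`.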
This runtime currently supports multiple model families (Llama, Gemma, Qwen variants), common GGUF quantization types, and platform-specific CPU optimizations.
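For a rough sense of the container format the runtime consumes, here is a minimal, standard-library-only sketch that reads the fixed GGUF header fields (the `GGUF` magic, version, tensor count, and metadata key/value count). It is a standalone illustration, not gguf-runner's parser, and the fallback `model.gguf` path is a placeholder.

```rust
use std::convert::TryInto;
use std::env;
use std::fs::File;
use std::io::{Error, ErrorKind, Read};

/// Read the fixed-size GGUF header: 4-byte magic ("GGUF"), u32 version,
/// u64 tensor count, and u64 metadata key/value count (all little-endian).
fn read_gguf_header(path: &str) -> std::io::Result<(u32, u64, u64)> {
    let mut file = File::open(path)?;
    let mut header = [0u8; 24];
    file.read_exact(&mut header)?;

    if &header[0..4] != b"GGUF" {
        return Err(Error::new(ErrorKind::InvalidData, "not a GGUF file (bad magic)"));
    }

    let version = u32::from_le_bytes(header[4..8].try_into().unwrap());
    let tensor_count = u64::from_le_bytes(header[8..16].try_into().unwrap());
    let kv_count = u64::from_le_bytes(header[16..24].try_into().unwrap());
    Ok((version, tensor_count, kv_count))
}

fn main() -> std::io::Result<()> {
    // The fallback path is a placeholder; pass any local .gguf file as the first argument.
    let path = env::args().nth(1).unwrap_or_else(|| "model.gguf".to_string());
    let (version, tensors, kvs) = read_gguf_header(&path)?;
    println!("GGUF v{version}: {tensors} tensors, {kvs} metadata entries");
    Ok(())
}
```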
- Detailed feature coverage and platform notes: docs/features.md
- Historical benchmark snapshots and performance notes: docs/performance.md
- Current module/layout reference: docs/module-structure.md
Current scope and design constraints:

- CPU inference only
- GGUF model files only
- a focus on transparent implementation over broad framework abstraction
Issues and pull requests are welcome.
Before opening a PR, run:
cargo fmt --all --check
cargo clippy --all-targets --all-features
cargo check