Manage your local GGUF models without remembering file paths or llama.cpp commands.
GGLib keeps a catalog of your GGUFs, handles downloading from HuggingFace, and starts llama-server for you. Use it from the terminal, a desktop app, a web UI, or as an OpenAI-compatible API — they all share the same database and model directory.
```bash
# Download a model from HuggingFace
gglib download bartowski/Qwen2.5-7B-Instruct-GGUF

# List what you have
gglib list

# Start chatting (launches llama-server automatically)
gglib chat qwen2.5

# Serve a model and use it from any OpenAI client
gglib serve qwen2.5

# Pipe anything into a question
cat error.log | gglib question "what went wrong?"

# Or skip the CLI — open the desktop app or web UI
gglib gui
gglib web
```

GGLib treats your local model like a Unix tool. Pipe in any text and ask a question — no API keys, no cloud, no context-window gymnastics. Use `gglib question` (or `gglib q` for short).
```bash
# Code review a PR diff
git diff main | gglib q "review this for bugs and suggest improvements"

# Understand an error log
journalctl -u myapp --since "1 hour ago" | gglib q "what caused this crash?"

# Summarize a man page
man rsync | gglib q "how do I sync only .rs files, excluding target/?"

# Explain unfamiliar config
cat nginx.conf | gglib q "explain the proxy_pass rules"

# Quick code explanation
cat src/main.rs | gglib q "what does this program do?"

# Get a commit message from staged changes
git diff --cached | gglib q "write a concise commit message for these changes"

# Translate a file
cat README_ja.md | gglib q "translate this to English"

# Use {} as a placeholder to control where input goes
echo "segfault at 0x0" | gglib q "I got this error: {}. What does it mean?"

# Read context from a file instead of stdin
gglib q --file Cargo.toml "what dependencies does this project use?"
```

Works with any command that produces text. If you can `cat` it, you can ask a local model about it.
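Conceptually, the `{}` placeholder splices the piped input into your prompt, giving you control over where it lands. Here is a rough bash sketch of that substitution, purely as an illustration of the behavior (it is not gglib's actual implementation):

```shell
# Illustration only: emulate the {} placeholder with plain bash substitution.
prompt='I got this error: {}. What does it mean?'
input='segfault at 0x0'

# Replace the first {} in the prompt with the piped-in text.
question="${prompt/\{\}/$input}"
echo "$question"
```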
A Cargo workspace with compile-time-enforced boundaries: dependencies flow adapters → infrastructure → core, never the reverse.
```
Core Layer
  gglib-core — pure domain types, ports & traits (no infra deps)
       │
       ▼
Infrastructure Layer
  gglib-db     — SQLite repos
  gglib-gguf   — GGUF file parser
  gglib-mcp    — MCP servers
  gglib-proxy  — OpenAI-compat proxy
  gglib-voice  — voice mode (STT/TTS/VAD)

  External Gateways:
    gglib-runtime  — process lifecycle manager; the ONLY component that
                     spawns & manages llama-server
    gglib-download — download orchestrator; the ONLY component that contacts
                     HuggingFace Hub (via gglib-hf + optional hf_xet subprocess)
       │
       ▼
Facade Layer
  gglib-gui — shared GUI backend (ensures feature parity across adapters)
       │
       ▼
Adapter Layer
  gglib-cli   — CLI interface (terminal UI)
  gglib-axum  — HTTP server; serves the React UI as static assets
  gglib-tauri — desktop application; embeds the React UI (WebView assets)
                plus an embedded Axum server (HTTP endpoints)

  All adapters call the infrastructure layer via:
    • the External Gateways (runtime, download)
    • the other infrastructure services (db, gguf, mcp, proxy)
       │
       ▼
External Systems (reached only through the External Gateways)
  llama-server instances — child processes, spawned by gglib-runtime
  HuggingFace Hub API    — HTTPS endpoints, called by gglib-download
```
Only gglib-runtime spawns llama-server processes; only gglib-download talks to HuggingFace. Everything else goes through the infrastructure layer.
Each crate has its own README with architecture diagrams, module breakdowns, and design decisions:
| Layer | Crate | Description |
|---|---|---|
| Core | gglib-core | Pure domain types, ports & traits |
| App | gglib-agent | Pure-domain agentic loop (LLM→tool→LLM, port-injected) |
| Infra | gglib-db | SQLite repository implementations |
| Infra | gglib-gguf | GGUF file format parser |
| Infra | gglib-hf | HuggingFace Hub client |
| Infra | gglib-download | Download queue & manager |
| Infra | gglib-mcp | MCP server management |
| Infra | gglib-proxy | OpenAI-compatible proxy server |
| Infra | gglib-runtime | Process manager & system probes |
| Infra | gglib-voice | Voice pipeline (STT/TTS/VAD) |
| Facade | gglib-gui | Shared GUI backend (feature parity) |
| Adapter | gglib-cli | CLI interface |
| Adapter | gglib-axum | HTTP API server |
| Adapter | gglib-tauri | Desktop GUI (Tauri + React) |
| Utility | gglib-build-info | Compile-time version & git metadata |
- `components` – React UI components
- `contexts` – React Context providers
- `hooks` – Custom React hooks
- `pages` – Top-level page components
- `types` – Shared TypeScript type definitions
- `utils` – Shared helpers (formatting, SSE, platform detection)
- `services` – API client layer (HTTP and Tauri IPC)
- `commands` – CLI command reference (download, llama management)
All interfaces share the same database and model directory. Pick whichever fits your workflow — or use several.
| Interface | Launch | Details |
|---|---|---|
| CLI | `gglib <command>` | `gglib-cli` |
| Desktop GUI | `gglib gui` | `gglib-tauri`, `src-tauri` |
| Web UI | `gglib web` | `gglib-axum` — default `0.0.0.0:9887` |
| OpenAI Proxy | `gglib proxy` | `gglib-proxy` — works with OpenWebUI, any OpenAI SDK |
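Since the proxy exposes an OpenAI-compatible API, any plain HTTP client can talk to it. A minimal curl sketch follows; the port is a placeholder (check the output of `gglib proxy` for the real listen address), and the model name must match one from `gglib list`:

```shell
# Placeholder address: confirm the actual port from `gglib proxy` startup output.
PROXY_URL="http://127.0.0.1:8000/v1/chat/completions"

# Standard OpenAI chat-completions payload.
payload='{"model": "qwen2.5", "messages": [{"role": "user", "content": "Say hello"}]}'
echo "$payload"

# Uncomment once the proxy is running:
# curl -s "$PROXY_URL" -H "Content-Type: application/json" -d "$payload"
```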
Security notes
- Web server binds `0.0.0.0` (LAN-accessible); proxy binds `127.0.0.1` (local only) by default
- No authentication — designed for trusted networks
- Use firewall rules, private subnets, or a VPN; do not expose to the public internet without additional auth
Download from the Releases page:
| Platform | Archive | Post-install |
|---|---|---|
| macOS (Apple Silicon) | `gglib-gui-*-aarch64-apple-darwin.tar.gz` | Run `macos-install.command` to remove quarantine |
| macOS (Intel) | `gglib-gui-*-x86_64-apple-darwin.tar.gz` | Same as above |
| Linux | `gglib-gui-*-x86_64-unknown-linux-gnu.tar.gz` | Run `gglib-gui` |
| Windows | `gglib-gui-*-x86_64-pc-windows-msvc.zip` | Run `gglib-gui.exe` |
```bash
git clone https://github.com/mmogr/gglib.git && cd gglib
make setup   # check deps → build frontend → install CLI → offer llama.cpp install
```

`make setup` checks for Rust, Node.js, and build tools; provisions the Miniconda environment for the hf_xet fast download helper; builds the web UI; and installs the CLI to `~/.cargo/bin/`. It exits with an error if Python/Miniconda is missing — run it first on new machines.
**Developer Mode:** When installed via `make setup`, the database (`gglib.db`), config (`.env`), and llama.cpp binaries live inside your repo folder. Downloaded models default to `~/.local/share/llama_models`.
- Rust 1.70+ (MSRV). Tooling/CI currently pins Rust 1.91.0 via `rust-toolchain.toml`; using that version is recommended — rustup.rs
- Python 3 via Miniconda — miniconda (for hf_xet fast downloads)
- Node.js 20.19+ (matches the `package.json` `engines` field) — nodejs.org (for web UI)
- SQLite 3.x
- Build tools: macOS `xcode-select --install` + `brew install cmake` · Ubuntu `build-essential cmake git` · Windows VS 2022 C++ + CMake
llama.cpp is managed by GGLib — no separate install needed.
Makefile targets
Installation & Setup:
- `make setup` — Full setup (dependencies + build + install + llama.cpp)
- `make install` — Build and install CLI to `~/.cargo/bin/`
- `make uninstall` — Full cleanup (removes binary, system data, database; preserves models)
Building:
- `make build` / `make build-dev` — Release / debug binary
- `make build-gui` — Web UI frontend
- `make build-tauri` — Desktop GUI
- `make build-all` — Everything (CLI + web UI)
Development:
`make test` / `make check` / `make fmt` / `make lint` / `make doc`
llama.cpp:
`make llama-install-auto` / `make llama-status` / `make llama-update`
Running:
`make run-gui` / `make run-web` / `make run-serve` / `make run-proxy`
Cleaning:
`make clean` / `make clean-gui` / `make clean-llama` / `make clean-db`
Manual installation (Cargo)

```bash
cargo install --path .
```

Configuring the models directory
Default: `~/.local/share/llama_models`. Change via any of:

- `make setup` prompt
- `.env` file: `GGLIB_MODELS_DIR=/absolute/path`
- CLI: `gglib config models-dir set <path>` or `gglib --models-dir <path> download …`
- GUI/Web: Settings modal (gear icon)
Precedence: CLI flag → env var → default. Changing the directory does not move existing files.
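The precedence rule reads naturally as shell parameter defaults. A sketch of the documented resolution order (this mirrors the behavior; it is not gglib's code):

```shell
# CLI flag → GGLIB_MODELS_DIR env var → built-in default, in that order.
cli_flag=""   # would be populated by --models-dir; empty means "not passed"
models_dir="${cli_flag:-${GGLIB_MODELS_DIR:-$HOME/.local/share/llama_models}}"
echo "Using models dir: $models_dir"
```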
Start the backend and frontend in separate terminals:
```bash
# Backend API server
cargo run --package gglib-cli -- web --api-only --port 9887 --base-port 9000

# Frontend dev server (proxies /api/* to backend)
npm run dev
# → http://localhost:5173
```

Or use the VS Code task 🚀 Run Dev (Frontend + Backend) to launch both in parallel.
Port configuration
Set `VITE_GGLIB_WEB_PORT` in `.env` to change the API port (default 9887). Both the Rust backend (via clap env) and the Vite proxy read this value. The `VITE_` prefix is required for Vite. Port config only affects dev mode — production uses same-origin relative paths. Tauri uses dynamic port discovery.
Production builds
```bash
npm run build   # → ./web_ui/
cargo run --package gglib-cli -- web --port 9887 --static-dir ./web_ui   # single-port serving
```

Accelerated downloads (hf_xet)
GGLib bundles a managed Python helper for hf_xet fast downloads. On first run (or after `make setup` / `gglib check-deps`), it provisions a Miniconda environment under `<data_root>/.conda/gglib-hf-xet` and installs `huggingface_hub>=1.1.5` + `hf_xet>=0.6`. There is no legacy Rust HTTP fallback — if the helper is missing, `gglib download` will fail until the environment is repaired.
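When downloads fail, a quick first check is whether the helper environment directory exists. A sketch follows; the data root location depends on install mode (the repo folder under Developer Mode), so `DATA_ROOT` below is an assumption you must adapt:

```shell
# DATA_ROOT is an assumption: point it at your actual gglib data root.
DATA_ROOT="${DATA_ROOT:-.}"
env_dir="$DATA_ROOT/.conda/gglib-hf-xet"

if [ -d "$env_dir" ]; then
  echo "hf_xet helper env present: $env_dir"
else
  echo "hf_xet helper env missing; re-run 'make setup' or 'gglib check-deps'"
fi
```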
VS Code tasks
- 🚀 Run Dev (Frontend + Backend) — parallel launch
- 🧠 Run Backend Dev (API-only) — backend only
- 🎨 Run Frontend Dev — Vite dev server
- 🖥️ Run GUI (Dev) — Tauri desktop in dev mode
- 🧪 Run All Tests / 📎 Clippy / 🎨 Format Code
Auto-generated from source and updated with every release.