Skip to content

Releases: mikjee/warpdrv

v0.4.8-alpha

08 May 19:50
c4dc124

Choose a tag to compare

warpdrv v0.4.8-alpha

Whats New

  • Elicitation UI for chat via MCP tools
  • Tool Renderers are now auto-matched by tool name and params to render tool-specific UI in chat, such as DiffView for file editing
  • Charts and Graphs can now be rendered by LLM response (via mermaid) in markdown.

Fixes

  • Fixes an issue where multiple MCP server were connected serially, prolonging app start-up time.

v0.4.7-alpha

07 May 10:50
af09db9

Choose a tag to compare

warpdrv v0.4.7-alpha

Whats New

  • Themes
  • Tool Renderers for commonly used tools
  • mcp.json fields for tool renderer adapter

Fixes

  • Fixes Issue #8 Ports not assigned properly when auto-assign is set leading to failure to launch server
  • Fix an issue where multiple chained tool calls, when approved one-by-one would cause multiple inference triggers.

v0.4.5-alpha

03 May 17:58

Choose a tag to compare

warpdrv v0.4.5-alpha

Multiple Critical Fixes

  • Fixes Issue #5 - now -device flag is not sent when multi-gpu is toggled on.
  • Fixes Issue #7 - model scanner properly detects GGUF files within nested directories such as quant-labeled sub-directories with repos
  • Fixes an issue where the update banner won't show when a new release is created.
  • Fixes a critical issue where full layer offload does not offload output layers causing degraded performance - now -ngl 999 flag is sent for full offload.
  • Fixes certain UI elements within Hubs page and the page header.

v0.4.4-alpha

02 May 21:11
72cb687

Choose a tag to compare

warpdrv v0.4.4-alpha

Fixes

  • Fixes Issue #3 about -ngl 999 flag being sent to llama-server despite being overridden in the server edit UI by offloading slider.
  • Fixes a critical issue where maingpu field for multi-split, and device field for single-gpu are out of sync.

v0.4.3-alpha

01 May 22:03

Choose a tag to compare

warpdrv v0.4.3-alpha

What's new

Home page. New landing page with an overview of warpdrv and onboarding steps to get a server running from a fresh install.

Documentation. First batch of how-to guides shipped under docs/guides/:

  • Recipes — automating llama.cpp builds and other LLM-related bash tasks
  • Aliases — routing addresses for servers behind the OpenAI-compatible proxy
  • Backend Groups — swapping llama.cpp builds without re-configuring servers
  • Proxy, Remote Access, and Authentication — direct vs proxied access, bearer tokens, accessing warpdrv from another machine
  • KV Cache Checkpoints — saving and restoring slot state to skip prompt prefill

Fixes

  • Several UI bugs fixed.

Update

  • Updated binaries to open links externally.

Initial - Pre-release v0.4.2-alpha

30 Apr 13:54

Choose a tag to compare

Pre-release

Warpdrv v0.4.2 — Initial Pre-release

Warpdrv is a desktop app for running and managing local LLM inference via
llama.cpp. Purpose-built for testing cutting-edge models and custom backend
builds.

Tauri + React frontend, Node/Express backend, SQLite storage. Linux x64
for now (built and tested on Ubuntu 25.10).

Features

  • Server management. Launch llama-server with full parameter control —
    GPU layers, context size, batch/ubatch, flash attention, KV cache quant,
    device selection, direct I/O, and any custom flag. Per-model param
    overrides save the right settings for each model.

  • Backends. Register multiple llama.cpp builds (CUDA, ROCm, Vulkan, custom)
    and group them for quick swapping. Recipe Engine compiles fresh builds on
    demand from shared bash recipes — run new models the day they drop,
    without waiting for bundled binaries.

  • KV cache checkpoints. Save and restore cache state per thread.

  • Proxy. OpenAI-compatible endpoint with server-alias routing. Point any
    chat app at WarpCore and route to whichever backend you need. Optional auth.

  • Models. Recursive GGUF scan with header parsing (architecture, params,
    quant, context, vocab). Multi-shard and mmproj detection. HF Hub browser
    and download manager.

  • Chat. Threads, folders, full sampling controls, inference-time presets,
    live streaming with reasoning blocks. MCP server integration with per-tool
    permission prompts. Switch servers mid-thread.

Known limitations

  • Recipe scripts do not work on Windows.
  • Linux & Windows x64 only this release.

Install

Find installers for Ubuntu and WIndows in release artifacts.
Issues and PRs welcome.