Releases · tangledgroup/llama-cpp-cffi
v0.4.18
v0.4.17
v0.3.1
Added:
- llama-cpp-cffi server - support for dynamic load/unload of models (hot-swap of models on demand)
- llama-cpp-cffi server - compatible with llama.cpp CLI options
- llama-cpp-cffi server - limited compatibility with the OpenAI API `/v1/chat/completions` endpoint for text and vision models
- Support for `CompletionsOptions.messages` for VLM prompts, with a single message containing just a pair of `text` and `image_url` entries in `content` (see the request sketch after this list)
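
A hedged sketch of what such a vision request could look like against the server's OpenAI-compatible endpoint. The host, port, API key, model id, and image URL below are placeholder assumptions, not values documented by the project; only the `/v1/chat/completions` path and the single text/image_url message shape come from the notes above.

```python
# Hedged sketch of a VLM request via the OpenAI-compatible endpoint.
# Host, port, api_key, model id, and image URL are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url='http://127.0.0.1:8080/v1', api_key='not-needed')

response = client.chat.completions.create(
    model='Qwen2-VL-2B-Instruct',  # placeholder model id
    messages=[
        {
            'role': 'user',
            # A single message whose content is just a text / image_url pair,
            # matching the limited VLM support described above.
            'content': [
                {'type': 'text', 'text': 'Describe this image.'},
                {'type': 'image_url', 'image_url': {'url': 'https://example.com/photo.jpg'}},
            ],
        },
    ],
)
print(response.choices[0].message.content)
```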
Changed:
- `llama.cpp` revision `0827b2c1da299805288abbd556d869318f2b121e`
v0.3.0
Added:
- Qwen 2 VL 2B / 7B vision models support
- WIP: llama-cpp-cffi server - compatible with llama.cpp CLI options rather than the OpenAI API
Changed:
- `llama.cpp` revision `5896c65232c7dc87d78426956b16f63fbf58dcf6`
- Refactored the `Options` class into two separate classes: `ModelOptions` and `CompletionsOptions` (see the sketch after this section)
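
A minimal sketch of what the split might look like in use. The import path, field names, and `completions()` call signature are illustrative assumptions, not the library's documented API; only the two class names come from the notes above.

```python
# Illustrative sketch of the ModelOptions / CompletionsOptions split.
# Import path, field names, and completions() signature are assumptions;
# only the two class names come from the release notes.
from llama import completions, ModelOptions, CompletionsOptions  # hypothetical import path

# Model-level settings (which model to load) live in ModelOptions ...
model_options = ModelOptions(model='path/to/model.gguf')  # hypothetical field

# ... while per-request settings live in CompletionsOptions.
completions_options = CompletionsOptions(
    prompt='Explain CFFI in one sentence.',  # hypothetical field
    temperature=0.7,                         # hypothetical field
)

# Assuming completions() streams generated text chunk by chunk.
for chunk in completions(model_options, completions_options):
    print(chunk, end='', flush=True)
```

Separating the two removes the ambiguity of a single options object that mixed load-time and per-request settings.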
Fixed:
- Llava (moondream2, nanoLLaVA-1.5, llava-v1.6-mistral-7b) vision models support
- MiniCPM-V 2.5 / 2.6 vision models support
Removed:
- Removed the ambiguous `Options` class
v0.2.0
Added:
- New high-level Python API
- Low-level C API calls from `llama.h`, `llava.h`, `clip.h`, `ggml.h`
- `completions` - high-level function for LLMs / VLMs (see the sketch after this list)
- `text_completions` - low-level function for LLMs
- `clip_completions` - low-level function for CLIP-based VLMs
- WIP: `mllama_completions` - low-level function for Mllama-based VLMs
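
A hedged sketch of calling the new high-level function. The import path and the `Options` fields are assumptions about the v0.2.0 API (the single `Options` class was only split into `ModelOptions` / `CompletionsOptions` in v0.3.0); only the function name `completions` comes from the list above.

```python
# Illustrative sketch only: import path and Options fields are assumptions,
# not documented v0.2.0 signatures; completions() is the high-level entry
# point named in the release notes.
from llama import completions, Options  # hypothetical import path

options = Options(
    model='path/to/model.gguf',          # hypothetical field
    prompt='Write a haiku about CFFI.',  # hypothetical field
)

# Assuming the high-level function yields text as it is generated.
for chunk in completions(options):
    print(chunk, end='', flush=True)
```

`text_completions` and `clip_completions` are described as the lower-level counterparts, presumably mapping more directly onto the `llama.h` and `clip.h` calls listed above.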
Changed:
- All examples updated for the new API
Removed:
- `llama_generate` function
- `llama_cpp_cli`
- `llava_cpp_cli`
- `minicpmv_cpp_cli`