Releases · tangledgroup/llama-cpp-cffi
v0.4.18
v0.4.17
v0.3.1
Added:
- llama-cpp-cffi server - support for dynamic load/unload of models (hot-swap of models on demand)
- llama-cpp-cffi server - compatible with llama.cpp CLI options
- llama-cpp-cffi server - limited compatibility with the OpenAI API `/v1/chat/completions` endpoint for text and vision models
- Support for `CompletionsOptions.messages` for VLM prompts, with a single message containing just a pair of `text` and `image_url` entries in `content` (see the request sketch after this list)
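
A hedged sketch of what such a vision request could look like against the server's OpenAI-compatible endpoint. The host, port, API key, model id, and image URL below are placeholder assumptions, not values documented by the project; only the `/v1/chat/completions` path and the single text/image_url message shape come from the notes above.

```python
# Hedged sketch of a VLM request via the OpenAI-compatible endpoint.
# Host, port, api_key, model id, and image URL are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url='http://127.0.0.1:8080/v1', api_key='not-needed')

response = client.chat.completions.create(
    model='Qwen2-VL-2B-Instruct',  # placeholder model id
    messages=[
        {
            'role': 'user',
            # A single message whose content is just a text / image_url pair,
            # matching the limited VLM support described above.
            'content': [
                {'type': 'text', 'text': 'Describe this image.'},
                {'type': 'image_url', 'image_url': {'url': 'https://example.com/photo.jpg'}},
            ],
        },
    ],
)
print(response.choices[0].message.content)
```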
Changed:
- `llama.cpp` revision `0827b2c1da299805288abbd556d869318f2b121e`
v0.3.0
Added:
- Qwen 2 VL 2B / 7B vision models support
- WIP: llama-cpp-cffi server - compatible with llama.cpp CLI options rather than the OpenAI API
Changed:
- `llama.cpp` revision `5896c65232c7dc87d78426956b16f63fbf58dcf6`
- Refactored the `Options` class into two separate classes: `ModelOptions` and `CompletionsOptions` (see the sketch after this section)
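
A minimal sketch of what the split might look like in use. The import path, field names, and `completions()` call signature are illustrative assumptions, not the library's documented API; only the two class names come from the notes above.

```python
# Illustrative sketch of the ModelOptions / CompletionsOptions split.
# Import path, field names, and completions() signature are assumptions;
# only the two class names come from the release notes.
from llama import completions, ModelOptions, CompletionsOptions  # hypothetical import path

# Model-level settings (which model to load) live in ModelOptions ...
model_options = ModelOptions(model='path/to/model.gguf')  # hypothetical field

# ... while per-request settings live in CompletionsOptions.
completions_options = CompletionsOptions(
    prompt='Explain CFFI in one sentence.',  # hypothetical field
    temperature=0.7,                         # hypothetical field
)

# Assuming completions() streams generated text chunk by chunk.
for chunk in completions(model_options, completions_options):
    print(chunk, end='', flush=True)
```

Separating the two removes the ambiguity of a single options object that mixed load-time and per-request settings.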
Fixed:
- Llava (moondream2, nanoLLaVA-1.5, llava-v1.6-mistral-7b) vision models support
- MiniCPM-V 2.5 / 2.6 vision models support
Removed:
- Removed the ambiguous `Options` class
v0.2.0
Added:
- New high-level Python API
- Low-level C API calls from `llama.h`, `llava.h`, `clip.h`, `ggml.h`
- `completions` - high-level function for LLMs / VLMs (see the sketch after this list)
- `text_completions` - low-level function for LLMs
- `clip_completions` - low-level function for CLIP-based VLMs
- WIP: `mllama_completions` - low-level function for Mllama-based VLMs
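
A hedged sketch of calling the new high-level function. The import path and the `Options` fields are assumptions about the v0.2.0 API (the single `Options` class was only split into `ModelOptions` / `CompletionsOptions` in v0.3.0); only the function name `completions` comes from the list above.

```python
# Illustrative sketch only: import path and Options fields are assumptions,
# not documented v0.2.0 signatures; completions() is the high-level entry
# point named in the release notes.
from llama import completions, Options  # hypothetical import path

options = Options(
    model='path/to/model.gguf',          # hypothetical field
    prompt='Write a haiku about CFFI.',  # hypothetical field
)

# Assuming the high-level function yields text as it is generated.
for chunk in completions(options):
    print(chunk, end='', flush=True)
```

`text_completions` and `clip_completions` are described as the lower-level counterparts, presumably mapping more directly onto the `llama.h` and `clip.h` calls listed above.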
Changed:
- All examples updated for the new API
Removed:
- `llama_generate` function
- `llama_cpp_cli`
- `llava_cpp_cli`
- `minicpmv_cpp_cli`