[Feature Request] New prompt upscaler: xllamacpp

As [hoped](https://github.com/Teriks/dgenerate/discussions/53#discussioncomment-14544399), xllamacpp binaries now available for non-CUDA GPUs, providing a more modern alternative to GPT4all.

The relevant indexes are:

| Device Type | xllamacpp Build | Index |
|-------------------|-------------------------|----------|
| NVIDIA | CUDA | https://xorbitsai.github.io/xllamacpp/whl/cu128 |
| AMD | Vulkan* | https://xorbitsai.github.io/xllamacpp/whl/vulkan |
| Intel (XPU) | Vulkan | https://xorbitsai.github.io/xllamacpp/whl/vulkan |
| Apple Silicon | Metal | Default (PyPI) |

_\* ROCm builds are available for Linux but are not as performant (see [xllamacpp#61 (comment)](https://github.com/xorbitsai/xllamacpp/issues/61#issuecomment-3342111587))_

XLllamaCPP inference is normally handled by [Xorbits Inference](https://github.com/xorbitsai/inference/blob/7c0108c4df67da92e11f88213432eb95b9cf94dc/xinference/model/llm/llama_cpp/core.py). However, this is a [very dependency-heavy package](https://github.com/xorbitsai/inference/blob/7c0108c4df67da92e11f88213432eb95b9cf94dc/setup.cfg#L27) and as a result you may wish to consider utilizing the [basic test code](https://github.com/xorbitsai/xllamacpp/blob/b0ae068415293625d97909a74efe661123b2afa7/tests/test_server.py) as a starter template. Note that as of 0.2.6 xllamacpp also [supports](https://github.com/xorbitsai/xllamacpp/pull/96) structured JSON outputs.

_Update 12/1:_ support for structured JSON outputs

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Feature Request] New prompt upscaler: xllamacpp #73

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Device Type	xllamacpp Build	Index
NVIDIA	CUDA	https://xorbitsai.github.io/xllamacpp/whl/cu128
AMD	Vulkan*	https://xorbitsai.github.io/xllamacpp/whl/vulkan
Intel (XPU)	Vulkan	https://xorbitsai.github.io/xllamacpp/whl/vulkan
Apple Silicon	Metal	Default (PyPI)

Uh oh!

[Feature Request] New prompt upscaler: xllamacpp #73

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions