As hoped, xllamacpp binaries now available for non-CUDA GPUs, providing a more modern alternative to GPT4all.
The relevant indexes are:
* ROCm builds are available for Linux but are not as performant (see xllamacpp#61 (comment))
XLllamaCPP inference is normally handled by Xorbits Inference. However, this is a very dependency-heavy package and as a result you may wish to consider utilizing the basic test code as a starter template. Note that as of 0.2.6 xllamacpp also supports structured JSON outputs.
Update 12/1: support for structured JSON outputs
As hoped, xllamacpp binaries now available for non-CUDA GPUs, providing a more modern alternative to GPT4all.
The relevant indexes are:
* ROCm builds are available for Linux but are not as performant (see xllamacpp#61 (comment))
XLllamaCPP inference is normally handled by Xorbits Inference. However, this is a very dependency-heavy package and as a result you may wish to consider utilizing the basic test code as a starter template. Note that as of 0.2.6 xllamacpp also supports structured JSON outputs.
Update 12/1: support for structured JSON outputs