easy-llama wishlist

I'm interested in potentially replacing llama-cpp-python with easy-llama in my project, and have some questions about feature parity:

1. Are all parameters in the `params` dict below available?

https://github.com/oobabooga/text-generation-webui/blob/096272f49e55357a364ed9016357b97829dae0fd/modules/llamacpp_model.py#L88

2. Is it possible to get the logits after a certain input? As done here:

https://github.com/oobabooga/text-generation-webui/blob/096272f49e55357a364ed9016357b97829dae0fd/modules/llamacpp_model.py#L134

3. Similar to 2, but more nuanced: is there a way to get the logits for *every* token position in an input at once? In llama-cpp-python, this is done by passing `logits_all=True` when loading the model, which reduces performance but makes all logits available as a matrix through `model.eval_logits`. I used this feature a while ago to measure the perplexity of llama.cpp quants, using the code here:

https://github.com/oobabooga/text-generation-webui/blob/096272f49e55357a364ed9016357b97829dae0fd/modules/llamacpp_hf.py#L133
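
To make the use case in item 3 concrete, here is a minimal sketch of the perplexity computation that needs logits for every position. It is plain Python and does not call either library. It assumes `logits[i]` is the row of scores produced after the model has seen tokens `0..i`, so that row scores token `i + 1`. The names `log_softmax` and `perplexity` are illustrative only.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def perplexity(logits, token_ids):
    """Perplexity of token_ids given one row of logits per position.

    Assumption: logits[i] was produced after tokens 0..i were evaluated,
    so it scores token_ids[i + 1]. The last row is therefore unused.
    """
    if len(logits) != len(token_ids) or len(token_ids) < 2:
        raise ValueError("need one logits row per token and at least 2 tokens")
    nll = 0.0
    for i in range(len(token_ids) - 1):
        # Negative log-likelihood of the token that actually came next.
        nll -= log_softmax(logits[i])[token_ids[i + 1]]
    return math.exp(nll / (len(token_ids) - 1))
```

With uniform logits over a vocabulary of size V the result is V, which is a quick sanity check for any backend that returns the full matrix.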

4. I have a llamacpp_HF wrapper that connects llama.cpp to HF text generation functions. At its core, all it does is update `model.n_tokens` to do prefix matching, then evaluate the new tokens by calling `model.eval` with a list containing only those tokens. Can that be done with easy-llama? See:

https://github.com/oobabooga/text-generation-webui/blob/096272f49e55357a364ed9016357b97829dae0fd/modules/llamacpp_hf.py#L118
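
The prefix-matching idea in item 4 can be sketched as follows. `FakeModel` is a hypothetical stand-in that only mimics the two things the wrapper relies on (a writable `n_tokens` and an `eval` that appends after that position); it is not the API of llama-cpp-python or easy-llama. Re-evaluating at least the last token, so that fresh logits always exist, is a choice made for this sketch.

```python
def common_prefix_len(a, b):
    """Number of leading tokens that two sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class FakeModel:
    """Hypothetical stand-in for a llama.cpp binding (not a real API)."""

    def __init__(self):
        self.input_ids = []   # tokens currently held in the context
        self.n_tokens = 0     # how many of them count as valid
        self.eval_calls = []  # record of what was actually evaluated

    def eval(self, tokens):
        # Drop everything past n_tokens, then append the new tokens.
        self.input_ids = self.input_ids[: self.n_tokens] + list(tokens)
        self.n_tokens = len(self.input_ids)
        self.eval_calls.append(list(tokens))


def eval_with_prefix_reuse(model, seq):
    """Evaluate seq, reusing whatever prefix is already in the context."""
    if not seq:
        raise ValueError("seq must contain at least one token")
    cached = model.input_ids[: model.n_tokens]
    keep = common_prefix_len(cached, seq)
    # Sketch-level choice: always evaluate at least the final token.
    keep = min(keep, len(seq) - 1)
    model.n_tokens = keep   # rewind the context to the shared prefix
    model.eval(seq[keep:])  # evaluate only the tokens that are new
```

After evaluating `[1, 2, 3, 4]` and then `[1, 2, 3, 9, 10]`, the second call only evaluates `[9, 10]`.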

5. Is speculative decoding implemented? There is a PR here https://github.com/oobabooga/text-generation-webui/pull/6669/files to add it, and having it in easy-llama would be great, especially if it could be done in a simple way by just passing new kwargs to its model loading and/or generation functions. I believe doing that for my llamacpp_HF wrapper would be very hard, so that's not something I have hopes for.
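
For context on item 5, here is a minimal sketch of the greedy variant of speculative decoding, written against two plain callables rather than any real model. `target_next` and `draft_next` are hypothetical functions that map a token list to the next greedy token. The sketch makes no assumption about how easy-llama or llama.cpp would expose this, and it omits the extra "bonus" token that implementations usually take from the target when every drafted token is accepted.

```python
def speculative_generate(target_next, draft_next, prompt, max_new, k=4):
    """Greedy speculative decoding with a draft and a target model.

    The result is identical to plain greedy decoding with target_next;
    the draft only decides how many tokens get accepted per round.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        n = min(k, goal - len(out))
        # 1. The draft model proposes n tokens autoregressively.
        draft = []
        for _ in range(n):
            draft.append(draft_next(out + draft))
        # 2. The target checks each proposed position. A real backend
        #    would score all n positions in one batched forward pass.
        for i in range(n):
            expected = target_next(out + draft[:i])
            if draft[i] != expected:
                # First mismatch: keep the agreed prefix plus the
                # target's own token, and discard the rest of the draft.
                out.extend(draft[:i])
                out.append(expected)
                break
        else:
            out.extend(draft)  # every drafted token was accepted
    return out
```

Each round adds at least one token chosen by the target, so a poor draft model costs speed but never changes the output.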

If you are interested, a PR changing llama-cpp-python to easy-llama in my repository would be highly welcome once wheels are available. It would be a way to test the library as well. But I can also try to make the change myself.

easy-llama wishlist #10
