Releases · AmSach/kvquant · GitHub

30 Apr 11:50

AmSach

v0.1.0 - Initial Release

KVQuant - Run 70B-parameter LLMs on 8 GB of RAM with real-time KV cache compression. Features: 4-10x compression, under 100 ms latency, and drop-in Hugging Face integration.
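
To put the 4-10x compression figure in context, here is a rough back-of-the-envelope sketch of KV cache size in Python. It does not use KVQuant's API. The function name `kv_cache_bytes` and the model shape (80 layers, 8 KV heads, head dimension 128, 4096-token context, fp16 baseline) are assumptions chosen to resemble a typical 70B-class model, not values taken from this release.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Uncompressed KV cache size in bytes for a single sequence.

    The leading factor of 2 counts one key tensor and one value tensor
    per layer. bytes_per_elem=2 corresponds to an fp16 baseline.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical 70B-class configuration (an assumption, not from the release).
baseline = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, seq_len=4096)

GIB = 1024 ** 3
for ratio in (1, 4, 10):
    # Dividing by the ratio models the advertised compression range.
    print(f"{ratio:>2}x compression: {baseline / ratio / GIB:.3f} GiB")
```

Under these assumptions the uncompressed cache is 1.25 GiB per 4096-token sequence, and the 4x and 10x lines show what the advertised range would leave. The estimate covers the KV cache only; model weights are a separate cost.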
