Releases: AmSach/kvquant

v0.1.0 - Initial Release

30 Apr 11:50

KVQuant: run 70B LLMs on 8 GB of RAM with real-time KV cache compression. Features: 4-10x KV cache compression, <100 ms latency, drop-in Hugging Face integration.
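
To put the 4-10x figure in context, here is a rough back-of-the-envelope sketch of how large an uncompressed KV cache is and what those ratios would leave. It does not use the KVQuant API. The helper names and the model shape (80 layers, 8 KV heads, head dimension 128, fp16 values, 4096-token context, as in Llama-2-70B-style models) are assumptions chosen for illustration, not values taken from this release.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Size in bytes of an uncompressed KV cache for a single sequence."""
    # Factor 2: each layer stores one key tensor and one value tensor.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


def compressed_mib(baseline_bytes, ratio):
    """Cache size in MiB after applying a given compression ratio."""
    return baseline_bytes / ratio / 2**20


# Assumed model shape (hypothetical for this example, not from the release notes).
baseline = kv_cache_bytes(num_layers=80, num_kv_heads=8, head_dim=128, seq_len=4096)

print(f"uncompressed: {baseline / 2**20:.0f} MiB")
for ratio in (4, 10):
    print(f"{ratio}x compression: {compressed_mib(baseline, ratio):.0f} MiB")
```

The estimate covers only the KV cache for one sequence. Model weights and activations are separate and are not included.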