Releases: brontoguana/krasis
v0.1.66-rc7
Pre-release candidate with native HQQ attention modes and release CI sidecar packaging fixes.
v0.1.66-rc6
Pre-release candidate with native HQQ attention modes, HQQ8 launcher default, HQQ4SC installer option, llama-witness sequence validation support, and release CI fixes for CUDA sidecars and Blackwell FLA sidecars.
v0.1.66-rc5
Pre-release candidate with native HQQ attention modes, HQQ8 launcher default, HQQ4SC installer option, llama-witness sequence validation support, and release CI fixes for CUDA sidecars and Blackwell FLA sidecars.
v0.1.66-rc4
Pre-release candidate with native HQQ attention modes, HQQ8 launcher default, HQQ4SC installer option, llama-witness sequence validation support, and release CI sidecar packaging fixes.
v0.1.66-rc3
Pre-release candidate with native HQQ attention modes, HQQ8 launcher default, HQQ4SC installer option, llama-witness sequence validation support, and release CI CUDA driver-stub linking fixes.
v0.1.66-rc2
Recreated rc2 on commit 418e3ce after switching the manylinux FLA link step to link directly against the resolved CUDA stub file path.
v0.1.66-rc1
Pre-release for multi-GPU testing.
Changes since v0.1.65-rc6:
- 122B FLA fix: multi-H cubins and scratch buffer sizing
- Cross-compiled FLA kernels for sm80/sm89/sm90/sm120
- FLA kernel arg signature and block size fix
- Arch-specific FLA .so files ship in wheel (no first-run JIT)
- GPU arch auto-detection with forward/backward compat fallback
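As a rough illustration of the arch auto-detection item above, the sketch below maps a device's compute capability to one of the shipped architectures (sm80/sm89/sm90/sm120). The function name `pick_fla_arch`, the constant `SHIPPED_ARCHS`, and the fallback order are assumptions made for illustration. These notes do not describe the actual selection logic in krasis.

```python
from typing import Optional, Sequence

# Architectures named in the release notes (sm80/sm89/sm90/sm120).
SHIPPED_ARCHS = (80, 89, 90, 120)


def pick_fla_arch(major: int, minor: int,
                  shipped: Sequence[int] = SHIPPED_ARCHS) -> Optional[int]:
    """Hypothetical policy for choosing which prebuilt kernel arch to load.

    Returns the sm number to use, or None if no shipped arch is a candidate.
    """
    device = major * 10 + minor

    # Exact match: the device arch was built and shipped.
    if device in shipped:
        return device

    # Fallback 1 (assumed): newest shipped arch from the same major family
    # that is older than the device, e.g. sm86 -> sm80.
    same_major = [a for a in shipped if a // 10 == major and a < device]
    if same_major:
        return max(same_major)

    # Fallback 2 (assumed): newest shipped arch older than the device.
    # Whether such a binary actually runs depends on how it was built,
    # which the release notes do not state.
    older = [a for a in shipped if a < device]
    if older:
        return max(older)

    # Device is older than everything shipped: no candidate.
    return None
```

With PyTorch installed, the inputs could come from `torch.cuda.get_device_capability()`, which returns a `(major, minor)` tuple. Under this assumed policy an sm86 device would select the sm80 build, and a device older than sm80 would get no match.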
v0.1.65-rc6
Prerelease for installed-package sidecar fixes and FP8-only KV cache on Ampere.
v0.1.65-rc5
Prerelease with vendored CUDA sidecars injected into release wheels, prerelease installer force-reinstall handling, and FP8-only KV cache on Ampere and in the interactive launcher.
v0.1.65-rc4
Prerelease with release-wheel sidecar injection, prerelease installer force-reinstall handling, and FP8-only KV cache on Ampere and in the interactive launcher.