Add support for int8 quantization backend by silveroxides · Pull Request #37 · Comfy-Org/comfy-kitchen

silveroxides · 2026-05-07T22:30:53Z

This draft adds support for int8-quantized models, including full backend support and optimized matmul kernels written in Triton.
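
As a rough illustration of what tensor-wise int8 quantization and an int8 matmul involve, here is a pure-Python sketch. It assumes symmetric quantization with a single scale per tensor, integer accumulation, and a final rescale by the product of the two scales. The names `quantize_tensorwise` and `int8_matmul` are hypothetical and are not taken from this PR; the actual backends are Triton and cuBLASLt kernels, not Python loops.

```python
def quantize_tensorwise(matrix):
    """Symmetric int8 quantization with one scale for the whole tensor."""
    max_abs = max(abs(v) for row in matrix for v in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    q = [[max(-127, min(127, round(v / scale))) for v in row] for row in matrix]
    return q, scale


def int8_matmul(a_q, a_scale, b_q, b_scale):
    """Multiply quantized matrices with integer accumulation, then rescale."""
    cols = list(zip(*b_q))  # columns of b_q
    out = []
    for row in a_q:
        # The dot product is accumulated in integers; scales are applied once.
        out.append([sum(x * y for x, y in zip(row, col)) * a_scale * b_scale
                    for col in cols])
    return out


a = [[1.0, -2.0], [0.5, 4.0]]
b = [[2.0, 0.0], [-1.0, 3.0]]
a_q, a_scale = quantize_tensorwise(a)
b_q, b_scale = quantize_tensorwise(b)
approx = int8_matmul(a_q, a_scale, b_q, b_scale)  # close to the float product a @ b
```

The result only approximates the floating-point product, because each input value has been rounded to one of 255 levels.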

silveroxides added 3 commits May 7, 2026 19:03

Add support for int8 tensorwise quantization with backend and layouts

954a4aa

Add functioning cuBLASLt backend for int8 that passes tests.

60a3b2f

Add per-channel weight scaling support to TensorWiseINT8Layout via new per_channel flag. Adds dedicated triton per-row dequant kernel and fixes scale broadcasting in eager and CUDA backends.

750cbd6
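
The difference between a single tensor-wide scale and per-channel scaling can be sketched as below. This assumes "per-channel" means one scale per weight row (output channel), which fits the "per-row dequant kernel" wording but is not confirmed by the text here. The function names are hypothetical and are not the PR's API.

```python
def quantize_per_row(weight):
    """Symmetric int8 quantization with one scale per row (output channel)."""
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q_rows.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def dequantize_per_row(q_rows, scales):
    """Multiply each row by its own scale, i.e. a [rows, 1] broadcast."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


# Rows with very different magnitudes: a single tensor-wide scale would be
# set by the second row, and every value in the first row would round to 0.
weight = [[0.01, -0.02, 0.03], [10.0, -20.0, 30.0]]
q_rows, scales = quantize_per_row(weight)
restored = dequantize_per_row(q_rows, scales)
```

Applying the scales along the wrong axis is the kind of broadcasting mistake this layout invites: the scale vector has one entry per row, so it has to be shaped as a column before it is multiplied into the weight matrix.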

silveroxides wants to merge 3 commits into Comfy-Org:main from silveroxides:feature/int8-tensorwise
