Share sensitivity data artifacts across same model quantizations

Let's say I want to quantize a model at every level from oQ2 to oQ8. As far as I understand the oQ process, at least _some parts_ of the work are repeated on every run, since the same sensitivity model is used each time. One example is the sensitivity ratio of the same tensors against the same quantization bit widths.

Being able to reuse those sensitivity artifacts across runs would help me a lot with a large benchmarking effort.
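To make the idea concrete, here is a minimal sketch of the kind of reuse I have in mind. It is not based on the actual oQ code: `SensitivityCache`, `sweep`, the JSON file layout, and the `measure` callback are all hypothetical names, and it assumes a sensitivity value depends only on the model, the tensor, and the bit width.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


class SensitivityCache:
    """Hypothetical on-disk cache of per-tensor sensitivity values."""

    def __init__(self, path: Path):
        self.path = Path(path)
        # Load earlier results if a previous run already saved them.
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {}

    def get_or_compute(self, model_id: str, tensor: str, bits: int,
                       measure: Callable[[str, int], float]) -> float:
        # Assumption: the result depends only on (model, tensor, bit width).
        key = f"{model_id}|{tensor}|{bits}"
        if key not in self.data:
            self.data[key] = measure(tensor, bits)
        return self.data[key]

    def save(self) -> None:
        self.path.write_text(json.dumps(self.data))


def sweep(cache: SensitivityCache, model_id: str, tensors: Iterable[str],
          candidate_bits: Iterable[int], targets: Iterable[int],
          measure: Callable[[str, int], float]) -> dict:
    """Collect sensitivities for every target level, reusing cached values."""
    tensors, candidate_bits = list(tensors), list(candidate_bits)
    results = {}
    for target in targets:  # e.g. range(2, 9) for oQ2 through oQ8
        results[target] = {
            (t, b): cache.get_or_compute(model_id, t, b, measure)
            for t in tensors
            for b in candidate_bits
        }
    return results
```

With something like this, a sweep over seven target levels would call `measure` once per (tensor, bit width) pair instead of seven times, and a later run could load the saved file and skip the measurement entirely.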

Feel free to close this request if my assumption is incorrect.

Share sensitivity data artifacts across same model quantizations #972
