Let's say I want to quantize a model at every level from oQ2 to oQ8. As I understand the oQ process, at least some of the work is repeated on every run, since the sensitivity model is the same each time: for example, the sensitivity ratio of the same tensors against the same quantization bit widths.
Reusing that repeated work across runs would help me a lot with a large benchmarking effort.
Feel free to close this request if my assumption is incorrect.
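
To illustrate what I mean, here is a rough sketch of the kind of reuse I have in mind. Everything in it is hypothetical: `SensitivityCache`, `fake_measure`, the tensor names, and the list of candidate bit widths are placeholders, not the actual oQ API, and it assumes every oQ level probes the same (tensor, bits) pairs.

```python
from typing import Callable, Dict, Tuple


class SensitivityCache:
    """Memoize sensitivity measurements keyed by (tensor name, bit width)."""

    def __init__(self, measure: Callable[[str, int], float]) -> None:
        self._measure = measure
        self._cache: Dict[Tuple[str, int], float] = {}
        self.misses = 0  # number of times the expensive measurement actually ran

    def get(self, tensor_name: str, bits: int) -> float:
        key = (tensor_name, bits)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._measure(tensor_name, bits)
        return self._cache[key]


def fake_measure(tensor_name: str, bits: int) -> float:
    # Placeholder for the real (expensive) sensitivity measurement.
    return len(tensor_name) / bits


cache = SensitivityCache(fake_measure)
tensors = ["layers.0.attn.q_proj", "layers.0.mlp.up_proj"]  # hypothetical names
candidate_bits = [2, 3, 4, 5, 6, 8]  # assumed bit widths probed per tensor

# One pass per target level, oQ2 through oQ8.
for level in range(2, 9):
    for name in tensors:
        for bits in candidate_bits:
            cache.get(name, bits)

print(cache.misses)  # 12 measurements instead of 84
```

If the sensitivity results really are independent of the target level, the cache could also be written to disk so that separate runs share it.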