
Share sensitivity data artifacts across same model quantizations #972

@deepsweet


Let's say I want to quantize a model at every level from oQ2 to oQ8. As I currently understand the oQ process, at least some of the work is repeated on every run, because the same sensitivity model is used each time. For example, the sensitivity ratio of the same tensors at the same quantization bit widths would be computed again for each target.

Sharing this sensitivity data across runs would help me a lot with a large benchmarking effort.
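
To make the idea concrete, here is a minimal sketch of the kind of reuse I have in mind. It is not how oQ is implemented: `SensitivityCache`, `model_id`, and `fake_measure` are hypothetical names, and the sketch assumes that every target level probes the same (tensor, bits) pairs, which may not be true.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


class SensitivityCache:
    """Hypothetical disk-backed cache of per-tensor sensitivity scores."""

    def __init__(self, cache_dir: Path, model_id: str) -> None:
        # Assumption: model_id uniquely identifies the source weights and the
        # sensitivity model (for example, a checksum of both).
        self.path = Path(cache_dir) / f"{model_id}.sensitivity.json"
        self.scores: dict[str, float] = {}
        if self.path.exists():
            self.scores = json.loads(self.path.read_text())

    def get(self, tensor: str, bits: int, measure: Callable[[str, int], float]) -> float:
        # Only call the (expensive) measurement when the pair is not cached yet.
        key = f"{tensor}@{bits}"
        if key not in self.scores:
            self.scores[key] = measure(tensor, bits)
        return self.scores[key]

    def save(self) -> None:
        # Persist the scores so that a later run can pick them up.
        self.path.write_text(json.dumps(self.scores))


def demo() -> tuple[int, int]:
    """Simulate oQ2..oQ8 runs; return (measurements performed, lookups requested)."""
    calls = 0
    lookups = 0

    def fake_measure(tensor: str, bits: int) -> float:
        # Placeholder for the real sensitivity measurement.
        nonlocal calls
        calls += 1
        return 1.0 / bits

    tensors = ["layers.0.attn.q_proj", "layers.0.mlp.up_proj"]  # made-up names
    candidate_bits = [2, 4, 8]  # assumed to be the same for every target

    with tempfile.TemporaryDirectory() as tmp:
        for _target in range(2, 9):  # one run per target level, oQ2 through oQ8
            cache = SensitivityCache(Path(tmp), "model-abc")
            for tensor in tensors:
                for bits in candidate_bits:
                    cache.get(tensor, bits, fake_measure)
                    lookups += 1
            cache.save()
    return calls, lookups


if __name__ == "__main__":
    print(demo())
```

Under these assumptions, only the first run pays for the measurements and the remaining six runs read them from disk. Whether that matches the real pipeline depends on how much of the sensitivity pass is actually independent of the target level.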

Feel free to close this request if my assumption is incorrect.
