Request: publish TriAttention calibration script (TRIA v2/v3 stats.bin producer) #10

@minhtcai

Summary

The feature/triattention-scoring branch wires up the --triattention <path> flag and ships a complete loader (src/triattention.c::tria_load, magic 0x54524941, versions 1/2/3 supported), but the repository does not include a calibration script that produces the stats.bin file the loader expects.

The README/TURBOQUANT.md reports results like "Qwen3.5-27B turbo3 + TriAtt 75%" — so a calibration pipeline clearly exists internally. Could it be published?

What's documented

src/triattention-integration.md describes the algorithm (CPU-side scoring every 128 tokens, GQA aggregation, z-norm, top-K). The header documents the on-disk schema (layer count, kv-head count, fc, head_dim, optional layer_budget_scales, per-head MRL and non-rot data). That is sufficient to reverse-engineer the format, but not the calibration math.
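For reference, here is how I currently read the documented scoring steps (GQA aggregation, z-norm, top-K), as a rough sketch only. Everything beyond those three step names is my assumption: averaging query heads within a KV group, z-normalizing per KV head and then averaging across heads, and all function and parameter names (`gqa_aggregate`, `select_tokens`, `keep_ratio`) are hypothetical and not taken from the repository. It does not use the calibration stats at all, which is the part that cannot be reconstructed.

```python
from statistics import fmean, pstdev


def gqa_aggregate(q_head_scores, n_kv_heads):
    """Collapse per-query-head token scores into per-KV-head scores.

    Assumption: query heads sharing a KV head are averaged; the real
    implementation may use max or a weighted sum instead.
    """
    n_q_heads = len(q_head_scores)
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("query heads must divide evenly into KV groups")
    group = n_q_heads // n_kv_heads
    out = []
    for kv in range(n_kv_heads):
        heads = q_head_scores[kv * group:(kv + 1) * group]
        # zip(*heads) walks the token axis across the heads of one group
        out.append([fmean(col) for col in zip(*heads)])
    return out


def z_norm(scores, eps=1e-6):
    """Standardize one head's scores to zero mean and (roughly) unit variance."""
    mu = fmean(scores)
    sigma = pstdev(scores)
    return [(s - mu) / (sigma + eps) for s in scores]


def top_k(scores, k):
    """Return the indices of the k largest scores, in ascending index order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])


def select_tokens(q_head_scores, n_kv_heads, keep_ratio):
    """Pick token indices to keep (hypothetical end-to-end composition)."""
    per_kv = [z_norm(s) for s in gqa_aggregate(q_head_scores, n_kv_heads)]
    # Assumption: z-normalized heads are averaged so no single head dominates.
    combined = [fmean(col) for col in zip(*per_kv)]
    k = max(1, int(len(combined) * keep_ratio))
    return top_k(combined, k)
```

If the real pipeline aggregates or normalizes differently, a pointer to the relevant function would already help a lot.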

Why it matters

Without the producer, third parties (including users validating the approach on non-AMD hardware, sm_75 / sm_72 in our case) can build the binary but cannot exercise the TriAttention path end-to-end. We can only compare TurboQuant variants against each other.

Asks (in order of preference)

  1. Publish the script as-is (even if it's research-grade / undocumented).
  2. Document the math in enough detail for a third-party reimplementation: how Q/K concentration centers are computed, which calibration corpus you used, how the MRL bins are defined, and what nonrot_alpha modulates.
  3. Confirm format details for a community reimplementation: byte order, per-head record layout, how layer_budget_scales is computed, version differences (v1 vs v2 vs v3).
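To make the byte-order question concrete, this is the kind of probe a community reimplementation would start from. It is a sketch under stated assumptions: only the magic value 0x54524941 and the supported versions 1/2/3 come from the loader; that the magic is a uint32 at offset 0 immediately followed by a uint32 version is my guess, and `probe_header` is a hypothetical helper name.

```python
import struct

# Magic from tria_load; the bytes spell "TRIA" when written big-endian.
TRIA_MAGIC = 0x54524941
SUPPORTED_VERSIONS = (1, 2, 3)


def probe_header(buf):
    """Guess (byte_order, version) from the first 8 bytes of a stats.bin.

    Assumed layout (not confirmed): uint32 magic, then uint32 version.
    Raises ValueError if the magic matches in neither byte order.
    """
    if len(buf) < 8:
        raise ValueError("need at least 8 bytes")
    for name, fmt in (("little", "<II"), ("big", ">II")):
        magic, version = struct.unpack_from(fmt, buf, 0)
        if magic == TRIA_MAGIC:
            if version not in SUPPORTED_VERSIONS:
                raise ValueError(f"unsupported version {version}")
            return name, version
    raise ValueError("magic mismatch in both byte orders")
```

Confirming the actual header layout and byte order would let a probe like this be replaced with a proper parser.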

Context

We are trying to evaluate the feasibility of the TriAttention + TurboQuant path on NVIDIA Volta-class hardware (Jetson AGX Xavier, sm_72) for an LLM cluster. Happy to share results back if we get it working on Turing/Volta.

Thanks for the work — the turbo3 + TriAtt numbers are striking.
