Request: publish TriAttention calibration script (TRIA v2/v3 stats.bin producer)

## Summary

The `feature/triattention-scoring` branch wires up the `--triattention <path>` flag and ships a complete loader (`src/triattention.c::tria_load`, magic `0x54524941`, versions 1/2/3 supported), but the repository does not include a calibration script that produces the `stats.bin` file the loader expects.

The README/TURBOQUANT.md reports results like "Qwen3.5-27B turbo3 + TriAtt 75%" — so a calibration pipeline clearly exists internally. Could it be published?

## What's documented

`src/triattention-integration.md` describes the algorithm (CPU-side scoring every 128 tokens, GQA aggregation, z-norm, top-K). The header documents the on-disk schema (layer count, kv-head count, fc, head_dim, optional layer_budget_scales, per-head MRL and non-rot data). Sufficient to reverse-engineer the format, but not the calibration math.

## Why it matters

Without the producer, third parties (including users on non-AMD hardware trying to validate the approach — sm_75 / sm_72 in our case) can build the binary but not exercise the TriAttention path end-to-end. We end up only able to compare TurboQuant variants.

## Asks (in order of preference)

1. **Publish the script as-is** (even if it's research-grade / undocumented).
2. **Document the math** sufficient for a third-party reimplementation: how Q/K concentration centers are computed, what calibration corpus you used, MRL bin definitions, what `nonrot_alpha` modulates.
3. **Confirm format details** for a community reimplementation: byte order, per-head record layout, how `layer_budget_scales` is computed, version differences (v1 vs v2 vs v3).

## Context

Trying to evaluate feasibility of the TriAttention + TurboQuant path on NVIDIA Volta-class hardware (Jetson Xavier AGX, sm_72) for an LLM cluster. Happy to share results back if we get it working on Turing/Volta.

Thanks for the work — the turbo3 + TriAtt numbers are striking.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Request: publish TriAttention calibration script (TRIA v2/v3 stats.bin producer) #10

Summary

What's documented

Why it matters

Asks (in order of preference)

Context

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Request: publish TriAttention calibration script (TRIA v2/v3 stats.bin producer) #10

Description

Summary

What's documented

Why it matters

Asks (in order of preference)

Context

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions