fix(retrieval): plumb documented [retrieval] tier weights into the additive path (#202) by mbachaud · Pull Request #210 · mbachaud/helix-context

mbachaud · 2026-06-10T22:25:12Z

Closes #202. All 9 tier weights (including the new sema_boost_weight, default 2.0; that tier previously had no knob) now bind in BOTH fusion modes. Zero config-default changes (defaults already equaled the literals); caps scale proportionally with their weight (2x fts5, 2x lex_anchor, 3x harmonic, 4x entity_graph; documented per site). Byte-identity proven: a golden test captured on the pre-fix tree asserts == on scores and per-tier contributions across a corpus that fires all 9 tiers; the existing 50-query additive snapshot passes unchanged. 23 new tests; ~2,255 passed in the full sandbox sweep (the 4 failures are pre-existing on master). Unblocks per-tier weight tuning ahead of the RRF gate (roadmap section 5).
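To make the cap-scaling claim concrete, here is a minimal sketch of how each cap follows from its weight. The multipliers, default weights, and legacy caps are the values stated in this PR; the dictionary and function names are hypothetical and are not taken from the helix-context code.

```python
from typing import Optional

# Values as stated in this PR; the names below are illustrative only.
CAP_MULTIPLIER = {"fts5": 2.0, "lex_anchor": 2.0, "harmonic": 3.0, "entity_graph": 4.0}
DEFAULT_WEIGHT = {"fts5": 3.0, "lex_anchor": 1.5, "harmonic": 1.0, "entity_graph": 0.5}
LEGACY_CAP = {"fts5": 6.0, "lex_anchor": 3.0, "harmonic": 3.0, "entity_graph": 2.0}


def tier_cap(tier: str, weight: Optional[float] = None) -> float:
    """Return the cap for a tier: a fixed multiple of that tier's weight."""
    w = DEFAULT_WEIGHT[tier] if weight is None else weight
    return CAP_MULTIPLIER[tier] * w


for tier, legacy in LEGACY_CAP.items():
    # At the default weight the derived cap equals the old hard-coded cap.
    assert tier_cap(tier) == legacy
    # A zero weight collapses the cap, so the tier contributes nothing.
    assert tier_cap(tier, 0.0) == 0.0
    # Doubling the weight doubles the cap, so the capped region scales too.
    assert tier_cap(tier, 2 * DEFAULT_WEIGHT[tier]) == 2 * legacy
```

The design consequence is that a weight is a true scale factor for its tier: had the caps stayed independent literals, raising a weight would have saturated at the old ceiling.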

…e additive path (#202)

The eight documented [retrieval] tier weights (fts5_weight, splade_weight, tag_exact_weight, tag_prefix_weight, sema_cold_weight, lex_anchor_weight, harmonic_weight, entity_graph_weight) were consumed only via fuser.add_tier(), which the default fusion_mode="additive" never consults -- operators tuning them per the docs saw zero effect. The additive accumulations used inline literals instead.

This binds the existing self._*_weight attrs into the additive tier formulas with defaults byte-identical to the old literals. Every config default already equals its tier's leading coefficient, so each substitution swaps in the SAME float value (no scale-factor multiplication, no round-off drift):

  tier          old literal              new formula
  tag_exact     match_count * 3.0        match_count * tag_exact_weight
  tag_prefix    match_count * 1.5        match_count * tag_prefix_weight
  fts5          min(-rank, 6.0)          min(-rank, 2.0 * fts5_weight)
                [no leading coeff in additive -- cap-only knob; cap = 2.0 x weight, default 2.0 x 3.0 == legacy 6.0]
  splade        min(s, 20) * 3.5 / 20    min(s, 20) * splade_weight / 20
  sema_boost    sim * 2.0 * scale        sim * sema_boost_weight * scale
                [NEW knob, default 2.0 -- the warm Tier-4A boost had no weight knob at all (post-fusion additive in RRF)]
  sema_cold     sim * 3.0                sim * sema_cold_weight
  lex_anchor    min(idf * 1.5, 3.0)      min(idf * w, 2.0 * w)
                [cap = 2.0 x weight, default == legacy 3.0]
  harmonic      +1.0/link, cap 3.0       +w per link, cap 3.0 * w
                [cap = 3.0 x weight, default == legacy 3.0]
  entity_graph  min(1.0 * 0.5, 2.0)      min(1.0 * w, 4.0 * w)
                [cap = 4.0 x weight, default == legacy 2.0]

No existing config default changed. The new sema_boost_weight (2.0) is plumbed through RetrievalConfig, the TOML loader, context_manager's open_read_source kwargs (fans to solo Genome and per-shard Genomes), and the KnowledgeStore ctor.

Caps that were independent literals now scale proportionally with their tier's weight (documented inline, in helix.toml and docs/config-reference.md): zeroing a weight kills its tier; scaling it scales the tier's contribution including the capped region.

Tests: tests/test_additive_weight_plumbing.py
- golden byte-identity: 10-doc corpus firing all 9 tiers; final scores AND per-tier contributions captured on the pre-fix tree (266e9aa) and asserted bit-identical (==) post-fix, plus an explicit-defaults == implicit-defaults run
- per-knob "knob moves exactly its tier" and "zero weight kills the tier" coverage for all 9 knobs
- RetrievalConfig defaults == legacy additive literals; TOML loader plumbs sema_boost_weight
- the existing 50-query additive back-compat snapshot (test_fusion_rrf.py::test_fusion_mode_additive_unchanged) passes unchanged
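The old-literal and new-formula pairs listed in the commit message can be sketched side by side to show why the defaults are byte-identical. This is an illustration under stated assumptions, not the repository's code: TierWeights, legacy_tiers, weighted_tiers, and the signal dictionary keys are hypothetical names, harmonic is modelled as links * w rather than a running accumulation, and the 1.0 in the entity_graph formula is treated as an input signal.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TierWeights:
    # Defaults are the legacy leading coefficients quoted in the commit message.
    tag_exact: float = 3.0
    tag_prefix: float = 1.5
    fts5: float = 3.0
    splade: float = 3.5
    sema_boost: float = 2.0
    sema_cold: float = 3.0
    lex_anchor: float = 1.5
    harmonic: float = 1.0
    entity_graph: float = 0.5


def legacy_tiers(sig: dict) -> dict:
    """Per-tier contributions using the old inline literals."""
    return {
        "tag_exact": sig["exact"] * 3.0,
        "tag_prefix": sig["prefix"] * 1.5,
        "fts5": min(-sig["rank"], 6.0),
        "splade": min(sig["splade"], 20) * 3.5 / 20,
        "sema_boost": sig["warm_sim"] * 2.0 * sig["scale"],
        "sema_cold": sig["cold_sim"] * 3.0,
        "lex_anchor": min(sig["idf"] * 1.5, 3.0),
        "harmonic": min(sig["links"] * 1.0, 3.0),
        "entity_graph": min(sig["entity"] * 0.5, 2.0),
    }


def weighted_tiers(sig: dict, w: TierWeights = TierWeights()) -> dict:
    """Same formulas with each literal replaced by its configurable weight."""
    return {
        "tag_exact": sig["exact"] * w.tag_exact,
        "tag_prefix": sig["prefix"] * w.tag_prefix,
        "fts5": min(-sig["rank"], 2.0 * w.fts5),  # cap-only knob
        "splade": min(sig["splade"], 20) * w.splade / 20,
        "sema_boost": sig["warm_sim"] * w.sema_boost * sig["scale"],
        "sema_cold": sig["cold_sim"] * w.sema_cold,
        "lex_anchor": min(sig["idf"] * w.lex_anchor, 2.0 * w.lex_anchor),
        "harmonic": min(sig["links"] * w.harmonic, 3.0 * w.harmonic),
        "entity_graph": min(sig["entity"] * w.entity_graph, 4.0 * w.entity_graph),
    }


# Made-up signals for one document; every tier fires.
signals = {"exact": 2, "prefix": 1, "rank": -4.25, "splade": 12.0, "warm_sim": 0.8,
           "scale": 0.5, "cold_sim": 0.6, "idf": 1.7, "links": 2, "entity": 1.0}

baseline = weighted_tiers(signals)
# Default weights reproduce the legacy contributions exactly (==, not approx).
assert baseline == legacy_tiers(signals)

# "Knob moves exactly its tier": changing one weight changes one contribution.
tuned = weighted_tiers(signals, replace(TierWeights(), splade=7.0))
assert [k for k in baseline if tuned[k] != baseline[k]] == ["splade"]

# "Zero weight kills the tier".
assert weighted_tiers(signals, replace(TierWeights(), lex_anchor=0.0))["lex_anchor"] == 0.0
```

Exact equality holds here because each default is the same float as the literal it replaces and the operation order is unchanged; the cap products (2.0 x 3.0, 2.0 x 1.5, 3.0 x 1.0, 4.0 x 0.5) are all exactly representable.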

mbachaud merged commit 673cdfd into master Jun 10, 2026
3 checks passed

mbachaud deleted the fix/202-additive-weights branch June 10, 2026 23:35

mbachaud merged 1 commit into master from fix/202-additive-weights
