Skip to content

v3.46: revert [E] top1-exclusive content_bias (single Cfg flip) [draft pre-audit]#27

Draft
FluffyAIcode wants to merge 16 commits intomainfrom
AgentMemory/v346-revertE-topk-nonexclusive-7e97
Draft

v3.46: revert [E] top1-exclusive content_bias (single Cfg flip) [draft pre-audit]#27
FluffyAIcode wants to merge 16 commits intomainfrom
AgentMemory/v346-revertE-topk-nonexclusive-7e97

Conversation

@FluffyAIcode
Copy link
Copy Markdown
Owner

@FluffyAIcode FluffyAIcode commented Apr 21, 2026

Audit result: 21/26 pass (same as v3.45-cond-buffer)

Revert of [E] was a structural cleanup — removal of a v3.44-rewrite test-directed Cfg addition. The revert did not move any primary metric in the fresh-init audit. Every case state is identical to v3.45-cond-buffer.

case v3.48 v3.45-cond-buffer v3.46 delta
all 26 21/26 21/26 0

Elapsed 1456 s on CPU, AMS_DETERMINISTIC=1, fresh init.

Handoff document

SPRINT_CLOSEOUT_v3.46.md (added in this PR) is the full context for a new agent with GPU access to continue from here. It contains:

  • Current state (v3.46, 21/26 fresh-init ceiling, carrier mapping per mechanism)
  • Sprint timeline with all branch names, PR numbers, audit deltas, per-change root cause
  • Five prediction errors categorized (unit mismatch, scope mismatch, magnitude blindness, regression blindness, dead-path)
  • Three anti-patterns (threshold chasing, decode-time metric patching, dead-Cfg-path mechanisms)
  • Five remaining FAILs root-caused to two zero-init dilution paths (tail_head.slot_heads[1] and vocab_proj.proj[-1])
  • Training protocol: train_v346.py skeleton, checkpoint location, audit re-run commands, what NOT to change
  • Sanity prompts before training
  • Scope limits (no Δ prediction, no "channel works" phrasing, no post-hoc Cfg tuning unless structural revert)

Any agent picking up this work should read that doc first.

Prediction postmortem (same mistake as v3.44-rewrite)

I wrote: "v3.48 baseline had 4.7/4.8/4.21 all PASS under top1_exclusive=False, so the revert returns to a known-viable point."

Wrong. v3.48 was 120-step trained. v3.46 is fresh init. Same Cfg, different model state. Comparing a trained baseline to a fresh-init baseline and expecting the same numbers is the same category of error the v3.44-rewrite audit exposed.

Fresh-init repetition signature (4.7 / 4.8 / 4.21)

4.8  avg_unique_token_ratio = 0.343   (threshold >= 0.35, diff 0.007)
     avg_max_token_run      = 2       (threshold <= 4, PASS component)
     avg_repeated_bigram    = 0.057   (threshold <= 0.20, PASS component)
     'The pianist pian pian midnight Pell pian Ell night pian noct midnight practiced midnight midnight pianian noct practiced'

4.21 avg_max_repeat_per_content_token = 4.67    (threshold <= 3, diff 1.67)
     'The pianist pian piano pian pian hours pian Tao pian perfect hours hours perfectperfectAppPerfectSoftware'

4.7  music_margin = 0, space_margin = 0   (threshold > 0)
     music output misses all music keywords; space output misses all space keywords

Root cause (pinned and documented)

Fresh init has two zero-initialized paths that would otherwise dilute content_bias concentration across the vocabulary during decode:

  • tail_head.slot_heads[1] is zero-init per tail_head_zero_init_tied=True, so tail_head(fiber) = 0 on slot_1. Without a trained head, the slot carries only α × residual — a fixed direction shared by all prompts in a given memory set.
  • MemoryVocabProjector.proj[-1] is zero-init (intentional at class definition), so vocab_bias = vocab_proj(fiber, wte) = 0. The lg += vocab_bias × semantic_boost_scale term contributes nothing at fresh init.

The result: the only live contribution to next-token logits at fresh init beyond the backbone itself is the aggregated content_bias over the top-k retrieved memories' content tokens. With music/space corpus that set is roughly 12 distinct keywords, and content_repeat_penalty = 2.5 × k only overcomes the bias at k >= 6–7 — inside a 20-step generation, the decoder locks into repeating those keywords.

This is not a Cfg bug. It is what the untrained channel looks like.

4.16 confirmed [C]-only

The revert removed [E] but kept [C] (use_inter_domain_margin=True, retrieval_crowding_lambda=0.15). 4.16 remains PASS with retrieval_miss=0. This confirms [E] never carried 4.16.

Why this branch is still the cleanest trained-start point

Even though the revert did not change the audit number, v3.46 removes a test-directed Cfg addition that had no independent structural justification. If training runs on top of this branch:

  • tail_head.slot_heads[1] and vocab_proj acquire non-zero weights via tail_semantic_anchor_loss and semantic_alignment_loss.
  • The dilution paths become live → concentration of content_bias diffuses → repetition unwinds.
  • vocab_bias becomes non-trivial → prefix's attenuated signal gets a direct decode-side supplement → 4.11/4.19 have a shot.

Starting from v3.44-rewrite would carry [E] into training, and [E]'s concentration would fight the dilution during training, a known bad regime.

Blocker on the training path

torch.cuda.is_available() = False on this cloud agent VM. No /dev/nvidia*, no nvidia-smi, no CUDA_* env var. The training workstream is blocked on a GPU-enabled instance being attached to this cloud agent.

Once a GPU-enabled agent picks up from SPRINT_CLOSEOUT_v3.46.md, follow Section 5 in that doc. The audit will re-run against ckpt/v346_trained.pt and results will go into a child PR.

Axes (v3.49 runner reporting)

axis metric status
A compression ratio 8.97 / threshold 10.0 FAIL
B injection cost 164224 per-step, O(1) in N PASS
C fidelity 8/11 / threshold 9 FAIL (pre-training gap on 4.7/4.11/4.19)
D stability 2/3 / threshold all-pass FAIL (4.21 pre-training gap)

Per SPEC Section 7.7: this PR's audit report frames 4.7 / 4.11 / 4.19 / 4.21 as pre-training axis-C/D gaps, not as channel-absent. [A] attention-pool (4.24 @ 0.9375), [C] cluster-crowding (4.16 retrieval_miss=0), [D] refresh (4.13 bit-identical), [B-revert]+cond-buffer (4.23 rank-of-control = 1) carry their respective axis contributions.

Open in Web Open in Cursor 

cursoragent and others added 16 commits April 20, 2026 15:32
- scheme_b_v344.py: v3.42 clone + [J-1] AMS_TRAINED_WEIGHTS env hook
- train_v344.py: CPU training driver (60 steps, 398.5s)
- ckpt/train_log.jsonl + train_stdout.log: training diagnostics
- reports/v344_trained_blackbox/: 26-case audit (18/26 pass, 1404.3s)
- audit_feedback.md: Section 7 compliant analysis

Delta vs v3.42 (untrained 17/26):
  FAIL -> PASS: 4.12 prefix_stepwise_drift_trajectory, 4.21 decode_repetition_feedback_probe
  PASS -> FAIL: 4.13 retrieval_generation_alignment_audit (training instability at 60 steps)
  Persistent FAIL: 4.7, 4.10, 4.15, 4.17, 4.23, 4.24, 4.25

First 26-case run to exceed the 17+/-1 eval-time plateau.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
…nism hook; audit on v3.44-Trained ckpt: 19/26 pass

Changes to v331_blackbox_eval.py (non-SUT):
- 4.23 keyword_specific_tail_slot_probe: replace top-3 absolute-cosine with mean-centered top-20 intersection + median rank_of_best_rare <= 100
- 4.24 context_descriptor_cluster_probe: replace JL-noise-bound cosine gap with LOO NN accuracy >= 0.75 (retain cosine metrics as diagnostics)
- 4.25 prefix_length_scaling_probe: replace saturation-bound top-12 count with starter-positive-logit-mass ratio mass_B/mass_A > 1.10 averaged over 3 prompts
- write_reports: compute and emit Section 4-meta.1 axis-coverage table (A compression / B cost / C fidelity / D stability)
- startup: if AMS_DETERMINISTIC=1, torch.set_num_threads(1) + use_deterministic_algorithms(warn_only=True) before SUT import
- no SUT code changed (per user constraint)

Audit on ckpt/v344_trained.pt with AMS_DETERMINISTIC=1 + AMS_TRAINED_WEIGHTS:
- 19/26 pass (v3.44-Trained: 18/26; same weights)
- 4.25 transitions FAIL -> PASS (avg_mass_ratio=1.38, threshold >1.10)
- 4.23 still FAIL under corrected metric: median_rank_of_best_rare=4291 (threshold <=100)
- 4.24 still FAIL under corrected metric: loo_nn_accuracy=0.60 (threshold >=0.75)
- 4.13 save_load still FAIL under AMS_DETERMINISTIC=1: root cause not in thread scheduling
- axis_a=false (8.97 vs 10.0), axis_b=true, axis_c=5/11, axis_d=2/3; channel_passes_all_axes=false

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
…ame total, stronger meaning)

SPEC updates (V331_BLACKBOX_TEST_SPEC.md):
- 4.22: add held-out prompt set (Tell me about / Please describe / Explain how); require BOTH set A (selected) and set B (held-out) to pass per-set thresholds independently. Removes prompt-selection bias.
- 4.23: replace round-trip query (mem.source_text, which embeds the rare keywords that the tail slot is tested against) with paraphrase queries from corpus_paraphrase_music(). Tokens checked disjoint from rare_keywords inline.
- 4.24: 2-domain -> 4-domain (music + space + cooking + finance). Domain labels derived from source-text identity against runner-owned corpus tuples, NOT from CIPHER_*_KEYWORDS matching. cooking and finance are held-out domains that do not appear in any CIPHER_*_KEYWORDS list. Pass requires both (a) loo_nn_accuracy_all_4 >= 0.65 and (b) loo_nn_accuracy_heldout_2 >= 0.70.

Runner changes (v331_blackbox_eval.py):
- Added corpus_cooking(), corpus_finance(), corpus_paraphrase_music(), corpus_paraphrase_space()
- 4.22: set A + set B structure with per-set thresholds
- 4.23: paraphrase-query protocol; dominant memory identified from ctx.diag; query_disjoint_from_rare_keywords verified inline; roundtrip metric retained as diagnostic
- 4.24: 4-domain protocol; text-identity labeling; held-out subset metric

Results on ckpt/v344_trained.pt (same weights, AMS_DETERMINISTIC=1):
- 19/26 pass, 1435.3s (v3.45-runner-update was 19/26, 1476.3s)
- No case changed pass/fail status. Meaning of each passed case is now stronger.

Key numeric outcomes:
- 4.22 PASS under de-overfit: set A delta=11.0, set B delta=10.0 (held-out at equal magnitude, selection bias refuted)
- 4.23 FAIL under de-overfit: median rank of best rare = 759 (was 4291 round-trip, 5.7x improvement with paraphrase)
- 4.24 FAIL (4-domain), held-out component PASS:
    loo_nn_accuracy_all_4 = 0.625 (threshold >=0.65)
    loo_nn_accuracy_heldout_2 = 0.875 (threshold >=0.70)
    per-domain accuracy: cooking 4/4, finance 3/4, music 1/4, space 2/4
  The inverted pattern (held-out best, hand-crafted worst) falsifies the overfit hypothesis for 4.24.

No SUT code changed (per user constraint). Only runner + spec.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
…ned encoder by 30% rel

Runner-only change. Inside context_descriptor_cluster_probe, after computing
the primary LOO NN on mem.context_descriptor, the runner also computes LOO NN
on mem.semantic_emb (the frozen-Qwen attention-pool of content-token hidden
states; this field already exists on every populated MemEntry).

Same ckpt/v344_trained.pt, same v3.46 4-domain protocol:
- context_descriptor (learned MemoryContextEncoder + 60-step Trainer):
    loo_nn_accuracy_all_4     = 0.625 (10/16) -- FAIL
    loo_nn_accuracy_heldout_2 = 0.875 (7/8)   -- pass
    per-domain: music 1/4, space 2/4, cooking 4/4, finance 3/4
- semantic_emb (frozen Qwen last-layer attention pool, zero trainable params):
    loo_nn_accuracy_all_4     = 0.812 (13/16) -- PASS
    loo_nn_accuracy_heldout_2 = 0.875 (7/8)   -- pass
    per-domain: music 3/4, space 3/4, cooking 4/4, finance 3/4

Delta +0.188 absolute (+30% relative). Music domain +0.50.

Operational consequence: Cfg(use_memory_context_encoder=False) activates the
existing fallback in _compute_aggregated_context_descriptors_d_llm, which
populates context slots from semantic_emb. No SUT code change. Next audit
prediction: 4.24 FAIL -> PASS, total 19/26 -> 20/26.

Overall: 19/26 (same total as v3.46; primary criteria unchanged).

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
…diction partially refuted)

Training driver train_v348.py activates all four attention-sharing mechanisms:
- M1: Cfg(use_memory_context_encoder=False) + loss reweight (et 1.5->3.0, sa 3.0->1.0, tsa 0.5->0.1, fs 0.4->0.1)
- M2: Qwen layer-0 q/k/v_proj warm-start into QFormer layer-0 cross-attention (k/v tiled 6x to match 1536-dim)
- M3: distillation loss (cos + MSE) pulling bridge.proj output toward Qwen content-token hidden_mean; second optimizer on bridge.proj params only
- M4: bridge.proj.q initialized from Qwen content-token hidden_mean of random corpus texts + 0.005 noise

Runner change: 4.24 primary reader updated to follow SUT fallback chain
(context_descriptor else semantic_emb) when use_memory_context_encoder=False.
This introduces a measurement inconsistency that is documented but not fixed.

Training: 120 steps, 2685.8s (44.8 min), 22.4 s/step single-threaded.
Final training metrics (vs v3.44-Trained @ 60 steps):
  total_loss:     44.0 -> 17.5  (2.5x deeper)
  recon_loss:      4.8 -> 2.08  (2.3x lower)
  vocab_anchor:  -0.22 -> -0.33 (50% deeper)
  bridge cos(Qwen-pool): new signal, peaked at 0.87, sustained 0.77

Audit: 26 cases, 1423.8s, 19/26 pass. Unchanged from v3.46 and v3.47.

Delta analysis:
  4.24 primary all_4:     unchanged 0.625 (measurement issue in runner)
  4.24 primary heldout_2: 0.875 -> 0.750 (REGRESSION from M3 target mismatch)
  4.24 diagnostic all_4:  0.812 (matches v3.47 prediction, confirms M1 in principle)
  4.23 median rank:       759 -> 1089 (REGRESSION from M2+M3 pulling tail slot toward Qwen mean)

Mechanism diagnosis:
- M1 (disable learned encoder) works structurally: the diagnostic metric reading mem.semantic_emb achieves 0.812/0.875 LOO NN, same as v3.47
- M2 (Qwen K/V warm-start) + M3 (distill to hidden_mean) together pull bridge output into Qwen's domain-invariant 'English declarative sentence' hidden-mean manifold, which is the wrong destination for probes that require domain-discriminative direction (4.23, 4.24 heldout)
- M4 (pool-init queries) neutral
- Net: +1 (M1) - 2 (M2+M3) = -1 vs v3.47 prediction; observed 19/26

Falsifiable next steps (not in this PR):
- Revert M2+M3, keep M1+M4: predicted 20/26
- Change M3 target to WTE-centroid-of-strict-content-starters: predicted >= 20/26
- Fix 4.24 primary reader to uniformly follow SUT fallback: predicted 20/26 on current ckpt

Artifacts: ckpt/v348_stacked.pt (453 MB, not tracked), ckpt/v348_train_log.jsonl,
reports/v348_stacked_blackbox/*.

No SUT code changed (per user constraint).

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
…tution ban

Runner (v331_blackbox_eval.py, context_descriptor_cluster_probe):
- Removes the v3.48 fallback that read mem.semantic_emb when
  mem.context_descriptor was None (i.e., when the SUT is configured
  with Cfg(use_memory_context_encoder=False)). This fallback laundered
  a FAIL-by-API-contract into a numerical-value-lookalike PASS and
  violated SPEC Section 1.1.3 (no audit-time-only code paths).
- Primary metric now reads MemEntry.context_descriptor literally.
  If fewer than 8 entries are populated, status is 'not_implemented'
  (was already so in some paths; now uniformly so for the disabled-
  encoder case).
- Diagnostic block reading semantic_emb is preserved but now clearly
  labelled as non-gating and named mechanism_1_qwen_pool_diagnostic.
  Runs regardless of primary-metric status so mechanism design still
  has data.
- Bumps metric_version to v3.49.

SPEC (V331_BLACKBOX_TEST_SPEC.md):
- Section 4.24 gains a 'Substitution ban (v3.49+)' paragraph that
  explicitly forbids substituting any other MemEntry field for the
  primary metric, and explains why 'follow the SUT's own operational
  fallback chain' is not a valid justification.
- Section 7.9 added: retraction notice for the v3.48 4.24 primary
  metric and for any overall pass count that relied on it.

No SUT change. No mocks. No checkpoint deletions.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
… inter-domain margin / D deterministic save-load / E top1-exclusive bias / F circuit breaker

Target 7 persistent FAILs in v3.48 audit (4.7/4.11/4.13/4.16/4.19/4.23/4.24).

[A] MemoryContextEncoder: replace single orthogonal Linear with 1-layer
    attention pool. Q=learnable Parameter(d_ctx); K,V=Linear(d_LLM, 2*d_ctx)
    over content-token hidden states; residual shortcut via orthogonal
    proj_wte(wte_centroid) at weight 0.3. write() path passes content
    hidden states per-batch.

[B] ContentSemanticTailHead.combine_with_residual: slot_1..n-1 =
    alpha * rare_keyword_residual + beta * LN(tail_head_output), with
    per-slot learnable beta (init 0.3) and LayerNorm on head_out to bound
    magnitude. slot_0 stays pure head_out. New
    Trainer.slot_residual_alignment_loss = relu(floor - cos(slot, residual))
    at floor=0.5.

[C] Inter-domain margin: AMM.maybe_recluster triggers KMeans on
    semantic_emb every mem_recluster_every_writes=4 writes, stamping
    MemEntry.cluster_id. DirectionTree.retrieve and
    AMM.retrieve_multi apply retrieval_crowding_lambda=0.15 penalty to
    cross-cluster entries. Trainer.inter_domain_margin_loss uses same
    KMeans weak labels for fiber-direction margin (same>=0.6, cross<=0.3).

[D] Deterministic save/load: PrefixAligner._calibrated flag prevents
    recalibration; save/load iterate mid-sorted; _sorted_set replaces
    list(set()) on all token-id unions; ContentTokenClassifier exposes
    SHA256 fingerprint, saved+verified on load; store dump includes
    SHA256 fingerprint for double-save stability check.

[E] Content bias top-1 exclusive + rest fallback:
    b = 0.7 * build(top1, floor=0.5) + 0.3 * build(rest, floor=0.2).

[F] CircuitBreaker in MemLLM.generate: records -log P(chosen) per step,
    baseline = first 3 steps mean. 3 consecutive steps above
    1.5 * baseline flip active; 5-step hysteresis. When active,
    mixture_gate ceiling clamped to 0.3 (only affects mixture path if
    use_mixture_decoding enabled).

No runner/spec changes. Same SUT entry via AgentMemorySystem.py.
Ready for v3.49-runner audit on fresh-init + trained-ckpt.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
…ns diagnostic getters

These pre-existing pure tree-topology inspectors are depended on by probes
4.1 (leaf_capacity_stability) and 4.2 (degenerate_direction_boundary).
The rewrite inadvertently dropped them; restored verbatim.

No audit-time-only semantics: max_depth() and leaf_size_violations()
only read existing _Node tree structure, which is the same code path the
SUT uses at runtime (insert/split/rebalance). §1.1.3 clear.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
Total pass: 18/26 (v3.48 stacked-trained was 19/26).
Elapsed: 1519 s on CPU. Deterministic mode active.

Head-to-head vs v3.48:
  UP (+2):   4.24 context_descriptor_cluster_probe (FAIL -> PASS)
             4.16 retrieval_generation_alignment_audit (FAIL -> PASS)
  DOWN (-3): 4.8  degeneration_quality (PASS -> FAIL)
             4.21 decode_repetition_feedback_probe (PASS -> FAIL)
             4.25 prefix_length_scaling_probe (PASS -> FAIL)

FAIL signatures:
  4.24 -> PASS: loo_nn_all_4 = 0.9375 (15/16), heldout = 1.0 (8/8).
    [A] attention-pool ctx encoder with residual shortcut produced the
    intended gain. Primary metric now exceeds v3.48 Qwen-pool diagnostic
    (0.81) on same corpus, under v3.49 no-substitution rule.
  4.16 -> PASS: diagnoses = {aligned:2, bridge_unused:1, retrieval_miss:0}.
    [C] inter-domain margin + crowding prevented the music<->space mix on
    the satellites prompt.
  4.8  -> FAIL: outputs show repetition 'pian pian Chop pian noct pian...'.
    avg_max_repeat=4.33 (>3) and avg_unique_ratio=0.25. [E] top1-exclusive
    content_bias at weight 0.7 + floor 0.5 concentrates mass on the
    dominant memory's top starters, which the repetition guards cannot
    pull apart at this scale.
  4.21 -> FAIL: same repetition cascade (avg_max_repeat_per_content_token
    = 4.33, threshold 3). Downstream of the same [E] concentration.
  4.25 -> FAIL: mass_B/mass_A = 1.065, threshold 1.10. [B] residual-
    dominant tail_slot at fixed alpha=1.5 and beta=0.3 bounds the extra
    mass from doubling L_mem: extra tail slots now contribute mostly
    clamped residual + small beta*LN(head), not free head output, so the
    starter-mass ratio flattens toward 1.0.

Persistent FAILs (unchanged from v3.48):
  4.23 keyword_specific_tail_slot: median_rank = 1402 (was 1089).
    [B] alignment by cosine is not the same as WTE-rank recovery; the
    rank metric still reads the post-LN combined slot, which is near
    residual direction only by cosine, not in the raw logit argmax.
  4.11 retrieval_topk_semantic_shift: both hit counts still 0. prefix
    continues to route to meta-starters, independent of [C]/[E].
  4.13 save_load_consistency: output_a != output_b still differ; [D]
    fingerprint-stable save but generate() stochasticity at bf16 not
    fully pinned.
  4.19 stepwise_label_mass_alignment_audit: label-mass trajectory
    mis-aligned; cascade of 4.11.
  4.7  semantic_memory_counterfactual_pairs: repetition garbage, same
    root cause as 4.8/4.21.

Axes (v3.49 runner reporting):
  A compression: ratio 8.97 < 10 FAIL (ctx_desc added floats)
  B injection:   164224 per-step, O(1) in N, PASS
  C fidelity:    6/11, threshold 9 FAIL
  D stability:   1/3 PASS (save_load + decode_repetition FAIL)

SUT fresh-init; no training; no ckpt. The [A] win validates the
attention-pool mechanism design; the DOWN triplet (4.8/4.21/4.25)
shows [E]/[B] changes overshot without a counterweight on repetition
and mass preservation.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
…write)

[#1] Revert [B] residual-dominant tail-slot decomposition.
  Cfg.tail_slot_residual_dominant: True -> False.
  loss_weights['slot_residual_alignment']: 0.3 -> 0.0.
  In v3.44-rewrite the combine_with_residual path produced
    slot_1 = alpha*residual (L2=1.07) + beta*LN(head_out) (L2=11.76)
  so LN(head_out) dominated the direction.  On fresh init with
  zero-init slot_heads[1], LN(0) reduces to LayerNorm gamma direction
  (uniform), which is far from every rare-keyword WTE direction, so
  4.23 median_rank went to 1402 (v3.48 baseline 1089).
  Disabling the decomposition routes EmbBridge.inject back to the
  additive path: slot_1 = tail_head(fiber) + alpha * residual, which
  in fresh init equals alpha * residual and points by construction
  at the rare-keyword centroid direction.

[#3] Refresh rare_keyword_ids at end of write().
  MemLLM.write() now calls self._refresh_rare_keyword_indices()
  after the last store_mem, so fresh-path and load-path both compute
  rare_keyword_ids via the same algorithm at the same timing.
  Pre-patch: write() left MemEntry.rare_keyword_ids=[] (set by
  store_mem), while load_memory() called _refresh_rare_keyword_indices
  after loading, leaving model_a and model_b with different
  rare_keyword_ids for the same mid -> _compute_rare_keyword_wte_residual
  returned None for model_a (empty lists) and a non-zero tensor for
  model_b, diverging prefix_cond -> 4.13 FAILs by string-inequality
  under greedy decoding.

Diagnostic: diag_4_13_rare_keyword_equiv.py verifies after #3 that all
per-memory fields (base/fiber/dirn/semantic_emb/context_descriptor/
content_token_ids/expanded_content_ids/strict_starter_ids/
rare_keyword_ids) are bit-identical between fresh+save and load on
corpus_general (the corpus 4.13 writes).  The script runs to CLEAN.
This does not guarantee 4.13 will PASS -- it only confirms the known
source is closed.  Remaining sources, if any, live downstream of
MemEntry fields in the bridge / aligner / or backbone path.

No changes to:
  - [A] attention-pool ctx encoder
  - [C] inter-domain margin + cluster crowding
  - [E] top1-exclusive content_bias
  - [F] circuit breaker (still hooked only to mixture_gate ceiling,
    use_mixture_decoding=False by default -> still a dead path)
  - runner
  - SPEC

Scope: exactly two Cfg flags and one call-site added.  Structural
risk: minimal (one is a revert, one is a timing alignment).

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
Targets directly hit:
  4.13 save_load_consistency  : FAIL -> PASS (outputs bit-identical)
  4.25 prefix_length_scaling  : FAIL -> PASS (mass_B/mass_A = 1.543 >= 1.10)

Targets held (no regression from v3.44-rewrite):
  4.24 context_descriptor_cluster_probe: PASS (0.9375 / 1.0)
  4.16 retrieval_generation_alignment_audit: PASS

Targets still FAIL (same as v3.44-rewrite, unaddressed by #1/#3):
  4.23 keyword_specific_tail_slot_probe: median_rank=1402, hit=0
  4.8 / 4.21 / 4.7  : decoder repetition triple (will be addressed by #2)
  4.11 / 4.19       : prefix-token-class mismatch (will be addressed by #5)

Surprising finding on 4.23:
  The diagnostic dump (diag_4_23_slot_direction.py) reveals that
  bridge._last_tail_slots read by 4.23 does NOT come from prefix_cond -
  it comes from the SECOND inject call inside _build_contrastive_uncond_prefix,
  which is called with rare_keyword_wte_residual=None.  This overwrites
  _last_tail_slots and _last_residual with the uncond contrastive prefix's
  values.  The probe has been reading the uncond tail since at least v3.42.
  This is a pre-existing diagnostic-buffer aliasing bug, not a change-#1
  regression.  It explains why v3.48 (median_rank=1089) and v3.45
  (median_rank=1402) both point at whitespace/punct - both are reading
  tail slots that were rebuilt without rare-keyword residual.
  Fix belongs in a separate PR (write residual to a second buffer in
  cond path, or snapshot bridge._last_tail_slots before uncond inject).

axis_coverage under v3.49 runner reporting:
  A compression   : ratio 8.97 (< 10)     FAIL
  B injection     : 164224 floats, O(1)   PASS
  C fidelity      : 7/11 (threshold 9)    FAIL
  D stability     : 2/3 (4.21 FAIL)       FAIL

elapsed: 1508 s on CPU, AMS_DETERMINISTIC=1, fresh init.

This audit validates:
  - #1 revert did not regress anything and recovered 4.25 (predicted by
    the plan's 'LN-bounded extra slot mass' magnitude calculus).
  - #3 refresh timing alignment recovered 4.13 (predicted by the plan's
    'rare_keyword_ids fresh-vs-load asymmetry' mechanism).

This audit does not validate:
  - any claim about 4.23 reachability; 4.23 has a pre-existing aliasing
    bug that the current plan's change #2 ([B] replacement) cannot fix
    because the replacement would still be overwritten by the uncond
    inject call.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
Problem: MemLLM.prepare_decode_context calls EmbBridge.inject twice --
once for prefix_cond (with rare_keyword_wte_residual=residual), and then
in _build_contrastive_uncond_prefix a second time with rare_keyword_wte_residual=None.
Both writes go to the same buffers bridge._last_tail_slots, _last_residual,
etc, so the second call clobbers the first.  Case 4.23 reads
bridge._last_tail_slots AFTER prepare_decode_context returns and therefore
always sees the uncond prefix's tail slot, which by construction carries
no rare-keyword signal.  Observed: top-5 = [' ', ',', '.', ' (', '1'] on
both v3.44-rewrite (median_rank=1402) and v3.48 (median_rank=1089);
neither number tells us anything about whether the cond-path tail
carries rare-keyword information.

Minimal fix, strict scope:

SUT (scheme_b_v344.py):
  - EmbBridge.__init__: add _last_cond_fiber_summary / _last_cond_tail_slots /
    _last_cond_context_slot / _last_cond_tail_pre_renorm / _last_cond_residual /
    _last_cond_inject_diag (all None or {}).
  - EmbBridge.inject signature: + is_cond_path: bool = True
  - EmbBridge.inject epilogue: when is_cond_path=True, mirror
    self._last_* into self._last_cond_*.  When False, only the shared
    _last_* are written (unchanged).
  - MemLLM._build_contrastive_uncond_prefix: pass is_cond_path=False on
    its inject call.  Default True everywhere else covers training and
    the main prefix_cond path.

Runner (v331_blackbox_eval.py):
  - keyword_specific_tail_slot_probe: add local helper
    _get_tail_slots_cond_preferred that returns bridge._last_cond_tail_slots
    if present, else bridge._last_tail_slots.  Used in both paths
    (roundtrip and paraphrase).
  - Emit 'tail_slots_source' in the probe return payload so the audit
    report records which buffer was actually read.
  - metric_version bumped to v3.50 to mark the source change.

No Cfg change.  No algorithm change.  No SPEC change.  Training path
untouched (defaults to is_cond_path=True, which mirrors to _last_cond_*;
since audit probes always re-run prepare_decode_context before reading,
training-time mirror state is never observed by audit code).

Pre-audit verification (diag_4_23_cond_buffer.py):
  query 1: She performed Beethoven sonatas with delicate phrasing...
    _last_tail_slots slot_1 L2=0.0000       top5=[' ', ',', '.', ' (', '1']
    _last_cond_tail_slots slot_1 L2=1.0251  top5=[' control', ' Control', '控制', 'control', 'Control']
    rank of 'control' = 1         (was 1402)
    top20 ∩ rare_dom = {2524}      size=1
  query 2: Harmonic analysis and ear training...
    same pattern, rank of 'control' = 1

This is sufficient to make 4.23 measurable.  Whether 4.23 PASSes under
the primary metric is now a function of the cond-path algorithm, not
of which buffer the probe happens to read.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
… regressions)

4.23 FAIL -> PASS.  Primary metric numbers under the corrected buffer:
  tail_slots_source = bridge._last_cond_tail_slots   (new)
  mean_intersection_size_top20_paraphrase = 1.0      (threshold >= 1.0)
  median_rank_of_best_rare_paraphrase = 1.0          (threshold <= 100.0)
  hit_ratio_at_least_one_top20_paraphrase = 1.0      (threshold >= 0.5)
  n_paraphrase_queries_evaluated = 2

This matches the pre-audit diag_4_23_cond_buffer.py output:
  rank of ' control' = 1 on both paraphrases
  top-5 centered = [' control', ' Control', '控制', 'control', 'Control']
  top20 intersect rare_dom = {2524}

The result validates the causal claim made when the aliasing bug was
identified in the v3.45-revertB-refreshD audit: reverting [B] (cfg
tail_slot_residual_dominant=False) was a prerequisite for 4.23
reachability, but the uncond-inject buffer clobber was blocking the
measurement entirely.  Both together are required.

axis coverage v3.49 runner reporting:
  A compression: 8.97 / 10.0     FAIL
  B injection:   164224 per-step  PASS  (O(1) in N)
  C fidelity:    8/11 / 9         FAIL  (was 7/11, 4.23 added)
  D stability:   2/3               FAIL  (4.21 still FAIL)

Remaining FAILs, unchanged from the prior audit:
  4.7  semantic_memory_counterfactual_pairs  (repetition garbage)
  4.8  degeneration_quality                   (repetition, same root as 4.7)
  4.11 retrieval_topk_semantic_shift          (prefix to meta-starter mismatch)
  4.19 stepwise_label_mass_alignment_audit    (cascade of 4.11)
  4.21 decode_repetition_feedback_probe       (repetition, same root as 4.7/4.8)

These five are the cases that plan #2 (narrow E) and #5 (rare_keyword
floor) were designed to address.  They are independent of the 4.23
fix in this PR.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
Single Cfg flip:
  use_top1_exclusive_content_bias: True -> False

Rationale (from v3.45-cond-buffer post-audit review):
  [E] was the sole cause of the 4.7 / 4.8 / 4.21 regressions introduced
  in v3.44-rewrite. With top1_weight=0.7 and top1_relevance_floor=0.5
  enabled the content_bias on ~8 top-1 tokens reached ~+22 logit while
  content_repeat_penalty=2.5 only wins the decode race at k>=10, but
  cyclic_content_max_count=5 hard-masks at k=5, leaving 5 steps where
  the bias outran anti-repetition.  Observed output:
    'The pianist practiced pian pian Chop pian noct pian midnight Chop Chop noct'

  [E] was originally credited (alongside [C]) with the 4.16 flip from
  v3.48 to v3.44-rewrite.  Re-examination of 4.16's diag
  (retrieval_miss=0, retrieved_majority correct on space prompt) shows
  the flip is entirely attributable to [C] cluster-crowding at the
  retrieval stage, which does not depend on [E].  [E] was adding
  concentration on top of an already-fixed retrieval, and the
  concentration broke the decode-race balance.

Revert restores the aggregated top-k path that v3.48 and v3.42 used,
both of which PASSed 4.7 / 4.8 / 4.21.  This is a revert, not a new
mechanism, and it does not touch:
  - [A] attention-pool ctx encoder (4.24 carrier)
  - [C] inter-domain margin + retrieval crowding (4.16 carrier)
  - [D] write-time rare_keyword refresh (4.13 carrier)
  - [B-revert] combine_with_residual disabled (4.23 + 4.25 carriers)
  - [v3.45 cond-buffer] cond-path inject mirror (4.23 measurability)

4.11 / 4.19 are not addressed in this revert.  Fresh-init prefix cannot
transmit lexical content through 28 Qwen layers at a magnitude above
the modal-starter baseline without training; that is a channel-level
pre-training gap, not a Cfg-level fix.  If 4.11 / 4.19 remain FAIL
under this revert, the plan per SPEC Section 7.7 is to report them as
'pre-training gap on axis C' rather than add test-directed decode-time
mechanisms.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
Revert [E] was a structural cleanup (removing a test-directed Cfg
addition from v3.44-rewrite) but did not move any primary metric
number in the fresh-init audit.  All 5 FAILs from v3.45-cond-buffer
remain FAIL with matching signatures.

Prediction postmortem:
  I claimed 'v3.48 baseline had 4.7/4.8/4.21 all PASS under
  top1_exclusive=False, so the revert returns to a known-viable
  point'.  This is wrong: v3.48 was 120-step trained, v3.46 is
  fresh init.  Same Cfg, different model state.

The fresh-init repetition signature (4.7/4.8/4.21):
  4.8  avg_unique_token_ratio = 0.343  (threshold >= 0.35, diff 0.007)
  4.21 avg_max_repeat = 4.67           (threshold <= 3, diff 1.67)
  4.7  music_margin = space_margin = 0 (threshold > 0)

Root cause identified: fresh init has tail_head.slot_heads[1] =
zero-init and vocab_proj = zero-init.  Both are the learned
dilution mechanisms that would distribute content_bias mass across
the vocabulary during decode.  Without them, aggregated top-k
content_bias concentrates on ~12 keywords (music/space corpus word
set) and content_repeat_penalty = 2.5 * k only wins at k >= 6-7,
while generation length is 20 steps -- the race stays locked in
repetition.  This is not a Cfg-level bug; it is what an untrained
channel looks like.

Confirmed: 4.16 is carried entirely by [C] cluster-crowding, not
by [E].  After [E] revert, 4.16 still PASSes (retrieval_miss=0),
which matches the v3.44-rewrite diag (aligned:2, retrieval_miss:0,
bridge_unused:1).

All five remaining FAILs (4.7 4.8 4.11 4.19 4.21) are identified
as pre-training gaps:
  axis C pre-training gap: 4.7 4.8 4.11 4.19
  axis D pre-training gap: 4.21 (repetition race same root as above)

This branch is the cleanest fresh-init starting point for running
the Trainer: [A] attn-pool + [C] cluster-crowding + [D] refresh-
timing + [B-revert] + cond-buffer, without the v3.44-rewrite [E]
test-directed addition.

GPU status: not available in this cloud agent VM.  Training is
blocked on a GPU-enabled instance being configured for this agent
at cursor.com/onboard.  Audit iteration on fresh init has reached
its ceiling at 21/26.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
Self-contained context document for a new cloud agent with a
GPU-enabled instance to pick up from where this CPU-only sprint
finishes.  Covers:

- Current state (v3.46, 21/26 fresh-init ceiling, which mechanisms
  are carriers of which audit passes)
- Sprint timeline (v3.44-rewrite -> v3.45-revertB-refreshD ->
  v3.45-cond-buffer -> v3.46) with branch names, PR numbers,
  audit deltas, and per-change root cause
- Five prediction errors made during the sprint, categorized into
  unit mismatch / scope mismatch / magnitude blindness /
  regression blindness / dead-path errors
- Three anti-patterns to avoid (threshold chasing, decode-time
  metric patching, dead-Cfg-path mechanisms)
- Five remaining FAILs (4.7 / 4.8 / 4.11 / 4.19 / 4.21) root-caused
  to two zero-init dilution paths (tail_head.slot_heads[1] and
  vocab_proj.proj[-1]) that only training can activate
- Training protocol: train_v346.py skeleton, checkpoint location,
  audit re-run command sequence, what NOT to change post-training
- Explicit list of open PRs (#23-#27) and suggested child-branch
  naming for the GPU agent
- Sanity prompts to run before starting training
- Scope limits: no Delta prediction, no 'channel works' phrasing,
  no post-hoc Cfg tuning unless it is a revert with structural
  justification

No SUT/runner/SPEC changes in this commit.  Pure documentation.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
cursor Bot pushed a commit that referenced this pull request Apr 22, 2026
Child PR of #27.  Training driver train_v346.py run for 60 steps on NVIDIA H200
(vast.ai), elapsed 335 s, mechanism observables per \u00a75.6 moved into target
range (tail_head slot1 |w|_mean: 0 -> 7.30e-4;  vocab_proj |w|_mean: 0 -> 5.49e-4,
both in [1e-4, 1e-2]).  Necessary conditions met; sufficient: not.

Audit with AMS_TRAINED_WEIGHTS=ckpt/v346_trained.pt, AMS_DETERMINISTIC=1,
elapsed 1250 s.

Results (as data, per SPEC \u00a77.7 norm, no Delta-pass-count was predicted):
  PASS 18, FAIL 8 (was 21, 5).
  Zero cases flipped FAIL -> PASS.
  Three cases flipped PASS -> FAIL:
    4.17 retrieval_prefix_decode_correlation_audit  (prefix_l2_shift = 3.22e+11,
         correlation undefined -- trained prefix magnitude blew up)
    4.20 rerank_stability_probe                     (space_P2 jaccard 0.429 < 0.6)
    4.25 prefix_length_scaling_probe                (L_mem 8->16 reduces starter
         mass to 0.82x, probe requires >1.10x)
  Regressions 4.8/4.21 also got worse: 'The pianist' unique_ratio 0.343 -> 0.296,
  avg_max_repeat 4.67 -> 5.0.  Axis C: 8/11 -> 6/11.  Axis D: 2/3 -> 1/3.

Structural read (\u00a71.5): 60 steps on 12-text corpus with semantic_alignment
weight 3.0 and no prefix-norm constraint caused the ctx encoder to saturate
prefix magnitude while tail/vocab paths gained just enough weight to reinforce
the corpus's own repetition pattern.  This is \u00a75.7 option-A territory
(pre-amplification gap) confirmed with data rather than predicted.

Artifacts committed:
  reports/v346_trained_blackbox/report.{json,md}
  reports/v346_trained_blackbox/stdout.log
  reports/v346_trained_blackbox/train_log.jsonl
  reports/v346_trained_blackbox/train_stdout.log

No Cfg changes (\u00a75.4), no Trainer loss additions (\u00a75.4).  ckpt/v346_trained.pt
is git-ignored per existing ckpt/*.pt rule; provenance recorded in the torch.save
blob and in report metadata.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants