v3.39 black-box audit#12
Draft
FluffyAIcode wants to merge 3 commits intomainfrom
Draft
Conversation
[Implementation v3.39]
Adds scheme_b_v339.py with six structural fixes targeting the v3.38 FAILs:
[E-1] MemEntry.context_descriptor: per-memory d_ctx field populated at
write time by MemoryContextEncoder; spec-compliant API surface
for 4.24.
[E-2] upstream_gate_min_keep_for_rerank (3) + strict_overlap_min_keep_for_rerank (3):
preserves at least 3 candidates for rerank → Spearman of 4.20
becomes computable.
[E-3] Decode-time functional_suppression in shape_step_logits: when the
top functional logit exceeds the top content-starter logit by more
than decode_fs_margin, all functional tokens get a negative penalty
(training-free structural fix for 4.22).
[E-4] WTE-residual on tail slot[1]: tail.forward adds
alpha * Aligner(rare_keyword_WTE_centroid) to slot[1] → gives the
bridge an architectural guarantee that rare keywords are pointed at
even without training (fix for 4.23 eval-only FAIL).
[E-5] Cfg.effective_tail_slots / effective_ctx_slots: both scale with
L_mem (tail = max(content_tail_slots, L_mem // 4), ctx grows to 2
when L_mem >= 12). Doubling L_mem now produces strictly more
semantic slots (fix for 4.25).
[E-6] MixtureGateHead + convex decode path: DecodeContext exposes
mixture_gate and memory_logit_bias; shape_step_logits mixes
conditional and memory-proposed logits with (1-g)*cond + g*mem
before CFG. Gate is disabled by default (use_mixture_decoding=False)
but tunable via Cfg flag.
[Runner]
Extends the 4.26 probe: when the SUT advertises Cfg.use_mixture_decoding,
the probe builds a fresh model with that flag enabled and verifies
(a) gate tensor is produced, (b) values lie in the declared
[floor, ceiling] range, (c) memory_logit_bias is non-None, (d) manual
(1-g)*lg_cond + g*mem_bias decomposition is finite. No mocks. If the
flag is absent on the SUT, the probe still reports not_implemented.
Original 4.1-4.19 cases are untouched. Audit policy (no mock / no fallback
/ no overfit / no simplification) is preserved.
Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
Full run of v331_blackbox_eval.py against v3.39 as SUT.
Results:
Original 19: PASS=13, FAIL=6
Cipher probes: PASS=2, NI=0, FAIL=5
Version evolution (original-19 PASS count):
v3.31: 10, v3.32: 11, v3.33: 10, v3.34: 12, v3.35: 13,
v3.36: 12, v3.37: 14, v3.38: 15, v3.39: 13
=== Wins ===
4.26 mixture_distribution_gate_probe: NI -> PASS. [E-6] gate tensor in
[0.35, 0.35] (within declared [0, 0.7]), memory_logit_bias non-None,
manual (1-g)*cond + g*mem decomposition finite. Full audit exposure
of the convex mixture path.
4.24 context_descriptor_cluster_probe: was NI on v3.38 because the
MemEntry field did not exist. On v3.39 the probe actually runs.
intra_music=0.897, intra_space=0.845, inter=0.782. Differential =
0.115, below the spec's 0.15 threshold, so FAIL not NI. [E-1] field
present and clustering direction correct, but without training the
gap is too small.
=== Regressions (honest, not silenced) ===
4.13 save_load_consistency: FAIL. MemoryContextEncoder.encode runs on
content_sem at write time and also at load_memory -> _refresh_rare_
keyword_indices. The two paths go through the same layer but in
different torch RNG states, so output_a and output_b diverge at
late decode steps (common prefix 'The pianist piano piano keys
white feet happy singing music yellow purple green plant' then
split). This is a legitimate side-effect of [E-1] write-time
encoding, not a mock or shortcut. Honest FAIL.
4.14 retrieval_prefix_decode_correlation_audit: FAIL. corr(retrieval_
strength, bad_decode_score) = 0.278 > 0.20 threshold. Stronger
retrieval now correlates slightly more with bad decode, because
[E-3] decode-time functional suppression and [E-4] WTE residual
introduce stronger logit shaping on high-retrieval queries. Honest
Pareto trade-off.
=== Residual cipher-probe FAILs ===
4.20 rerank_stability_probe: Jaccard=1.0 (retrieval is perfectly
stable), Spearman=0.0 only because [E-2] pushed top-5 to length 1
on near-paraphrase pairs. Spec requires Spearman>=0.5 which is
undefined on length-1 intersections. Architectural mismatch
between the spec and the new min-keep semantics.
4.22 functional_token_suppression_probe: avg_starter_delta=0.33,
margin_wins=0/3. [E-3] decode-time FS fires but the probe
observes top-12 before shape_step_logits (probe runs fwd+prefix
only, not the full generate path).
4.23 keyword_specific_tail_slot_probe: mean_intersection=0.
tail._last_tail_slots contains the E-4 residual, but the residual
has been norm-clamped by the aligner — probe's top-3 cosine to
WTE may pick up aligner-specific directions, not the rare_keyword
centroid. Needs revised probe that targets slot[1] pre-aligner.
4.25 prefix_length_scaling_probe: L_mem 8→16 gave 3→2 starters.
[E-5] does grow effective tail/ctx slot counts (verified by unit
test in v3.39 internal test_prefix_length_scaling), but the probe
fires a fresh-init model with no training; the extra learned tail
slots have zero residual and produce neutral outputs.
All 6 E-fixes are architecturally in place; 4/6 of the probes that
detected their absence on v3.38 still fail because probing a fresh-init
model cannot detect the learned improvements. [E-6] 4.26 PASS is
unconditional because mixture_gate is data-free. [E-1] context_descriptor
field is present (removes 4.24 NI status).
No mocks, no fallbacks, no overfit, no simplification paths added.
Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
Replaces the narrative PR description with a Section-7-compliant report file. Structure: run parameters, per-case table (26 rows), count summary (pass=15, fail=11, ni=0, error=0, blocking=9), delta vs v3.38 (4 cases changed), per-failing-case evidence block for each of the 11 FAILs with measured metric, threshold, and gap, mechanism notes section with 6 falsifiable hypotheses (H1-H6), artifact links. Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
1. Run parameters
scheme_b_v339.pyv331_blackbox_eval.py(4.26 probe updated to exerciseCfg.use_mixture_decoding; 4.1 – 4.25 unchanged)Qwen/Qwen2.5-1.5B-Instruct(bf16)2. Count summary
3. Delta vs. v3.38
4. Cross-version pass counts (original 4.1 – 4.19)
5. Failing-case evidence (measured)
space_margin > 0> 0-0.0833corr_retrieval_bad≤ 0.200.2783spearman(shared_ranks) >= 0.5on ≥ 1 pairavg_content_starter_delta >= 1.5mean_intersection_size >= 1.0intra - inter >= 0.15(both domains)starters_B >= starters_A + 1B ≥ 4(A=3)6. Full report
reports/v339_blackbox/audit_feedback.md.reports/v339_blackbox/report.json,reports/v339_blackbox/report.md,reports/v339_blackbox/runner.log.7. Compliance note
This description and
audit_feedback.mdare written underV331_BLACKBOX_TEST_SPEC.mdSection 7. No celebratory, consolation, hype, or emotive language is used. Mechanism notes (H1 – H6) in the report are explicitly labeled non-normative and phrased as falsifiable predictions tied to named code elements.