Problem this solves
Two of the coach's hard requirements pull in opposite directions today:
Grounding needs numbers in the text. The grounder binds citations by matching numeric spans in the answer against canonical data (GROUND-R2). No number in the prose → nothing to bind → citations=0.
The voice rules push numbers OUT. The coach prompt says "FIRST sentence never a number", "never write ctl/atl/...", "keep numbers in the background". A faithful model then writes warm, number-free prose — and grounding has nothing to verify.
The result (observed live, deepseek-v4-flash): a data-rich athlete asks about training load and gets confident prose with zero cited numbers. The engine's own e2e_smoke grounded-answer oracle passes only intermittently, and the false-degraded of #85 rides on the same collision. The single answer string is being asked to be BOTH the machine-verifiable artifact AND the gentle human message, and it cannot do both well.
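The collision can be reproduced in a few lines. The following is a minimal sketch, assuming the grounder binds a citation whenever a numeric span in the answer equals a canonical value; `bind_citations`, the `CANONICAL` dict and its contents are hypothetical names for illustration, not the engine's actual API.

```python
import re

# Hypothetical canonical data: metric name -> value (illustrative only).
CANONICAL = {"ctl": 5.7, "atl": 4.8}

# Matches plain integers and decimals such as "42" or "5.7".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def bind_citations(answer, canonical):
    """Return (numeric span, metric) pairs for numbers that match canonical data."""
    citations = []
    for match in NUMBER.finditer(answer):
        value = float(match.group())
        for metric, canon_value in canonical.items():
            # Exact equality keeps the sketch short; a real grounder
            # would presumably allow a rounding tolerance.
            if value == canon_value:
                citations.append((match.group(), metric))
                break
    return citations


numeric = "Your fitness is 5.7 and fatigue is 4.8."
warm = "You've been building steadily and fatigue stays manageable."

print(bind_citations(numeric, CANONICAL))
print(bind_citations(warm, CANONICAL))
```

The number-free answer follows the voice rules faithfully and still yields an empty citation list, which is the `citations=0` outcome described above.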
Proposed solution: separate the verifiable layer from the shown layer
The model emits a technical/evidence layer (numbers + reasoning, read by the grounder and every downstream checker) and a visible layer (warm prose for the athlete). The technical layer is stripped before the answer reaches the user.
Owner's framing — an inline tagged block:
<technical_proof>fitness is 5.7 (ctl, as_of 2026-06-15); fatigue 4.8 — basis for the "building steadily" read</technical_proof>
You've been building steadily — your fitness is climbing while fatigue stays manageable.
Everything inside <technical_proof> is consumed by grounding/binding/observability and cut at the athlete-facing surface.
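A minimal sketch of the cut, assuming a strict policy in which any malformed tag or leftover internal term causes a refusal instead of a partial strip. `split_layers`, `StripError` and the `INTERNAL_TERMS` list are hypothetical names chosen for illustration.

```python
import re

OPEN, CLOSE = "<technical_proof>", "</technical_proof>"
BLOCK = re.compile(re.escape(OPEN) + r"(.*?)" + re.escape(CLOSE), re.DOTALL)

# Hypothetical, deliberately short list of internal terms that must
# never reach the athlete.
INTERNAL_TERMS = re.compile(r"\b(ctl|atl|as_of)\b", re.IGNORECASE)


class StripError(ValueError):
    """Raised when the raw answer cannot be stripped safely."""


def split_layers(raw):
    """Split raw model output into (proof blocks, visible text), or raise."""
    if raw.count(OPEN) != raw.count(CLOSE):
        raise StripError("unbalanced technical_proof tags")
    proofs = BLOCK.findall(raw)
    if any(OPEN in proof for proof in proofs):
        raise StripError("nested technical_proof tags")
    visible = BLOCK.sub("", raw).strip()
    # Any angle bracket left behind is treated as a possible tag fragment.
    # This also rejects harmless prose containing "<" or ">", by design.
    if "<" in visible or ">" in visible:
        raise StripError("residual tag fragment in visible text")
    if INTERNAL_TERMS.search(visible):
        raise StripError("internal term in visible text")
    return proofs, visible


raw = (
    "<technical_proof>fitness is 5.7 (ctl, as_of 2026-06-15); "
    "fatigue 4.8</technical_proof>\n"
    "You've been building steadily."
)
proofs, visible = split_layers(raw)
print(proofs)
print(visible)
```

Raising on doubt means the caller has to choose a fallback (scrub or refuse), but the raw technical layer never reaches the athlete by accident.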
Design constraints (must be in the spec before code)
Grounding covers BOTH layers. Any number shown to the athlete in the visible prose must itself be grounded against canonical — the proof block does not get to "vouch" for an unverified visible number. A proof saying 42 while the visible text says 48 is a grounding failure, not a feature. The visible layer is what the athlete believes; it cannot be less verified than the technical one.
Fail-closed stripping (VOICE-R2). The technical block is full of exactly the internal jargon VOICE-R2 forbids in athlete output (ctl/atl/grounding/as_of/...). Stripping MUST be fail-closed: a malformed/unclosed/nested tag, or any residual tag-or-jargon after the cut, MUST scrub or refuse — never leak the raw technical layer. A hidden channel in LLM output is a classic leak vector.
Structured, not in-band, is safer. Raw in-text tags are fragile (model forgets to close, nests, mislabels, switches language). The robust form of the same idea is a structured output — e.g. {visible_answer, evidence_claims[]} — validated at the tool-call layer (STRUCT-R5 already mandates structured claim extraction). 2026 reasoning models already expose a separate reasoning channel and MODEL-R5a(b) already tells the engine to read the ANSWER, not the reasoning trace — so the infra for a two-layer answer is partly here. Tags-in-text is the owner's intuition; structured fields are its hardened implementation. (Spec discussion should pick one; structured is recommended.)
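The structured form, together with the rule that visible numbers must themselves be grounded, might look like the sketch below. It assumes a payload shaped like `{visible_answer, evidence_claims[]}`; the `EvidenceClaim` fields, the `validate` function and the `weekly_km` metric are hypothetical and are not taken from the spec.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceClaim:
    metric: str   # hypothetical field: canonical metric name
    value: float  # hypothetical field: claimed value


@dataclass(frozen=True)
class CoachAnswer:
    visible_answer: str
    evidence_claims: tuple  # tuple of EvidenceClaim


NUMBER = re.compile(r"\d+(?:\.\d+)?")


def validate(answer, canonical):
    """Return a list of grounding problems; an empty list means the answer passes."""
    problems = []
    grounded_values = set()
    for claim in answer.evidence_claims:
        # Exact equality for brevity; a real check would likely use a tolerance.
        if canonical.get(claim.metric) == claim.value:
            grounded_values.add(claim.value)
        else:
            problems.append(f"claim {claim.metric}={claim.value} does not match canonical")
    # Visible numbers are checked too: a valid claim cannot vouch
    # for a different number shown to the athlete.
    for token in NUMBER.findall(answer.visible_answer):
        if float(token) not in grounded_values:
            problems.append(f"visible number {token} has no grounded claim")
    return problems


canonical = {"weekly_km": 42.0}  # illustrative value only
claims = (EvidenceClaim("weekly_km", 42.0),)

good = CoachAnswer("You covered 42 km this week, nicely done.", claims)
bad = CoachAnswer("You covered 48 km this week, nicely done.", claims)

print(validate(good, canonical))
print(validate(bad, canonical))
```

Because the fields are separate, there is no tag to forget or nest, and the proof-says-42, visible-says-48 case surfaces as an explicit validation problem.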
Plan (order matters — spec first)
Spec first. Edit doc 50 where the coach prompt says "don't lead with a number / never write codes" → reframe to "numbers + reasoning go INSIDE the technical layer; the visible layer is warm prose", and teach the grounding/voice requirements (GROUND-R2/R3, VOICE-R2, STATUS-R1) how the technical layer is read and how the visible layer is still fully grounded and fail-closed-stripped. The graders/checkers must be spec'd to read the tags and know why they exist.
Then code to the amended spec: compose emits two layers; grounder verifies the visible layer's numbers against canonical using the technical layer as the candidate-claim source; a fail-closed stripper removes the technical layer at the API boundary.
terminal_status couples to real grounded substance (visible, verified numbers), and the eval grounder fixtures become realistic (survivors only when claims actually ground), so the false-degraded disappears naturally.
Acceptance
e2e_smoke grounded-answer oracle reliably GREEN: a data-rich athlete gets warm prose AND verified numbers, citations>0, every shown number canonical-matched.
No technical-layer text or internal jargon ever appears in athlete-facing output (fail-closed strip test).
False-degraded resolved as a consequence.
Raised by Opus 4.8 from the live onboarding loop; blocks #85.