Two-layer coach answer: verifiable technical/evidence layer (read by grounding) + warm visible prose (shown to athlete), fail-closed split #87

@bepcyc

Problem this solves

Two of the coach's hard requirements pull in opposite directions today:

  1. Grounding needs numbers in the text. The grounder binds citations by matching numeric spans in the answer against canonical data (GROUND-R2). No number in the prose → nothing to bind → citations=0.
  2. The voice rules push numbers OUT. The coach prompt says "FIRST sentence never a number", "never write ctl/atl/...", "keep numbers in the background". A faithful model then writes warm, number-free prose — and grounding has nothing to verify.
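To make the collision concrete, here is a minimal sketch of numeric-span binding. It is not the engine's grounder: `bind_citations`, the `CANONICAL` dict, and the tolerance are all assumptions made for illustration. It only shows why number-free prose leaves nothing to bind.

```python
import re

# Hypothetical canonical store (metric -> value); the real store is not shown in this issue.
CANONICAL = {"ctl": 5.7, "atl": 4.8}

# Plain decimal numbers; a real grounder would need to handle dates, units, ranges.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def bind_citations(answer: str, canonical: dict[str, float], tol: float = 0.05) -> list[tuple[str, str]]:
    """Return (numeric span, canonical key) pairs for numbers that match canonical data."""
    citations = []
    for match in NUMBER.finditer(answer):
        value = float(match.group())
        for key, truth in canonical.items():
            if abs(value - truth) <= tol:
                citations.append((match.group(), key))
                break  # bind each span to at most one canonical value
    return citations


numeric = "Your fitness is 5.7 and fatigue is 4.8."
warm = "You've been building steadily and fatigue stays manageable."

print(len(bind_citations(numeric, CANONICAL)))  # 2
print(len(bind_citations(warm, CANONICAL)))     # 0
```

The second call is the failure mode described above: the prose is faithful to the voice rules, and the binder has no span to work with.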

The result (observed live, deepseek-v4-flash): a data-rich athlete asks about training load and gets confident prose with zero cited numbers. The engine's own e2e_smoke grounded-answer oracle passes only intermittently, and the false-degraded status reported in #85 stems from the same collision. A single answer string is being asked to be BOTH the machine-verifiable artifact AND the gentle human message, and it cannot do both well.

Proposed solution: separate the verifiable layer from the shown layer

The model emits a technical/evidence layer (numbers + reasoning, read by the grounder and every downstream checker) and a visible layer (warm prose for the athlete). The technical layer is stripped before the answer reaches the user.

Owner's framing — an inline tagged block:

<technical_proof>fitness is 5.7 (ctl, as_of 2026-06-15); fatigue 4.8 — basis for the "building steadily" read</technical_proof>
You've been building steadily — your fitness is climbing while fatigue stays manageable.

Everything inside <technical_proof> is consumed by grounding/binding/observability and cut at the athlete-facing surface.
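As a rough illustration of the inline form, the sketch below separates one well-formed block from the visible prose. `split_layers` is a hypothetical helper, and it covers the happy path only: it does not enforce the fail-closed behaviour required under the design constraints.

```python
import re

# Matches a single <technical_proof>...</technical_proof> block, across newlines.
BLOCK = re.compile(r"<technical_proof>(.*?)</technical_proof>", re.DOTALL)


def split_layers(raw: str) -> tuple[str, str]:
    """Return (technical, visible) for an answer carrying one well-formed block.

    Happy path only: malformed input is NOT rejected here.
    """
    match = BLOCK.search(raw)
    if match is None:
        return "", raw.strip()
    technical = match.group(1).strip()
    visible = (raw[:match.start()] + raw[match.end():]).strip()
    return technical, visible


raw = (
    "<technical_proof>fitness is 5.7 (ctl, as_of 2026-06-15); fatigue 4.8</technical_proof>\n"
    "You've been building steadily."
)
technical, visible = split_layers(raw)
print(technical)  # fitness is 5.7 (ctl, as_of 2026-06-15); fatigue 4.8
print(visible)    # You've been building steadily.
```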

Design constraints (must be in the spec before code)

  1. Grounding covers BOTH layers. Any number shown to the athlete in the visible prose must itself be grounded against canonical — the proof block does not get to "vouch" for an unverified visible number. A proof saying 42 while the visible text says 48 is a grounding failure, not a feature. The visible layer is what the athlete believes; it cannot be less verified than the technical one.
  2. Fail-closed stripping (VOICE-R2). The technical block is full of exactly the internal jargon VOICE-R2 forbids in athlete output (ctl/atl/grounding/as_of/...). Stripping MUST be fail-closed: a malformed/unclosed/nested tag, or any residual tag-or-jargon after the cut, MUST scrub or refuse — never leak the raw technical layer. A hidden channel in LLM output is a classic leak vector.
  3. Structured, not in-band, is safer. Raw in-text tags are fragile (model forgets to close, nests, mislabels, switches language). The robust form of the same idea is a structured output — e.g. {visible_answer, evidence_claims[]} — validated at the tool-call layer (STRUCT-R5 already mandates structured claim extraction). 2026 reasoning models already expose a separate reasoning channel and MODEL-R5a(b) already tells the engine to read the ANSWER, not the reasoning trace — so the infra for a two-layer answer is partly here. Tags-in-text is the owner's intuition; structured fields are its hardened implementation. (Spec discussion should pick one; structured is recommended.)
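Constraint 2 can be sketched as follows, for the inline-tag form. Everything here is an assumption for illustration: the `JARGON` list stands in for whatever VOICE-R2 actually enumerates, `REFUSAL` is a placeholder message, and treating an answer with no block at all as acceptable is a choice the spec would have to make.

```python
import re

OPEN, CLOSE = "<technical_proof>", "</technical_proof>"
# Hypothetical stand-in for the VOICE-R2 forbidden-term list.
JARGON = re.compile(r"\b(ctl|atl|as_of|grounding)\b", re.IGNORECASE)
# Placeholder refusal text; the real athlete-facing fallback is not defined in this issue.
REFUSAL = "Sorry, I couldn't prepare that answer. Please try again."


def strip_fail_closed(raw: str) -> str:
    """Return the visible layer, or REFUSAL if anything about the cut looks wrong."""
    opens, closes = raw.count(OPEN), raw.count(CLOSE)
    # Unclosed, stray-close, nested, or repeated blocks all refuse.
    if opens != closes or opens > 1:
        return REFUSAL
    visible = raw
    if opens == 1:
        start, end = raw.index(OPEN), raw.index(CLOSE)
        if end < start:
            return REFUSAL
        visible = raw[:start] + raw[end + len(CLOSE):]
    visible = visible.strip()
    # Residue check: any leftover angle bracket or jargon term means the cut is not trusted.
    if not visible or "<" in visible or ">" in visible or JARGON.search(visible):
        return REFUSAL
    return visible


ok = strip_fail_closed("<technical_proof>ctl 5.7</technical_proof>\nYou've been building steadily.")
unclosed = strip_fail_closed("<technical_proof>ctl 5.7\nYou've been building steadily.")
leaked = strip_fail_closed("<technical_proof>ctl 5.7</technical_proof>\nYour ctl is climbing.")
print(ok)                   # You've been building steadily.
print(unclosed == REFUSAL)  # True
print(leaked == REFUSAL)    # True
```

The design choice is that the stripper never tries to repair output: any ambiguity yields the refusal rather than a best-effort cut. With the structured form from constraint 3, most of these checks would move to schema validation instead.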

Plan (order matters — spec first)

  1. Spec first. In doc 50, reframe the coach-prompt rules "don't lead with a number / never write codes" as "numbers + reasoning go INSIDE the technical layer; the visible layer is warm prose". Amend the grounding/voice requirements (GROUND-R2/R3, VOICE-R2, STATUS-R1) to state how the technical layer is read and how the visible layer remains fully grounded and fail-closed-stripped. The graders/checkers must be spec'd to read the tags and to know why they exist.
  2. Then code to the amended spec: compose emits two layers; grounder verifies the visible layer's numbers against canonical using the technical layer as the candidate-claim source; a fail-closed stripper removes the technical layer at the API boundary.
  3. Fold in #85, "Fully-grounded, gap-free, cited coach answer is non-deterministically stamped 'degraded' (residual of #45)": with the new contract, terminal_status couples to real grounded substance (visible, verified numbers), and the eval grounder fixtures become realistic (survivors only when claims actually ground), so the false-degraded disappears naturally.
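Step 2's grounding rule (visible numbers are verified against canonical, with the technical layer serving only as the candidate-claim source) might look roughly like this. The `Claim` shape, the `load` metric, and `ungrounded_numbers` are hypothetical names for the sketch, not the engine's API.

```python
import re
from dataclasses import dataclass

NUMBER = re.compile(r"\d+(?:\.\d+)?")


@dataclass(frozen=True)
class Claim:
    """One candidate claim taken from the technical layer (hypothetical shape)."""
    metric: str
    value: float


def ungrounded_numbers(visible: str, claims: list[Claim],
                       canonical: dict[str, float], tol: float = 0.05) -> list[float]:
    """Return visible numbers that no canonical-verified claim supports."""
    # A claim only counts once canonical data confirms it; the proof cannot vouch for itself.
    verified = [
        c.value for c in claims
        if c.metric in canonical and abs(canonical[c.metric] - c.value) <= tol
    ]
    numbers = [float(n) for n in NUMBER.findall(visible)]
    return [n for n in numbers if not any(abs(n - v) <= tol for v in verified)]


canonical = {"load": 42.0}  # hypothetical metric and value

# Proof says 42, visible says 48: a grounding failure.
print(ungrounded_numbers("Your load is around 48.", [Claim("load", 42.0)], canonical))  # [48.0]
# Proof and visible agree with canonical.
print(ungrounded_numbers("Your load is around 42.", [Claim("load", 42.0)], canonical))  # []
# Proof itself is wrong: it cannot vouch for the visible 48.
print(ungrounded_numbers("Your load is around 48.", [Claim("load", 48.0)], canonical))  # [48.0]
```

An empty result is the condition under which a status check could treat the visible layer as fully grounded; how that maps onto terminal_status is left to the spec.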

Acceptance

Raised by Opus 4.8 from the live onboarding loop; blocks #85.
