The agent models the athlete as {record + goal}, not as a human: no salient life-state model and no closed observe→adapt loop (absence & deviation are invisible) — so it optimizes a perfect athlete toward a perfect goal instead of coaching the real, imperfect one #79
The thesis (agentic behavior + grounding)
Opened as the companion to #77 at the owner's request (see the discussion thread). #77 is the safety-shaped corner of this larger, deliberately broader issue.
"The goal of this project is not to be a mechanical predictor of next training based on numbers but to be an agent who acts very close to a real human coach … We support the goals of an athlete but he is a human being … He might disappear for days and we do not judge, we help to get on track. He might do something we did not recommend … that's fine, we adapt and keep helping." — project owner
The code does not yet support this, and the gap is architectural, not a missing feature. To the reasoning core, the athlete is {canonical metric record} + {typed active goals}. The human (personality, life-phase, real availability, motivation, and above all what they actually did this week) has no first-class representation, and there is no loop that adapts to it. The result is an open-loop optimizer: it computes a correct trajectory for a perfect athlete toward a perfect goal, and never closes the loop on the real, imperfect, busy human in front of it. Two primitives are missing.
Primitive A — there is no salient, durable model of the athlete-as-human
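What the system holds today is MemoryItem prose (MemoryItemKind = goal / constraint / load_response / preference / language / plan_history). It carries no structure for life-phase, real weekly availability, motivation, stress, or adherence tendency, and (per #77) it is non-salient in recall (keyword+recency, limit=8, no importance term) and explicitly non-binding: rendered as "personalization DATA the agent considers, NEVER instructions it obeys" (graph_state.py:357). So even the prose we have is forgettable seasoning, not a model.
MemoryItem has no lifecycle either (columns: kind / content / inferred / timestamps; no status, no expiry), so the system cannot represent "this constraint is active / lifted / expired," which the human model needs.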
Primitive B — there is no closed observe → adapt loop; absence and deviation are invisible
Absence is a data-quality question, never a human one. "gap" in the codebase means only a coverage gap, i.e. a missing computation (_gap, capabilities.py:228). A behavioral gap (the human stopped for ten days) has no representation. The readiness path treats a stale record as an MNAR / "is the connector broken?" disambiguation (engine_readiness.py, READINESS_MAX_STALENESS_DAYS). That is the README's "don't mistake a broken sync for rest," which is correct but stops there. Nothing asks "life happened; how do we re-enter gently?" The "proactive" engine is a one-screen status briefing (engine_proactive.py), not a re-engagement nudge.
Deviation is invisible. A plan_day is an IMMUTABLE prescription (GBO-R31 ORM guards, planning.py). The one adaptation primitive — schedule_adjustment (AdjustmentType = MOVE/SWAP_WORKOUT/SHORTEN/LENGTHEN/SKIP/REST, origin ATHLETE/AGENT, status PROPOSED/APPLIED) — is prospective and intentional: it edits the plan forward. There is no link from a plan_day to the Activity that actually fulfilled (or failed to fulfill) it, and no adherence/completion state derived from reality. The system has a steering wheel but no eyes on the road: it can propose a SKIP, but nothing observes that the athlete already skipped Tue/Wed/Thu and re-grounds from where they actually are.
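Both acwr and acwr_status are persisted on daily_wellness (wellness.py:126) yet never consulted by the agent (also noted in #25); HRV is persisted and surfaced descriptively but never used to adjust a prescription. The readiness ingredients exist; the loop that would feed them back into adaptation does not.
The voice layer is presentation-only (verbosity / number-cap / leads-with-state, voice.py): there is no model of encouragement or non-judgment, so "we adapt and keep helping, as human beings not machines" currently has nowhere to live in the architecture even if the data supported it.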
Why this is grounded, not soft
The temptation is to dismiss "be more human" as un-groundable mood. It is not — the content of human-centered coaching is as well-established and citable as the PMC math the project already trusts:
What human state matters, and the non-judgmental stance: Self-Determination Theory (Deci & Ryan 2000 — autonomy / competence / relatedness as the substrate of durable adherence); Motivational Interviewing (Miller & Rollnick 2013 — the canonical framework for non-judgmental, athlete-led behavior change: exactly "we don't judge, we help get back on track"); the Transtheoretical / stages-of-change model (Prochaska & DiClemente 1983 — "meet the athlete where they are": a hobbyist with a job is in a different stage than a peaking pro, and the same numbers should yield different coaching).
How to adapt the plan to the real week: autoregulation / flexible periodization (Helms et al. 2016, RIR-based RPE, Strength Cond J 38(4); Kraemer & Fleck, flexible non-linear periodization) — fitting the plan to the human rather than the human to the plan. Most pointedly, HRV-guided training (Vesterinen et al. 2016, Scand J Med Sci Sports; Javaloyes et al. 2019, IJSPP) shows daily-readiness-driven plan adjustment outperforms a fixed predetermined plan — and the project already computes HRV and ACWR, so the evidence base and the inputs are both in hand; only the loop is absent.
Proposed direction (no code in this issue)
Two primitives, each shippable in slices that stand on their own:
A1 — A structured, always-resident athlete-state model. Promote constraints + active goals into a MemGPT-style core tier present every run (this is exactly #77's Layer 0). Then extend the model beyond constraints to typed life-context fields a coach holds (real weekly availability, current life-phase / stages-of-change, motivation drivers, observed load-response), with a lifecycle (active / lifted / expired) MemoryItem does not have today.
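To make the A1 lifecycle concrete, here is a minimal, purely illustrative sketch. It is not part of the proposal and not the project's schema: `AthleteStateItem`, `Lifecycle`, and `core_tier` are hypothetical names, and the rule that an item expires the day after `expires_on` is an assumption made for the example.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Lifecycle(Enum):
    ACTIVE = "active"
    LIFTED = "lifted"    # explicitly removed, e.g. an injury has healed
    EXPIRED = "expired"  # passed its expiry date without renewal


@dataclass(frozen=True)
class AthleteStateItem:
    """Hypothetical typed life-context entry (not the real MemoryItem)."""
    kind: str                          # e.g. "constraint", "weekly_availability"
    content: str
    lifted: bool = False
    expires_on: Optional[date] = None  # None means "no expiry"

    def lifecycle(self, today: date) -> Lifecycle:
        # Assumption: an explicit lift wins over expiry.
        if self.lifted:
            return Lifecycle.LIFTED
        if self.expires_on is not None and today > self.expires_on:
            return Lifecycle.EXPIRED
        return Lifecycle.ACTIVE


def core_tier(items: list[AthleteStateItem], today: date) -> list[AthleteStateItem]:
    """Items that would be resident on every run: those still active today."""
    return [i for i in items if i.lifecycle(today) is Lifecycle.ACTIVE]
```

The point of the sketch is only that status becomes derivable from the record, so "active / lifted / expired" no longer depends on keyword recall.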
B1 — Plan-vs-actual reconciliation (the smallest real step to closed-loop). A deterministic, fail-closed comparison of each prescribed plan_day against logged Activity records → a typed adherence state (done / modified / skipped / absent), exposed as a new capability/citation kind. This is the single change that turns the open loop closed; it is descriptive (verifiable against the record, so it fits the existing grounding model cleanly), and it becomes a first-class input to the next plan and to the readiness narration.
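One way the B1 reconciliation could look, as an illustrative sketch only. `PlanDay`, `Activity`, `Adherence`, and `reconcile` are hypothetical stand-ins, not the project's ORM models. The matching rules are assumptions: same sport within a duration tolerance counts as done, any other same-day activity as modified, no same-day activity but training nearby as skipped, and nothing nearby as absent. Fail-closed is modeled here as returning `None` for a day the synced record does not yet cover.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Adherence(Enum):
    DONE = "done"
    MODIFIED = "modified"
    SKIPPED = "skipped"
    ABSENT = "absent"


@dataclass(frozen=True)
class PlanDay:
    day: date
    sport: str
    minutes: int


@dataclass(frozen=True)
class Activity:
    day: date
    sport: str
    minutes: int


def reconcile(
    plan_day: PlanDay,
    activities: list[Activity],
    synced_through: date,
    tolerance: float = 0.2,        # assumed: 20% duration tolerance
    absence_window_days: int = 3,  # assumed: how far "nearby" reaches
) -> Optional[Adherence]:
    # Fail closed: no verdict for a day the record does not cover yet.
    if plan_day.day > synced_through:
        return None
    same_day = [a for a in activities if a.day == plan_day.day]
    for a in same_day:
        close_enough = abs(a.minutes - plan_day.minutes) <= tolerance * plan_day.minutes
        if a.sport == plan_day.sport and close_enough:
            return Adherence.DONE
    if same_day:
        return Adherence.MODIFIED  # trained, but not as prescribed
    window = timedelta(days=absence_window_days)
    nearby = any(abs(a.day - plan_day.day) <= window for a in activities)
    return Adherence.SKIPPED if nearby else Adherence.ABSENT
```

Because every verdict is derived from logged records, each one can be checked against the record in the same way as an existing descriptive citation.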
B2 — Absence-as-signal and autoregulated re-entry. When reconciliation shows a behavioral gap, branch to a detraining-aware, non-judgmental re-entry instead of a stale-data flag; consult the already-persisted ACWR/HRV to bound the re-ramp. (Connects to #25's safety envelope.)
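As a rough illustration of bounding the re-ramp, the sketch below uses only the definition of ACWR (acute load divided by chronic load). The function name and all three parameters are hypothetical; no threshold is taken from the codebase, so the caller supplies the ceiling and the floor. HRV is left out to keep the example small.

```python
def reentry_week_cap(
    chronic_weekly_load: float,
    acwr_ceiling: float,
    starter_load: float,
) -> float:
    """Upper bound on the first week's load after a behavioral gap.

    Since ACWR = acute / chronic, keeping ACWR at or below `acwr_ceiling`
    means acute <= chronic * acwr_ceiling. After a long absence the chronic
    load decays toward zero, so `starter_load` (an assumed floor) keeps the
    cap from collapsing to nothing.
    """
    if chronic_weekly_load < 0 or starter_load < 0 or acwr_ceiling <= 0:
        raise ValueError("loads must be non-negative and the ceiling positive")
    return max(chronic_weekly_load * acwr_ceiling, starter_load)
```

For example, `reentry_week_cap(200.0, 1.25, 60.0)` returns `250.0`, while a fully detrained athlete with a chronic load of `0.0` gets the floor of `60.0`.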
C — A coaching-voice contract. Encode the MI/SDT non-judgmental, autonomy-supportive stance as a deterministic voice property (sibling to leads-with-state / number-cap), so the human stance is enforced, not hoped for.
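A deterministic voice property could be as simple as a pattern check run over the outgoing message. The sketch below is an assumption-laden illustration: `voice_violations` and the pattern list are invented for this example and are not drawn from voice.py or from the MI/SDT literature, and a real contract would need a curated, reviewed list.

```python
import re

# Hypothetical examples of judgmental phrasing; placeholders only.
JUDGMENTAL_PATTERNS = (
    r"\byou failed\b",
    r"\byou should have\b",
    r"\bno excuses\b",
    r"\bfell behind\b",
)


def voice_violations(message: str, patterns=JUDGMENTAL_PATTERNS) -> list[str]:
    """Return every pattern the message matches; an empty list means it passes."""
    return [p for p in patterns if re.search(p, message, flags=re.IGNORECASE)]
```

A check like this only catches known bad phrasing. It cannot show that a message is autonomy-supportive, which is why the behavioral goldens below are still needed.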
Eval — the behavioral goldens the current suites structurally cannot express: an absence scenario (gap → gentle re-entry, never a guilt-trip or a stale-data shrug); a deviation scenario (athlete did X instead of the prescribed Y → plan adapts from reality, goal preserved); a stages-of-change scenario (identical metrics, hobbyist vs pro context → different coaching); a longitudinal scenario (the relationship gets better, not worse, with length). Every existing golden is single-turn and metric-shaped; none asserts a human behavior.
Relationship to the existing record
No PR is attached, by design: this issue is the deliverable, the broad architectural framing the owner asked to capture separately from #77. Checked for overlap against #10/#12/#17/#25/#39/#47/#75, ADRs 0001-0007, and .review/*_findings.md: the athlete-state model and the plan-vs-actual / absence-as-signal loop appear in none of them.