Synthesize Training Intensity Distribution (TID): time-in-zone + Polarization Index — the engine knows how MUCH you train but is blind to the easy/hard MIX, the single most actionable lever in endurance science #76

@bepcyc

The one-line gap

WattWise measures how much you train (TSS, CTL/ATL/TSB) and how hard a single session was (intensity_class, IF, hr_load_zonal). It has no number for how your training time is split across easy / moderate / hard over a week or a block. That split — the Training Intensity Distribution (TID) — is the most-studied and most-coachable lever in endurance science, and right now the coach literally cannot answer the most common coaching question on earth: "Is my easy/hard balance right?"

Why this is the critical missing metric (not just another nice-to-have)

Training load is intensity x duration collapsed into one scalar. That collapse is the whole point of TSS — and also its blind spot. Two athletes with an identical CTL of 70 can be on opposite ends of adaptation and injury risk:

  • Athlete A: 80% easy aerobic, 20% hard intervals — polarized, the distribution that repeatedly wins for VO2max and highly-trained performance.
  • Athlete B: most sessions in the moderate "grey zone" just under threshold — the single most common mistake in amateur endurance training ("black-hole" / junk-miles training): too hard to recover from, too easy to drive top-end adaptation.

Same load. Same PMC chart. Same form. Completely different coaching verdict. Today WattWise renders A and B as indistinguishable. The PMC tells you the size of the stimulus; TID tells you its quality — and quality is where the real advice lives.
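A minimal numeric sketch of that collapse, assuming the common Coggan-style formula TSS = hours × IF² × 100. Whether WattWise computes TSS exactly this way is an assumption here, and the two sessions are made-up numbers chosen only for illustration:

```python
def tss(hours: float, intensity_factor: float) -> float:
    """Coggan-style Training Stress Score: hours * IF^2 * 100 (assumed formula)."""
    return hours * intensity_factor ** 2 * 100


# Hypothetical sessions, picked so that the load scalar comes out the same.
easy_long = tss(hours=2.0, intensity_factor=0.60)     # long, easy ride
grey_short = tss(hours=1.125, intensity_factor=0.80)  # shorter, moderate ride

print(round(easy_long, 1), round(grey_short, 1))  # 72.0 72.0
```

Both sessions feed the same value into CTL, yet they would typically land in different intensity zones. Only a time-in-zone view can tell them apart.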

This also unlocks forecasting that works. The research is now specific about which distribution produces which adaptation: polarized gives the best VO2max gains, especially in shorter blocks and trained athletes; pyramidal is most effective for many runners and base phases; and roughly 75–80% low-intensity plus 15–20% high-intensity is the productive band. With TID measured, the coach can say "to peak your VO2max for the event in 8 weeks, your current threshold-heavy mix should shift polarized", which is a real, testable prescription rather than a description of past load.

The metric set (one logical bundle)

A rolling-window distribution, computed per sport, anchored to the athlete's existing thresholds:

  1. Time-in-zone fractions over a window (7/28-day, and per block) using the standard 3-zone physiological model:
    • Z1 — below the first threshold (LT1/VT1 / aerobic threshold)
    • Z2 — between the two thresholds ("threshold / grey zone")
    • Z3 — above the second threshold (LT2/VT2 / FTP / CP)
  2. polarization_index: Treff et al. (2019) define PI = log10( (Z1 / Z2) × Z3 × 100 ), with Z1, Z2, Z3 expressed as fractions of total training time (0.80, not 80). A TID is polarized when PI > 2.00; higher = more polarized. For example, a 0.80 / 0.05 / 0.15 split gives PI = log10(240) ≈ 2.38. One honest number that integrates all three zones.
  3. lit_hit_ratio: the low-intensity vs high-intensity split (the "80/20" coaches actually talk about).
  4. tid_archetype: a typed classification (polarized / pyramidal / threshold / grey_zone), with the math, not a guess. (Polarized condition, Treff: 0 ≤ Z2 < Z3 < Z1.)
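
The bundle above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the proposed implementation: the function names are hypothetical, samples are assumed evenly spaced so counts stand in for time, the Z2 floor used to avoid division by zero is a choice made for this sketch, and the pyramidal (Z1 > Z2 > Z3) and threshold (Z2 largest) rules are the commonly used orderings rather than anything the issue specifies. lit_hit_ratio and grey_zone are left out because the issue does not pin down their exact definitions.

```python
import math
from typing import Optional, Sequence, Tuple

Fractions = Tuple[float, float, float]


def zone_fractions(samples: Sequence[float], lt1: float, lt2: float) -> Fractions:
    """Share of samples in Z1 (< lt1), Z2 (lt1..lt2) and Z3 (> lt2).

    Assumes evenly spaced samples (e.g. 1 Hz), so counts stand in for time.
    """
    if not samples or not 0 < lt1 < lt2:
        raise ValueError("need samples and anchors with 0 < lt1 < lt2")
    counts = [0, 0, 0]
    for value in samples:
        if value < lt1:
            counts[0] += 1
        elif value <= lt2:
            counts[1] += 1
        else:
            counts[2] += 1
    n = len(samples)
    return (counts[0] / n, counts[1] / n, counts[2] / n)


def polarization_index(z1: float, z2: float, z3: float,
                       z2_floor: float = 0.01) -> Optional[float]:
    """PI = log10((Z1 / Z2) * Z3 * 100) on fractions; None when undefined.

    Assumption of this sketch: Z2 is floored at z2_floor to avoid dividing
    by zero, and a zero Z1 or Z3 yields None instead of a number.
    """
    if z1 <= 0 or z3 <= 0:
        return None
    return math.log10(z1 / max(z2, z2_floor) * z3 * 100)


def tid_archetype(z1: float, z2: float, z3: float) -> str:
    """Classify a distribution; anything that fits no rule is 'unclassified'."""
    pi = polarization_index(z1, z2, z3)
    if z1 > z3 > z2 and pi is not None and pi > 2.0:
        return "polarized"
    if z1 > z2 > z3:
        return "pyramidal"
    if z2 > z1 and z2 > z3:
        return "threshold"
    return "unclassified"


print(tid_archetype(0.80, 0.05, 0.15))  # polarized
print(tid_archetype(0.50, 0.40, 0.10))  # pyramidal
```

Requiring both the zone ordering and PI > 2.00 for "polarized" is a deliberately strict reading: a 0.40 / 0.25 / 0.35 split satisfies the ordering but has PI ≈ 1.75, so this sketch leaves it unclassified.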

Why it is derivable from data we already store — today

This is not a new sensor; it is a synthesis of streams already in the record, and most of the machinery exists:

Fits the project's honesty contract

  • No anchors, no number. TID requires zone anchors (CP/FTP, or LTHR/threshold-HR, or threshold pace). When the athlete has none, return Unavailable with the typed reason — never a fabricated split.
  • Fidelity tag. A power-derived TID is raw_stream; an HR-derived one is lower fidelity; an RPE-derived one is SUBSTITUTED. Same fidelity ladder the rest of the engine already uses.
  • Groundable & citable. Each fraction / index is a deterministic function of (streams, zone anchors), so it earns a normal citation — the grounder can verify a coach claim like "68% of last month was Z1" against the record, instead of scrubbing it as an unknown metric.
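
One way the fail-closed envelope and fidelity ladder could look, as a hedged sketch. Every name here (`Fidelity`, `Unavailable`, `TidFractions`, `resolve_tid`, the `"no_zone_anchors"` reason string, the `"lower_fidelity"` label) is hypothetical and not taken from the WattWise codebase. The input is assumed to contain fractions only for sources whose zone anchors allowed a computation:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple, Union


class Fidelity(Enum):
    RAW_STREAM = "raw_stream"    # power-derived
    LOWER = "lower_fidelity"     # HR-derived (placeholder label)
    SUBSTITUTED = "substituted"  # RPE-derived


@dataclass(frozen=True)
class Unavailable:
    reason: str  # typed reason instead of a fabricated split


@dataclass(frozen=True)
class TidFractions:
    z1: float
    z2: float
    z3: float
    fidelity: Fidelity


# Best source first; the first one with data wins.
_LADDER = (
    ("power", Fidelity.RAW_STREAM),
    ("hr", Fidelity.LOWER),
    ("rpe", Fidelity.SUBSTITUTED),
)


def resolve_tid(
    fractions_by_source: Dict[str, Tuple[float, float, float]],
) -> Union[TidFractions, Unavailable]:
    """Pick the highest-fidelity distribution, or fail closed when there is none."""
    for source, fidelity in _LADDER:
        if source in fractions_by_source:
            z1, z2, z3 = fractions_by_source[source]
            return TidFractions(z1, z2, z3, fidelity)
    return Unavailable(reason="no_zone_anchors")
```

Returning a distinct `Unavailable` type rather than zeros or `None` means a caller has to handle the missing case explicitly, which is the behaviour the contract above asks for.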

Distinct from the existing record (checked)

No prior open or closed issue proposes a time-in-zone distribution, a polarization index, or a TID archetype.

Proposed slice (no code in this issue)

  1. AnalyticsService.intensity_distribution(athlete, sport, window) -> time-in-zone fractions + polarization_index + lit_hit_ratio + tid_archetype, same purity / fail-closed envelope as siblings, reusing the hr_load_zonal binning and the signature's zone anchors.
  2. MetricName members + capability resolvers + metric aliases ("intensity distribution", "easy/hard split", "polarization", "80/20") so retrieval can name them and grounding can cite them.
  3. docs/METRICS.md entries (definition, formula, honest ranges with sources, when Unavailable).
  4. Golden + property tests at the service seam (a known polarized vs grey-zone fixture must classify correctly; PI math must match Treff worked examples; fail-closed without anchors).

Selected sources


@bepcyc — flagging this as the highest-leverage metric gap I can find, and I'd value your read before anyone builds it. Two calls I'd want your opinion on: (a) scope — ship the 3-zone TID + PI first and treat per-session "black-hole" detection as a follow-up, or build both together? (b) zone model — anchor zones to CP/FTP and threshold-HR we already store (pragmatic, available today), or hold out for a proper dual-threshold (LT1/LT2) model where data allows? (Audit + proposal only — no code changed for this issue.)
