Syllable-calibrated runtime estimator #181

@stultus

What

Replace the current page-equals-minute runtime heuristic with a syllable-calibrated estimator that uses per-language speech rates.

Why this matters

The page-equals-minute rule is an Anglo-Hollywood convention (one Courier-formatted page of English screenplay ≈ 60 seconds of screen time). It can under- or over-estimate Malayalam runtimes because the syllable density per visual line in Malayalam differs from that of English.

A syllable-calibrated estimator should give a more accurate prediction:

  • Count syllables per dialogue line via mlphon.
  • Apply per-language base rate (Malayalam ≈ 5.5–6.5 syl/sec for normal delivery; English ≈ 4–5 syl/sec).
  • Sum syllables across dialogue, divide by rate → dialogue runtime.
  • Add action-line runtime via existing page-eighths math (the page-equals-minute rule holds reasonably well for action).
  • Total = dialogue runtime + action runtime.

Output: "117 minutes — 49 min dialogue at 5.8 syl/sec, 68 min action at standard pacing."
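A minimal sketch of the arithmetic in the steps above, assuming action time is tracked in page eighths (one page = 8 eighths = 60 seconds, so 7.5 seconds per eighth). The names `RuntimeEstimate`, `estimate_runtime`, and `summary` are hypothetical and not taken from the codebase, and the summary string only approximates the example output format.

```rust
/// Hypothetical breakdown of a runtime estimate (names are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct RuntimeEstimate {
    pub dialogue_secs: f64,
    pub action_secs: f64,
    pub syl_per_sec: f64,
}

/// Assumption: page-equals-minute pacing, so one page eighth is 7.5 seconds.
const SECS_PER_PAGE_EIGHTH: f64 = 60.0 / 8.0;

/// Dialogue runtime = syllables / rate; action runtime = eighths * 7.5 s.
/// Returns None for a rate that is zero, negative, or not finite.
pub fn estimate_runtime(
    dialogue_syllables: u64,
    action_page_eighths: u32,
    syl_per_sec: f64,
) -> Option<RuntimeEstimate> {
    if !(syl_per_sec.is_finite() && syl_per_sec > 0.0) {
        return None;
    }
    Some(RuntimeEstimate {
        dialogue_secs: dialogue_syllables as f64 / syl_per_sec,
        action_secs: f64::from(action_page_eighths) * SECS_PER_PAGE_EIGHTH,
        syl_per_sec,
    })
}

impl RuntimeEstimate {
    /// Total = dialogue runtime + action runtime, in minutes.
    pub fn total_minutes(&self) -> f64 {
        (self.dialogue_secs + self.action_secs) / 60.0
    }

    /// Human-readable summary, rounded to whole minutes.
    pub fn summary(&self) -> String {
        format!(
            "{:.0} minutes: {:.0} min dialogue at {:.1} syl/sec, {:.0} min action at standard pacing.",
            self.total_minutes(),
            self.dialogue_secs / 60.0,
            self.syl_per_sec,
            self.action_secs / 60.0,
        )
    }
}
```

For instance, 17,052 dialogue syllables at 5.8 syl/sec come to 49 minutes, and 544 page eighths (68 pages) of action come to 68 minutes, for a total of 117.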

Why not Tier 2 #5 (naturality scoring)

Skipped per the planning discussion — the Malayalam Speech Corpus isn't dense enough yet to derive per-genre rates with confidence. This issue uses a literature-derived constant (5.8 syl/sec for spoken Malayalam, citation needed in the doc) as the v1 default. Per-character calibration via the Statistics tab can come later if the corpus grows.

Dependency

mlphon's syllable counter. A simpler v1 alternative is a self-contained Rust syllabifier that follows Malayalam syllable rules (consonant + dependent vowel = one syllable; consonant cluster + vowel = one syllable; and so on), at roughly a few hundred lines.
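As a rough illustration of the self-contained option, the sketch below counts syllable nuclei by scanning Unicode code points: every independent vowel counts once, and every consonant counts once unless a virama (chandrakkala, U+0D4D) follows it, which folds it into a cluster. This is a deliberately simplified heuristic, not mlphon's algorithm. It ignores word-final chandrakkala (samvruthokaram), old-style chillu sequences that use ZWJ, and all non-Malayalam text, all of which a real implementation would need to handle. The Rust-style name `syllable_count` is an assumption.

```rust
/// Malayalam virama (chandrakkala): suppresses the inherent vowel.
const VIRAMA: char = '\u{0D4D}';

/// Independent vowels U+0D05..=U+0D14 plus vocalic RR and LL.
/// The range includes two unassigned code points, which is harmless here.
fn is_independent_vowel(c: char) -> bool {
    matches!(c, '\u{0D05}'..='\u{0D14}' | '\u{0D60}' | '\u{0D61}')
}

/// Base consonants U+0D15..=U+0D3A.
fn is_consonant(c: char) -> bool {
    matches!(c, '\u{0D15}'..='\u{0D3A}')
}

/// Simplified nucleus count: one per independent vowel, one per consonant
/// that is not immediately followed by a virama. Vowel signs, anusvara,
/// visarga, and chillu letters never add a syllable on their own.
pub fn syllable_count(text: &str) -> usize {
    let mut count = 0;
    let mut chars = text.chars().peekable();
    while let Some(c) = chars.next() {
        if is_independent_vowel(c) {
            count += 1;
        } else if is_consonant(c) && chars.peek() != Some(&VIRAMA) {
            count += 1;
        }
    }
    count
}
```

Under these rules അമ്മ (a-mma) counts as 2 and സിനിമ (si-ni-ma) as 3. English dialogue returns 0, so mixed-language scripts would need a separate counter for Latin-script text.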

Technical sketch

  • Add a syllableCount(text: string) Rust helper.
  • Walk dialogue blocks in computeStats, sum syllables.
  • Configurable rate constant via Settings → Writing.
  • Surface the result in the Overview tab as the new "Estimated screen time", replacing the pageCount-as-minute figure.
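The walk over dialogue blocks might look something like the sketch below. The `Block` enum, `Totals`, `WritingSettings`, and `collect_totals` are all hypothetical stand-ins, since the real document model and the shape of computeStats are not shown here. The syllable counter is passed in as a closure so that mlphon or a built-in syllabifier could be swapped in.

```rust
/// Hypothetical screenplay block; the real document model may differ.
pub enum Block {
    Dialogue(String),
    Action { page_eighths: u32 },
}

/// Raw totals gathered in one pass over the script.
#[derive(Debug, Default, PartialEq)]
pub struct Totals {
    pub dialogue_syllables: u64,
    pub action_page_eighths: u32,
}

/// Hypothetical Settings → Writing entry holding the speech rate.
#[derive(Debug, Clone, PartialEq)]
pub struct WritingSettings {
    pub syl_per_sec: f64,
}

impl Default for WritingSettings {
    fn default() -> Self {
        // v1 default proposed in this issue for spoken Malayalam.
        Self { syl_per_sec: 5.8 }
    }
}

/// Sums syllables over dialogue blocks and page eighths over action blocks.
pub fn collect_totals(blocks: &[Block], count_syllables: impl Fn(&str) -> usize) -> Totals {
    let mut totals = Totals::default();
    for block in blocks {
        match block {
            Block::Dialogue(text) => {
                totals.dialogue_syllables += count_syllables(text) as u64;
            }
            Block::Action { page_eighths } => {
                totals.action_page_eighths += *page_eighths;
            }
        }
    }
    totals
}
```

Keeping the totals separate from the rate means a change to the setting only requires redoing the division, not recounting syllables.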

Out of scope

Per-character speech rate tuning, dialect adjustments, accent modeling — all are Tier 2 follow-ups.


Labels: enhancement
