Add a length/token-limit failure_mode (finish_reason == "length") to separate overthink-truncation from verifier_fail #61

@noonghunna

Problem

A model that overthinks or gets stuck in a repetition loop until it hits max_tokens currently scores as a plain verifier_fail, indistinguishable from a genuinely wrong answer. There is no finish_reason == "length" detection in the runner today, so "the model ran out of budget mid-thought" and "the model produced a confidently wrong answer" land in the same bucket.

This surfaced in club-3090 discussions/241: a known failure mode of Qwen3.6-35B-A3B is that it "tends to get into a loop or overthink," and one quant's run got stuck + took ~2× the runtime of its siblings — but the breakdown couldn't tell us whether that was the model looping (truncated at the token cap) or the harness hanging.

Proposal

  • Detect finish_reason == "length" (and optionally a cheap repetition heuristic) and classify it as a distinct failure_mode — e.g. token_limit / output_truncated — instead of folding it into verifier_fail.
  • Add it to the canonical failure-mode list (benchlocal_cli/types.py), to the end-of-run Failure breakdown: output, and to the saved JSON, so it shows up in inspect --mode token_limit.
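
A minimal sketch of the classification step, to make the proposal concrete. It assumes the runner can see the finish_reason string returned by an OpenAI-compatible completion endpoint and already knows whether the verifier passed. The function name classify_completion and the constants are hypothetical; only the mode names token_limit and verifier_fail come from this issue, and the real definitions in benchlocal_cli/types.py may look different.

```python
from typing import Optional

# Hypothetical constants: the mode names come from this issue, but the
# actual definitions in benchlocal_cli/types.py may be shaped differently.
TOKEN_LIMIT = "token_limit"
VERIFIER_FAIL = "verifier_fail"


def classify_completion(finish_reason: Optional[str], verifier_passed: bool) -> Optional[str]:
    """Return a failure_mode string, or None if the scenario passed."""
    if verifier_passed:
        # Assumption: a truncated answer that still verifies is not a failure.
        return None
    # Check truncation before falling back to the generic verifier bucket,
    # so "ran out of budget" is never reported as "answered wrong".
    if finish_reason == "length":
        return TOKEN_LIMIT
    return VERIFIER_FAIL
```

Checking truncation first is the whole point of the change: a response cut off at the cap almost always fails verification too, so the order of the two checks decides which bucket it lands in.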

Why it matters

Per-scenario latency_seconds already flags the slow outliers, but a first-class failure mode makes looped vs. hung vs. genuinely wrong legible at a glance:

  • token_limit → model truncated (raise the budget, or it's looping)
  • agent_runner_timeout / agent_runner_crashed → sandboxed-agent path (already exists)
  • verifier_fail → ran to completion, answer wrong

Notes

  • The agentic packs already separate agent_runner_timeout; this is the gap on the non-agentic completion path.
  • Pairs naturally with the per-scenario tokens already captured (tokens_completion) — a near-max_tokens completion + finish_reason == "length" is the strong signal.
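
The strong signal described in the notes, plus one possible "cheap repetition heuristic", could be sketched as follows. Both helpers are hypothetical and not part of the current runner: the 2% slack, the 8-word n-gram size, and the 400-word tail are illustrative assumptions that would need tuning against real runs.

```python
def strong_truncation_signal(finish_reason, tokens_completion, max_tokens, slack=0.02):
    """True when generation stopped on the cap AND used nearly the whole budget.

    `slack` is an assumed tolerance for small token-count differences.
    """
    if finish_reason != "length":
        return False
    return tokens_completion >= max_tokens * (1.0 - slack)


def tail_repetition_ratio(text, ngram=8, tail_words=400):
    """Fraction of word n-grams in the tail of `text` that are repeats (0.0 to 1.0).

    A value near 1.0 suggests the model was looping; near 0.0 suggests it was
    still producing new text when it was cut off.
    """
    words = text.split()[-tail_words:]
    if len(words) < ngram * 2:
        return 0.0  # too short to judge
    grams = [tuple(words[i:i + ngram]) for i in range(len(words) - ngram + 1)]
    return 1.0 - len(set(grams)) / len(grams)
```

Reporting the repetition ratio alongside token_limit, rather than as a separate failure mode, would let a reader tell "looping" apart from "needs a bigger budget" without adding another class.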

Reported by @laurimyllari in club-3090 discussions/241.
