Rename --reasoning → --reasoning-packs (disambiguate pack-set vs thinking-mode) #65

@noonghunna

Problem

The run subcommand has two conceptually-different axes that currently share confusable naming:

  • Pack-set selectors (the what): --quick / --medium / --full / --reasoning
  • Thinking-mode flags (the how): --enable-thinking / --no-thinking

--reasoning is a pack-set selector (it runs the reasoning-heavy suite: HE+, LCB v6, GPQA, GSM-Symbolic), but its name reads like a mode ("evaluate with reasoning"), and it sits right next to --enable-thinking/--no-thinking. In practice this is genuinely ambiguous: when you want "with-reasoning vs without-reasoning" you actually want --full --enable-thinking vs --full --no-thinking, not --reasoning. The shared word makes that easy to get wrong.

Proposal

Rename the flag to make it clearly a pack category, parallel to --full:

  • Add --reasoning-packs as the primary flag.
  • Keep --reasoning as a hidden, deprecated alias (back-compat — anything scripted against it keeps working), with a one-line deprecation note in --help.
  • Update --help so the pack-set group reads --quick | --medium | --full | --reasoning-packs and stays visually separate from the --enable-thinking/--no-thinking mode group.

This keeps the mental model crisp: pick the packs, then pick the mode. The canonical quality eval stays --full --no-thinking + --full --enable-thinking; --reasoning-packs is the separate, deliberate deep-dive.
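
A rough sketch of how the alias and the two help groups could look, assuming the run subcommand is built on Python's argparse (the issue doesn't say what the CLI actually uses). The prog name, the pack_set / thinking dest names, the DeprecatedAlias helper, the help strings, and the mutual exclusion between pack-set flags are all made up for illustration:

```python
import argparse
import sys


class DeprecatedAlias(argparse.Action):
    """Hypothetical helper: acts like store_const, but notes the deprecation on stderr."""

    def __init__(self, option_strings, dest, **kwargs):
        # nargs=0 makes this a bare flag that consumes no value.
        super().__init__(option_strings, dest, nargs=0, **kwargs)

    def __call__(self, parser, namespace, values, option_string=None):
        print(f"warning: {option_string} is deprecated; use --reasoning-packs",
              file=sys.stderr)
        setattr(namespace, self.dest, self.const)


def build_parser():
    parser = argparse.ArgumentParser(prog="run")  # prog name is an assumption

    # Axis 1: which packs to run. Treating these as mutually exclusive is an assumption.
    packs = parser.add_argument_group("pack set (what to run)")
    pick = packs.add_mutually_exclusive_group()
    for name in ("quick", "medium", "full"):
        pick.add_argument(f"--{name}", dest="pack_set", action="store_const",
                          const=name, help=f"run the {name} pack set")
    pick.add_argument("--reasoning-packs", dest="pack_set", action="store_const",
                      const="reasoning",
                      help="run the reasoning-heavy packs "
                           "(replaces the deprecated --reasoning)")
    # Old spelling still parses, but is hidden from --help.
    pick.add_argument("--reasoning", dest="pack_set", action=DeprecatedAlias,
                      const="reasoning", help=argparse.SUPPRESS)

    # Axis 2: how to run them.
    mode = parser.add_argument_group("thinking mode (how to run)")
    toggle = mode.add_mutually_exclusive_group()
    toggle.add_argument("--enable-thinking", dest="thinking", action="store_const",
                        const=True, help="evaluate with thinking enabled")
    toggle.add_argument("--no-thinking", dest="thinking", action="store_const",
                        const=False, help="evaluate with thinking disabled")
    return parser


if __name__ == "__main__":
    # Explicit argv so the demo does not depend on how the script is invoked.
    args = build_parser().parse_args(["--full", "--no-thinking"])
    print(args.pack_set, args.thinking)
```

Both spellings write to the same dest, so downstream code wouldn't need to know which one was typed. Only the help output and the stderr note differ.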

Downstream: the quality-test.sh wrapper (in noonghunna/club-3090) forwards the same token, so mirror the rename there + its usage block.

Optional (bigger, later)

If we ever want the split to be structural rather than by-convention, make the axes explicit: --packs {quick,medium,full,reasoning} + --thinking {on,off}. That's a breaking change though — the --reasoning → --reasoning-packs alias rename gets most of the clarity for a fraction of the churn, so start there.
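
For comparison, the explicit-axes version might look roughly like this, again assuming argparse. Making both options required is an assumption (the proposal doesn't say what the defaults would be), and canonical_eval_argvs is a hypothetical helper that just spells out the canonical full-packs, both-modes pair:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="run")  # prog name is an assumption
    # One option per axis; choices mirror the proposal's value sets.
    parser.add_argument("--packs", choices=("quick", "medium", "full", "reasoning"),
                        required=True, help="which pack set to run")
    parser.add_argument("--thinking", choices=("on", "off"),
                        required=True, help="whether thinking mode is enabled")
    return parser


def canonical_eval_argvs():
    """Argument lists for the canonical quality eval: full packs, both modes."""
    return [["--packs", "full", "--thinking", mode] for mode in ("off", "on")]
```

With this shape a pack set can't be mistaken for a mode, since each value only parses under its own option.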

Context

Hit this while running a KV-cache quality A/B: reached for --medium + --sandboxed-only (which, separately, means all sandboxed packs incl. the slow HE+/LCB) when the right call was simply --full in both thinking modes. The naming nudged toward the wrong flag.

Metadata

Labels: enhancement
