feat(reasoning): per-session reasoning effort override (closes #2697) #2963
Koraji95-coder wants to merge 2 commits into
…na#2697)

Three layers:

1. Session storage: new `reasoning_effort` field on Session, persisted in the session sidecar via METADATA_FIELDS. None means inherit profile default (agent.reasoning_effort in config.yaml), which is the pre-PR behaviour.
2. Resolve-at-stream-start precedence in api/streaming.py: `session.reasoning_effort or config.agent.reasoning_effort`. When a session has its own value the broker uses it; otherwise the profile default applies. CLI/profile semantics are unchanged.
3. UI surfaces:
   - Reasoning chip dropdown gains a two-button scope picker: "Set for this session only" / "Set as profile default". The session-only path writes /api/session/reasoning, the profile path keeps writing /api/reasoning (CLI parity).
   - "Clear session override" row appears when a session has an override.
   - Chip carries a `.session-override` class -> italic label + small leading dot, signalling the chip is showing a session-scoped value (Q1 from the maintainer's reply).
   - /reasoning slash command accepts an optional `session` qualifier: `/reasoning high session` writes the session override; the plain form (`/reasoning high`) still writes the profile default.

Other behaviour:

- /api/session/duplicate carries the override forward so the copy behaves identically (Q2 from the maintainer's reply).
- Override is persisted in the session sidecar JSON, not in-memory (Q3).
- /api/session/reasoning validates against api.config.VALID_REASONING_EFFORTS and accepts null/empty to clear the override (inherit profile default).

Tests: 22 new pass in tests/test_2697_per_session_reasoning_effort.py covering Session round-trip, streaming precedence, route validation, duplicate-session carry-forward, slash-command parse, dropdown markup, CSS indicator, and the clear-override flow.

The existing test_reasoning_chip_btw_fixes.py "dropdowns grouped within 2 KB" check was bumped to 3 KB to accommodate the new scope picker / clear-override rows inside the same dropdown container; the structural invariant (sibling of composer-footer, not nested in composer-left) is unchanged.
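The storage, validation, and precedence rules in the commit message above can be sketched in a few lines of Python. This is a minimal illustration, not the PR's code: `SessionSketch`, `normalize_override`, `resolve_effort`, and the members of `VALID_EFFORTS` are hypothetical stand-ins (the real set lives in `api.config.VALID_REASONING_EFFORTS` and is not reproduced in this PR text).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for api.config.VALID_REASONING_EFFORTS; the actual
# members are an assumption made for illustration only.
VALID_EFFORTS = frozenset({"none", "minimal", "low", "medium", "high", "xhigh"})


@dataclass
class SessionSketch:
    """Toy model of a session; only the new field is represented."""
    session_id: str
    reasoning_effort: Optional[str] = None  # None = inherit profile default


def normalize_override(raw: Optional[str]) -> Optional[str]:
    """Validate a requested override; None or an empty string clears it."""
    if raw is None or not raw.strip():
        return None
    effort = raw.strip()
    if effort not in VALID_EFFORTS:
        raise ValueError(f"invalid reasoning effort: {raw!r}")
    return effort


def resolve_effort(session: SessionSketch, profile_default: Optional[str]) -> Optional[str]:
    """Session override wins; otherwise fall back to the profile default."""
    return session.reasoning_effort or profile_default
```

Under these assumptions, a session with no override resolves to the profile default, setting `"high"` on the session takes precedence, and writing an empty value returns the session to the inherited default without touching the profile.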
…asoningChip

PR nesquena#2963 / nesquena#2697 added a second `sessionOverride` argument to `_applyReasoningChip` so `tests/test_mobile_layout.py::test_reasoning_chip_updates_desktop_and_mobile_controls` could no longer locate the function body via the literal substring `"function _applyReasoningChip(eff)"`, producing `ValueError: substring not found` before any of the six invariants (`composerReasoningWrap`, `composerMobileReasoningAction`, `composerReasoningLabel`, `composerMobileReasoningLabel`, `label.textContent=text`, `mobileLabel.textContent=text`) could be checked. Switching the slice marker to `"function _applyReasoningChip("` keeps the test signature-agnostic so future arg additions don't silently bury the actual invariants the test guards. The six contracts themselves are unchanged.
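The failure mode described above is easy to reproduce in isolation. The sketch below assumes a made-up excerpt (`UI_JS`) in place of the real `static/ui.js`, and the helper names `function_body_from` and `old_marker_fails` are hypothetical; only `str.index` raising `ValueError` on a missing substring is standard Python behaviour.

```python
# Hypothetical excerpt standing in for static/ui.js after the signature change.
UI_JS = """
function _applyReasoningChip(eff, sessionOverride){
  label.textContent=text;
  mobileLabel.textContent=text;
}
"""

OLD_MARKER = "function _applyReasoningChip(eff)"  # pinned to the old signature
NEW_MARKER = "function _applyReasoningChip("      # signature-agnostic prefix


def function_body_from(source: str, marker: str) -> str:
    """Slice the source from the marker onward; str.index raises ValueError if absent."""
    return source[source.index(marker):]


def old_marker_fails(source: str) -> bool:
    """Report whether the exact-signature marker can no longer be located."""
    try:
        function_body_from(source, OLD_MARKER)
    except ValueError:
        return True
    return False
```

With the prefix marker the slice succeeds regardless of the argument list, so the assertions on `label.textContent=text` and `mobileLabel.textContent=text` are actually reached.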
Pulled the worktree and read the diff against

Session switch leaves the chip showing the previous session's override
function syncReasoningChip(){
if(_currentReasoningEffort===null){fetchReasoningChip();return;}
_applyReasoningChip(_currentReasoningEffort,_currentReasoningSessionOverride);
}
Compare to

function syncReasoningChip(){
const sid=S&&S.session&&S.session.session_id;
if(sid){
const sessionOverride=_normalizeReasoningEffort(S.session.reasoning_effort||'');
const effective=sessionOverride||_lastProfileReasoningEffort||'';
_applyReasoningChip(effective, sessionOverride||null);
return;
}
if(_currentReasoningEffort===null){fetchReasoningChip();return;}
_applyReasoningChip(_currentReasoningEffort,null);
}

This needs a small companion change: cache the profile default separately (

Scope picker default + missing session is silently wrong

In

if(_reasoningWriteScope==='session'&&sid){
api('/api/session/reasoning', ...)
}else{
api('/api/reasoning', ...)

A user who explicitly clicked "Set for this session only" with no session loaded will write their config.yaml default and only see a toast saying

CHANGELOG missing
Smaller items
The PR is well-scoped and the test suite is solid; the chip-cache bug is the one item I'd consider a blocker, since it lands a regression on the existing single-session chip behavior the moment a user has two sessions. The rest are polish.
Thinking Path
The CLI's `/reasoning <level>` writes to `agent.reasoning_effort` in the active profile's `config.yaml`. The WebUI followed the same shape: the chip and the slash command both mutated the profile default. That's the right primitive for "set my baseline" but the wrong one for "this one session needs deeper thinking, the rest don't": switching profile-wide for a single conversation forces the user to remember to switch back.

Followed the 3-layer split from the issue thread:

1. Added `reasoning_effort` to the `Session` dataclass and to `METADATA_FIELDS` so it round-trips through the sidecar JSON. `None` means "inherit profile default" (the pre-PR behaviour, preserved end-to-end).
2. In `api/streaming.py`, prefer `session.reasoning_effort` over `agent.reasoning_effort`. The same `parse_reasoning_effort()` helper parses both, so 'none'/'minimal'/.../'xhigh' route identically and unknown values silently become `None` (no behavioural drift between the two layers).
3. The `/reasoning` slash command accepts an optional `session` qualifier, and the chip carries a `.session-override` class for the visual indicator (italic label + small leading accent dot). Maintainer's Q2 answered by carrying the override forward on `/api/session/duplicate`; Q3 by writing to the session sidecar (not a memory cache).

The existing `/api/reasoning` endpoint is untouched: it still writes `config.yaml` for CLI parity. The new `/api/session/reasoning` endpoint writes only the session sidecar and validates effort against `api.config.VALID_REASONING_EFFORTS`. Null/empty effort clears the override and lets the next stream fall back to the profile default.

What Changed
Backend (Python)

- `api/models.py`: `Session.reasoning_effort: Optional[str]` (None = inherit profile default). Added to `METADATA_FIELDS` for sidecar persistence and to `compact()` so the UI can sync the chip from session state.
- `api/streaming.py`: resolve block now reads `s.reasoning_effort` first and falls back to `_cfg['agent']['reasoning_effort']` when the session has no override. Same `parse_reasoning_effort()` helper for both branches; same `_reasoning_config = None` fallback when neither is set.
- `api/routes.py`: new `POST /api/session/reasoning` endpoint, validates effort against the canonical set, accepts `null`/empty to clear the override. `/api/session/duplicate` carries `reasoning_effort` forward so the copy behaves identically.

Frontend (JS/HTML/CSS)

- `static/index.html`: dropdown gains a `.reasoning-scope-group` (two buttons), a divider, and a `#reasoningClearOverride` row that is display-toggled by JS based on whether the current session has an override.
- `static/ui.js`: `_currentReasoningSessionOverride` + `_reasoningWriteScope` trackers, `_applyReasoningChip(eff, sessionOverride)` extended to set the `.session-override` class, `_syncReasoningScopeUI()` reflects active scope + clear-override row visibility, click handler routes to `/api/session/reasoning` or `/api/reasoning` based on scope.
- `static/commands.js`: `cmdReasoning` parses an optional `session` second token and routes the request through the session endpoint when present. Existing `/reasoning <effort>` and `/reasoning show|hide` calls are unchanged.
- `static/style.css`: `.reasoning-scope-option` (+ `.selected`), divider, clear-override row, and `.composer-reasoning-chip.session-override` (italic label + leading accent dot). Indicator is also applied to the mobile-config action so the override is visible from the mobile composer.

Tests: new `tests/test_2697_per_session_reasoning_effort.py` with 22 tests covering all three layers (storage round-trip, streaming precedence, route validation, duplicate-session carry-forward, slash-command parse, dropdown markup, CSS indicator, clear-override flow).

The existing `test_reasoning_chip_btw_fixes.py` "dropdowns grouped within 2 KB" check was bumped to 3 KB because the new scope picker + clear-override rows added ~380 bytes to the same dropdown container; the structural invariant the test really enforces (sibling of `composer-footer`, not nested inside `composer-left`'s overflow-hidden) is unchanged and still asserted by the sibling test above it.

Why It Matters
Two concrete pain points the issue captures:

`hermes` at the terminal still gets the same `agent.reasoning_effort` they always had.

The override is per-session, so duplicating a session keeps the override (maintainer's Q2) and clearing it via the dropdown's clear-override row returns the session to the profile default with no `config.yaml` write at all.

Verification
`pytest tests/test_2697_per_session_reasoning_effort.py` on Windows 11 against Python 3.11. `--noconftest` is used because the repo's session conftest tries to create a Windows symlink that requires `SeCreateSymbolicLinkPrivilege` (pre-existing Windows-only issue, also affects the existing `test_reasoning_show_hide.py` and similar; unrelated to this PR).

Adjacent suites re-run to confirm no regressions:

- `tests/test_reasoning_show_hide.py`: 28 passed
- `tests/test_reasoning_chip_btw_fixes.py`: 17 passed (1 threshold bumped 2 KB -> 3 KB; same invariant)
- `tests/test_reasoning_chip_js_behaviour.py`: 11 passed
- `tests/test_issue1103_reasoning_chip_visibility.py`: 4 passed, 3 pre-existing Windows-cp1252 (master too)
- `tests/test_465_session_branching.py`, `tests/test_session_duplicate.py`, `tests/test_issue1431_toolsets_chip_responsive.py`: identical pass/fail counts on master and this branch (Windows-only env failures, unrelated).
- `node --check` clean on `static/commands.js` and `static/ui.js`.
- `ruff check` on the four touched Python files: zero new findings (the 77 pre-existing issues in `api/streaming.py` are master-side).

What is verified vs. what is NOT:
Risks
- `reasoning_effort` in their JSON. `Session.__init__` accepts the kwarg with a `None` default and the load path passes `**data` through, so old sidecars load with `reasoning_effort=None` (= inherit profile default, i.e. exactly the v0.51.137- behaviour). Once a session has been saved under this PR the field appears in the JSON; rolling back would simply see an unknown kwarg on `Session.__init__` (which `**kwargs` absorbs today, but worth noting).
- `session` so a single-click on an effort writes the override (not the profile default). For a user who only ever wants the profile-write path this is one extra click. The slash command keeps the old default (`/reasoning high` -> profile) and the chip's behaviour is discoverable via the visible scope buttons, so the discoverability cost is low.
- `/api/session/toolsets` and `/api/session/draft` shapes.
- `<2000` to `<3000` bytes for the spread. The structural guarantee (dropdown is a sibling of `composer-footer`, not nested in `composer-left`) is still enforced by the test above it.

Model Used
claude-opus-4-7 (1M context) via Claude Code, directed by a human contributor. Final review + adversarial pass + commit message + PR body authored on the contributor's machine; all `gh search prs` / `gh search issues` pre-flight dup-checks performed before filing.

AI Usage Disclosure
This PR's code, tests, and PR body were drafted with substantial AI assistance (Claude Opus 4.7, via Claude Code). A human contributor read, ran, and reviewed the diff before commit and is responsible for the contents. Three pre-flight items per the contributor's checklist:
- `gh search prs --repo nesquena/hermes-webui "per-session reasoning effort" --state all`: no canonical fix in flight.
- `gh search issues --repo nesquena/hermes-webui "session reasoning_effort"`: Add per-session reasoning effort support in WebUI #2697 is the live issue this PR closes.