[codex] Refresh condenser LLM on model switch#3499
Conversation
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
Coverage Report •
|
||||||||||||||||||||||||||||||
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
PR Review: Refresh condenser LLM on model switch
This PR fixes a subtle but important bug where switching the agent's LLM mid-session would leave the context condenser pointing at the old (potentially de-provisioned) LLM credentials, causing condensation requests to fail silently while normal turns continued to work.
Overall Assessment
Risk Level: Low — The fix is well-scoped, well-tested, and follows existing patterns in the codebase. The approach is correct and the implementation is clean.
Changes Reviewed
1. openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
Import (line ~12):
The new import statement:
from openhands.sdk.context.condenser import CondenserBase, LLMSummarizingCondensershould be placed in alphabetical order with other imports from openhands.sdk. Looking at the surrounding imports, context comes after conversation alphabetically, so this placement is correct.
New method _condenser_for_switched_llm (~lines 753–763):
The method design is solid. A few observations:
- Null-safety: Correctly handles the case where
condenserisNoneby returning it directly, and also handles non-LLMSummarizingCondensercondensers by leaving them unchanged. usage_idpreservation: Correctly carries overcondenser.llm.usage_idto the new condenser LLM copy — this is important for metrics attribution and profile store lookups.reset_metrics()call: Correctly resets metrics on the new condenser LLM so that it doesn't inherit any prior token counts from the old LLM.- Pattern consistency: This follows the same
.model_copy(update={...})pattern used throughout the file for immutable state updates.
switch_llm modification (~lines 790–794):
The update now passes both llm and condenser in a single .model_copy() call, which is the correct pattern. The condenser is refreshed before being passed to model_copy, so the new condenser LLM is correctly initialized with the new credentials.
One minor note: self.state.agent is assigned after the model_copy call (self.state.agent = self.agent), so the condenser update propagates correctly there as well.
2. tests/sdk/conversation/test_switch_model.py
The test is comprehensive and well-documented. It covers:
- The primary bug: post-switch condenser LLM model/name changes
- Credential propagation (api_key)
usage_idpreservation for the condenser LLM- Metrics independence (new vs. agent LLM metrics are separate objects)
- State consistency (
conv.state.agent.condenser)
The test is deterministic (no network calls, no mocks beyond the fixture) and the assertions are precise.
Summary
| Area | Assessment |
|---|---|
| Correctness | Fix correctly propagates new LLM credentials to LLMSummarizingCondenser |
| Null-safety | Handles None and non-LLM-summarizing condensers gracefully |
| Metric isolation | reset_metrics() called on new condenser LLM; usage_id preserved correctly |
| State consistency | Both conv.agent and conv.state.agent receive the updated condenser |
| Test coverage | New test covers the bug scenario and all key assertions |
| Code style | Follows existing .model_copy() patterns; imports correctly placed |
No blocking issues. This is a clean, well-reasoned fix for a real bug. Consider merging once CI passes.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
❌ QA Report: FAIL
The condenser state is refreshed after switch_llm(), but an actual post-switch condensation still fails before making a successful LLM call.
Does this PR achieve its stated goal?
No, not end-to-end. The PR does refresh LLMSummarizingCondenser.llm to the switched model/key and preserves the condenser usage slot, which fixes the stale object state. However, when I exercised the actual condenser call after switching from a no-key LLM to a credentialed LLM, origin/main failed with the expected stale-credential LLMAuthenticationError, while this PR failed with an empty AssertionError from the LLM completion path instead of completing condensation.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully and created the uv environment. |
| CI Status | 🟡 At check time: 35 successful checks, 18 skipped checks, and this qa-changes check in progress. |
| Functional Verification | ❌ State refresh passes, but actual condensation after the switch still fails on the PR branch. |
Functional Verification
Verification 1: condenser object state after model switch
Step 1 — Reproduce baseline without the fix:
Ran git switch --detach origin/main; OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_condenser_switch.py:
active_model=litellm_proxy/new-model
active_usage_id=agent-new
active_key=new-test-key
condenser_model=litellm_proxy/old-model
condenser_usage_id=condenser-slot
condenser_key=None
state_condenser_model=litellm_proxy/old-model
metrics_shared_with_agent=False
check_active_model=PASS
check_condenser_model_refreshed=FAIL
check_condenser_key_refreshed=FAIL
check_condenser_usage_preserved=PASS
check_state_agent_updated=PASS
check_metrics_independent=PASS
RESULT: FAIL - condenser still has stale model/credentials or shared metrics after switch
baseline_exit=2
This confirms the original bug: the active agent switches to the new model/key, but the condenser remains on the old no-key LLM.
Step 2 — Apply the PR's changes:
Checked out codex/refresh-condenser-llm-switch at e4604c4fc4ba6f9796427380e40e941baecea25a.
Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_condenser_switch.py:
active_model=litellm_proxy/new-model
active_usage_id=agent-new
active_key=new-test-key
condenser_model=litellm_proxy/new-model
condenser_usage_id=condenser-slot
condenser_key=new-test-key
state_condenser_model=litellm_proxy/new-model
metrics_shared_with_agent=False
check_active_model=PASS
check_condenser_model_refreshed=PASS
check_condenser_key_refreshed=PASS
check_condenser_usage_preserved=PASS
check_state_agent_updated=PASS
check_metrics_independent=PASS
RESULT: PASS - condenser follows switched LLM while preserving its usage slot and metrics independence
pr_exit=0
This shows the PR updates the condenser's model/key and preserves independent usage/metrics state.
Verification 2: actual condenser call after switching to a credentialed LLM
Step 1 — Reproduce baseline without the fix:
Ran git switch --detach origin/main; OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_condenser_real_call.py using the configured LLM_MODEL, LLM_BASE_URL, and LLM_API_KEY:
condensation_error_type=LLMAuthenticationError
condensation_error=litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - The api_key client option must be set either by passing api_key to the client or by setting the LITELLM_PROXY_API_KEY environment variable
baseline_real_exit=2
This reproduces the user-visible failure mode: after the switch, condensation still uses the stale no-key condenser LLM.
Step 2 — Apply the PR's changes:
Checked out codex/refresh-condenser-llm-switch again and re-ran the same condensation path after switch_llm() and _ensure_agent_ready().
Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_condenser_real_call.py:
condensation_error_type=AssertionError
condensation_error=<empty>
pr_real_exit=2
This shows the stale-credential error is gone, but the actual condensation still does not complete on the PR branch.
Issues Found
- 🟠 Issue: Actual post-switch condensation still fails on the PR branch with an empty
AssertionError, so the fix does not yet deliver successful condensation after switching from a no-key LLM to a credentialed LLM.
This QA review was created by an AI agent (OpenHands) on behalf of the requester.
Final verdict: FAIL
|
@OpenHands /iterate |
|
I'm on it! neubig can track my progress at all-hands.dev |
|
@OpenHands /iterate |
|
I'm on it! neubig can track my progress at all-hands.dev |
|
@OpenHands please continue iterating: address or resolve the remaining review threads so the unresolved-review-threads check passes. |
|
I'm on it! neubig can track my progress at all-hands.dev |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. |
all-hands-bot
left a comment
There was a problem hiding this comment.
PR Review: Refresh condenser LLM on model switch
🟡 Acceptable — The fix is correct and well-tested. Minor improvement suggested.
Summary
This PR fixes a subtle but important bug where switching the agent's LLM mid-session via switch_llm() would leave the context condenser pointing at the old LLM. Since the condenser owns a separate copy of the agent LLM, condensation requests after a model switch would still call the old (potentially credential-less) model.
The fix creates a fresh condenser LLM copy from the newly switched agent LLM, ensuring both the agent and its condenser use consistent credentials.
Analysis
[IMPROVEMENT OPPORTUNITIES]
- [openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py, Line 752] Simplification: The
isinstancecheck on every switch could be avoided by having the condenser beNonewhen no LLM summarization is configured. However, this is a larger refactor and the current approach is pragmatic.
Tests
The test coverage is comprehensive:
test_switch_llm_refreshes_llm_condenser_credentials: Verifies the condenser LLM gets the new model's credentials and can make actual completion callstest_switch_llm_condenser_can_generate_condensation: End-to-end test that verifies the condensation actually works after a model switchtest_switch_llm_then_send_messageandtest_switch_between_two_llms: Regression coverage for existing behavior
[TESTING GAPS]
None — the tests cover both the state transition and the actual functionality.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW - The change is localized to the
switch_llmmethod andreset_metrics - The
LLMSummarizingCondensercase is a clear, isolated fix - Tests exercise both the state changes and real LLM calls
[VERDICT]
✅ Worth merging: Core logic is sound, tests are comprehensive.
KEY INSIGHT:
The reset_metrics change from lazy (_metrics = None) to eager (create fresh Metrics/Telemetry immediately) is the correct fix — it ensures copied LLMs are immediately usable without requiring callers to access .metrics first.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The condenser LLM refresh path was verified with a real SDK conversation flow before and after the PR, and the PR fixes the stale-credential condensation failure.
Does this PR achieve its stated goal?
Yes. On origin/main, switching the active agent LLM left the LLMSummarizingCondenser on openai/old-model with no API key/base URL, so condensation failed with LLMAuthenticationError and never reached the configured local LLM endpoint. On the PR branch, the same SDK flow updated the condenser to openai/new-model, preserved usage_id=condenser, kept metrics independent, and successfully generated a condensation through the local OpenAI-compatible endpoint using Authorization: Bearer qa-new-key.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed and installed the uv-managed workspace packages. |
| CI Status | |
| Functional Verification | ✅ LocalConversation.switch_llm() + LLMSummarizingCondenser.get_condensation() worked after the PR and failed in the expected stale-condenser way on base. |
Functional Verification
Test 1: Condenser LLM follows a mid-conversation model switch
Step 1 — Reproduce / establish baseline without the fix:
Checked out origin/main at 16ad9e13 and ran uv run python /tmp/qa_switch_condenser.py. The script creates a LocalConversation, switches from an initial no-key LLM to a credentialed LLM pointing at a local OpenAI-compatible HTTP endpoint, then asks the LLMSummarizingCondenser to condense real MessageEvent history.
Relevant output:
agent_model=openai/new-model
agent_base_url=http://127.0.0.1:41717
condenser_model=openai/old-model
condenser_usage_id=condenser
condenser_base_url=None
condenser_key=None
metrics_shared=False
condensation_status=ERROR:LLMAuthenticationError:litellm.AuthenticationError: AuthenticationError: OpenAIException - The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
server_requests=0
This confirms the bug: the agent switched to the new LLM, but the condenser stayed on the old no-credential LLM and failed before it could call the available endpoint.
Step 2 — Apply the PR's changes:
Checked out codex/refresh-condenser-llm-switch at 845dfd5e50c7376fc4fced33291598a0633b4256.
Step 3 — Re-run with the fix in place:
Ran the same uv run python /tmp/qa_switch_condenser.py command.
Relevant output:
agent_model=openai/new-model
agent_base_url=http://127.0.0.1:58843
condenser_model=openai/new-model
condenser_usage_id=condenser
condenser_base_url=http://127.0.0.1:58843
condenser_key=qa-new-key
metrics_shared=False
condensation_status=OK
condensation_summary=QA condensed summary from local endpoint
forgotten_events=10
server_requests=1
request={"authorization": "Bearer qa-new-key", "model": "new-model", "path": "/chat/completions"}
This shows the fix works: the condenser now uses the switched model and credentials, preserves the condenser usage slot, does not share metrics with the agent LLM, and successfully performs condensation through the configured endpoint.
Issues Found
None.
This review was generated by an AI agent (OpenHands) on behalf of the user.
enyst
left a comment
There was a problem hiding this comment.
This is a bit tricky issue. From the PR description, I understand that the condenser LLM is not itself, it would “follow” the main LLM, and I’m not sure that’s the way it should behave.
Condenser is a separate profile, as there can be others, e.g. an auxiliary LLM, an ‘oracle’ LLM. If I switch the main LLM, I switch the main LLM, not the condenser, oracle, auxiliary, or other profiles set in action.
I can see the opposite case too! I think it would be more clear if condenser started with a different profile or in the UI we offer the ability to set a different profile for it?
|
Addressed the model-switch semantics concern in 622d415. The condenser now follows switch_llm only when it is the default condenser derived from the current agent LLM. If the condenser is configured as an independent profile, switch_llm preserves that condenser LLM unchanged. Added regression coverage for both paths. |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: PR #3499 - [codex] Refresh condenser LLM on model switch
Taste Rating: 🟡 Acceptable — Works but could be cleaner
[CRITICAL ISSUES]
None identified. The core logic is sound.
[IMPROVEMENT OPPORTUNITIES]
-
[local_conversation.py, line 756-758] Configuration comparison could be clearer:
The logic for detecting when to refresh the condenser LLM depends on comparingcondenser_config(LLM fields from condenser) againstcurrent_config(full LLM config from agent). This works, but the intent is subtle. Consider extracting this into a named method like_llm_configs_match(llm1, llm2)to self-document the purpose. -
[llm.py, line 703] Redundant docstring update:
The docstring now says "creating fresh metrics and telemetry immediately" but the method name is stillreset_metrics. If the behavior changed from lazy to eager initialization, consider whetherreset_metricsstill accurately describes the operation, or if a different name would be more appropriate.
[TESTING GAPS]
None. The test coverage is comprehensive:
- Tests the primary use case (condenser refresh on LLM switch)
- Tests that the condenser can actually generate a condensation after switch
- Tests that independent condenser configurations are preserved
- Tests that the new
reset_metricsbehavior works correctly
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW
The change is well-contained: it only affects the switch_llm code path and adds appropriate tests. The logic handles both the default case (condenser uses same LLM as agent) and the independent case (condenser configured separately) correctly. No breaking changes to existing behavior.
VERDICT
✅ Worth merging: Core logic is sound, addresses a real bug (condenser not being refreshed on LLM switch), and has good test coverage. Minor naming/docstring improvements suggested but not blocking.
KEY INSIGHT
The fix elegantly handles the case where the condenser shares credentials with the agent LLM by creating a fresh copy of the condenser with the new LLM, while preserving independent condenser configurations unchanged — avoiding the scenario where a mid-session model switch causes condensation requests to fail with stale credentials.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the model-switch scenario end-to-end: the base branch leaves the condenser on stale no-key credentials and fails during condensation, while this PR refreshes the condenser LLM and real condensation succeeds.
Does this PR achieve its stated goal?
Yes. The PR set out to refresh the LLMSummarizingCondenser LLM copy when LocalConversation.switch_llm() changes the active agent LLM, preserving the condenser usage slot and independent metrics. I reproduced the stale-condenser failure on origin/main, then reran the same SDK workflow on commit 622d415d9b3807f7b6282cc836e2aca413877a1e; the condenser moved to the credentialed model, kept usage_id=condenser, used independent metrics, and successfully generated a condensation summary.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv run python synced the project environment and imported the SDK successfully. |
| CI Status | |
| Functional Verification | ✅ Real SDK conversation switch + condenser condensation works on the PR branch. |
Functional Verification
Test 1: Switching from no-key initial LLM to credentialed LLM refreshes the condenser
Step 1 — Reproduce baseline without the fix:
Checked out origin/main (16ad9e13) and ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_switch_condenser_functional.pyObserved:
agent_model_after_switch litellm_proxy/openai/gpt-5.5
condenser_model_after_switch litellm_proxy/old-no-key-model
condenser_has_api_key False
condenser_usage_id condenser
condenser_metrics_independent True
condensation_status ERROR
error_type LLMAuthenticationError
error_excerpt litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - The api_key client option must be set either by passing api_key to the client or by setting the LITELLM_PROXY_API_KEY environment variable
This confirms the reported bug: after switching the conversation to a credentialed LLM, the condenser still used the old no-key LLM and failed when condensation ran.
Step 2 — Apply the PR's changes:
Checked out codex/refresh-condenser-llm-switch at 622d415d9b3807f7b6282cc836e2aca413877a1e.
Step 3 — Re-run with the fix in place:
Ran the same command:
OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_switch_condenser_functional.pyObserved:
agent_model_after_switch litellm_proxy/openai/gpt-5.5
condenser_model_after_switch litellm_proxy/openai/gpt-5.5
condenser_has_api_key True
condenser_usage_id condenser
condenser_metrics_independent True
condensation_status OK
summary_excerpt USER_CONTEXT: User provided 10 items to remember. COMPLETED: Recorded items 1 through 10. PENDING: None. CURRENT_STATE: Remembered items: 1. item 1 2. item 2
forgotten_event_count 10
This shows the fix works: the condenser was refreshed to the switched model with credentials, preserved its condenser usage slot, did not share the agent metrics object, and completed a real condensation call.
Test 2: Independently configured condenser profile is not overwritten
On the PR branch, ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_switch_condenser_independent.pyObserved:
agent_model_after_switch litellm_proxy/openai/gpt-5.5
condenser_model_after_switch litellm_proxy/condenser-profile
condenser_key_preserved True
condenser_usage_id condenser
This confirms a separately configured condenser LLM remains independent when the agent LLM is switched.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
|
Requested re-review after 622d415. The condenser now follows switch_llm only for the default condenser derived from the current agent LLM; independently configured condenser profiles stay unchanged. |
Summary
LLMSummarizingCondenserLLM copy whenLocalConversation.switch_llm()changes the active agent LLM.Root Cause
The default condenser owns a separate LLM copy created from the initial agent LLM. A mid-conversation model/profile switch updated
agent.llm, but leftagent.condenser.llmpointing at the old copy. Normal turns could continue on the new model, then later condensation would call the stale no-credential LLM and fail with a missing-credentials error.Validation
uv run pytest tests/sdk/conversation/test_switch_model.pyuv run pytest tests/agent_server/test_conversation_router.py -k switch_conversation_llmuv run ruff check openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py tests/sdk/conversation/test_switch_model.pyAgent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:622d415-pythonRun
All tags pushed for this build
About Multi-Architecture Support
622d415-python) is a multi-arch manifest supporting both amd64 and arm64622d415-python-amd64) are also available if neededCloses #3508