fix(analytics): skip temperature param for Claude 4.x models#47
Conversation
Claude 4.x models (opus/sonnet/haiku) deprecated the `temperature` parameter on Bedrock and reject requests that include it. liteLLM's `drop_params=True` did not catch this for `aws/claude-opus-4-7`, causing the trace comparison pipeline to fail with: BedrockException: `temperature` is deprecated for this model. Add a small regex-based check that skips `temperature` for any `claude-(opus|sonnet|haiku)-4-*` model, leaving behavior unchanged for GPT and pre-4.x Claude models. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughThe LLM client module now conditionally includes the ChangesTemperature Parameter Conditioning
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Possibly related issues
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
analytics/trace_comparison_rules/src/llm_client.py (1)
25-26: 💤 Low valueConsider adding a docstring to clarify the suppression logic.
The helper function is clear but would benefit from documentation explaining why Claude 4.x models don't support temperature.
📝 Suggested docstring
def _model_supports_temperature(model: str) -> bool: + """ + Check if the model supports the temperature parameter. + + Claude 4.x models (opus, sonnet, haiku) deprecated the temperature parameter. + + Args: + model: Model identifier (e.g., "aws/claude-opus-4-7", "Azure/gpt-4.1") + + Returns: + False if model is Claude 4.x (temperature not supported), True otherwise + """ return not _CLAUDE4_RE.search(model)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@analytics/trace_comparison_rules/src/llm_client.py` around lines 25 - 26, Add a docstring to the helper function _model_supports_temperature that explains the suppression logic: describe that it returns False for Claude 4.x models by checking against the _CLAUDE4_RE regex because Claude 4 models do not accept a temperature parameter, and return True otherwise; place this docstring directly above the _model_supports_temperature function so future readers know why _CLAUDE4_RE is used to disable temperature for those models.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@analytics/trace_comparison_rules/src/llm_client.py`:
- Around line 108-110: Add automated tests covering the temperature suppression
logic in analytics/trace_comparison_rules/src/llm_client.py by creating a pytest
file (e.g., analytics/trace_comparison_rules/src/test_llm_client.py) that
exercises the _model_supports_temperature(model) helper; include parametrized
cases for Claude 4.x variants (aws/claude-opus-4-7,
aws/claude-sonnet-4-20250131, claude-haiku-4-v1, anthropic.claude-opus-4-6)
which must return False, a set of non-4.x models that must return True
(Azure/gpt-4.1, gpt-4o-2024-08-06, claude-opus-3-5, claude-3-opus), and a
case-insensitivity check (AWS/CLAUDE-OPUS-4-7 -> False); run pytest to ensure
the _model_supports_temperature matching regex and conditional logic do not
regress.
---
Nitpick comments:
In `@analytics/trace_comparison_rules/src/llm_client.py`:
- Around line 25-26: Add a docstring to the helper function
_model_supports_temperature that explains the suppression logic: describe that
it returns False for Claude 4.x models by checking against the _CLAUDE4_RE regex
because Claude 4 models do not accept a temperature parameter, and return True
otherwise; place this docstring directly above the _model_supports_temperature
function so future readers know why _CLAUDE4_RE is used to disable temperature
for those models.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f52deaae-f9d0-4f88-9cf5-9122410b13fe
📒 Files selected for processing (1)
analytics/trace_comparison_rules/src/llm_client.py
| if _model_supports_temperature(self.model): | ||
| completion_args["temperature"] = self.temperature | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Consider adding tests for the temperature suppression logic.
This is a critical bug fix with no automated test coverage (per PR objectives). Without tests, the fix could regress if:
- The regex pattern is modified incorrectly
- The conditional logic is refactored
- LiteLLM behavior changes
🧪 Minimal test skeleton
Create analytics/trace_comparison_rules/src/test_llm_client.py:
import pytest
from llm_client import _model_supports_temperature
class TestTemperatureSupport:
"""Test temperature parameter suppression for Claude 4.x models."""
`@pytest.mark.parametrize`("model,expected", [
# Claude 4.x models - should NOT support temperature
("aws/claude-opus-4-7", False),
("aws/claude-sonnet-4-20250131", False),
("claude-haiku-4-v1", False),
("anthropic.claude-opus-4-6", False),
# Other models - should support temperature
("Azure/gpt-4.1", True),
("gpt-4o-2024-08-06", True),
("claude-opus-3-5", True), # Claude 3.x
("claude-3-opus", True),
# Case insensitive
("AWS/CLAUDE-OPUS-4-7", False),
])
def test_model_supports_temperature(self, model, expected):
assert _model_supports_temperature(model) == expectedThis test would have caught the original bug and prevents regression.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@analytics/trace_comparison_rules/src/llm_client.py` around lines 108 - 110,
Add automated tests covering the temperature suppression logic in
analytics/trace_comparison_rules/src/llm_client.py by creating a pytest file
(e.g., analytics/trace_comparison_rules/src/test_llm_client.py) that exercises
the _model_supports_temperature(model) helper; include parametrized cases for
Claude 4.x variants (aws/claude-opus-4-7, aws/claude-sonnet-4-20250131,
claude-haiku-4-v1, anthropic.claude-opus-4-6) which must return False, a set of
non-4.x models that must return True (Azure/gpt-4.1, gpt-4o-2024-08-06,
claude-opus-3-5, claude-3-opus), and a case-insensitivity check
(AWS/CLAUDE-OPUS-4-7 -> False); run pytest to ensure the
_model_supports_temperature matching regex and conditional logic do not regress.
haroldship
left a comment
There was a problem hiding this comment.
I had a look at _is_reasoning_model() in cuga-agent/src/cuga/backend/llm/models.py and there are several OpenAI models that also don't have temperature. Would it make sense to add those here? And what about adding the Claude models there?
I shall look at it later and create issue if needed |
Bug Fix: skip temperature param for Claude 4.x models
Related Issue
Closes #41
Description
The trace comparison analytics pipeline crashed when run against
aws/claude-opus-4-7with:Type of Changes
Root Cause
LLMClient.analyze()inanalytics/trace_comparison_rules/src/llm_client.pyalways added"temperature": self.temperatureto the liteLLMcompletion()call. Claude 4.x models (Opus/Sonnet/Haiku) deprecated thetemperatureparameter entirely on Bedrock and reject any request that includes it.litellm.drop_params = Truewas already set, but liteLLM's internal model registry hasn't flaggedtemperatureas unsupported forclaude-opus-4-7, so the parameter was forwarded unchanged and Bedrock rejected the request.Solution
Add a small regex-based guard
_model_supports_temperature()that matchesclaude-(opus|sonnet|haiku)-4[-_]*(case-insensitive) and only includetemperatureincompletion_argswhen the model supports it. Behavior is unchanged for GPT, Azure, and pre-4.x Claude models (the defaultAzure/gpt-4.1and the prioraws/claude-opus-4-6continue to receivetemperatureas before — see PR thread for discussion on whether to extend the guard further if 4-6 also rejects it in the future).Testing
./scripts/analyze.sh --analytics trace_compare --config all_trajectories.confwithaws/claude-opus-4-7no longer crashes)Checklist