fix(analytics): skip temperature param for Claude 4.x models by Sergey-Zeltyn · Pull Request #47 · cuga-project/cuga-eval

Sergey-Zeltyn · 2026-06-04T15:45:28Z

Bug Fix: skip temperature param for Claude 4.x models

Related Issue

Closes #41

Description

The trace comparison analytics pipeline crashed when run against aws/claude-opus-4-7 with:

litellm.BadRequestError: BedrockException — `temperature` is deprecated for this model.

Type of Changes

Bug fix (non-breaking change which fixes an issue)
Breaking change (fix that would cause existing functionality to not work as expected)

Root Cause

LLMClient.analyze() in analytics/trace_comparison_rules/src/llm_client.py always added "temperature": self.temperature to the liteLLM completion() call. Claude 4.x models (Opus/Sonnet/Haiku) deprecated the temperature parameter entirely on Bedrock and reject any request that includes it.

litellm.drop_params = True was already set, but liteLLM's internal model registry hasn't flagged temperature as unsupported for claude-opus-4-7, so the parameter was forwarded unchanged and Bedrock rejected the request.

Solution

Add a small regex-based guard _model_supports_temperature() that matches claude-(opus|sonnet|haiku)-4[-_]* (case-insensitive) and only include temperature in completion_args when the model supports it. Behavior is unchanged for GPT, Azure, and pre-4.x Claude models (the default Azure/gpt-4.1 and the prior aws/claude-opus-4-6 continue to receive temperature as before — see PR thread for discussion on whether to extend the guard further if 4-6 also rejects it in the future).

Testing

I have tested this fix locally (./scripts/analyze.sh --analytics trace_compare --config all_trajectories.conf with aws/claude-opus-4-7 no longer crashes)
I have added tests that prove my fix works
All new and existing tests passed (ruff lint/format clean)
I have verified the bug no longer occurs

Checklist

My code follows the code style of this project
I have performed a self-review of my own code
I have made corresponding changes to the documentation if needed

Claude 4.x models (opus/sonnet/haiku) deprecated the `temperature` parameter on Bedrock and reject requests that include it. liteLLM's `drop_params=True` did not catch this for `aws/claude-opus-4-7`, causing the trace comparison pipeline to fail with: BedrockException: `temperature` is deprecated for this model. Add a small regex-based check that skips `temperature` for any `claude-(opus|sonnet|haiku)-4-*` model, leaving behavior unchanged for GPT and pre-4.x Claude models. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-06-04T15:45:39Z

📝 Walkthrough

Walkthrough

The LLM client module now conditionally includes the temperature parameter in LiteLLM completion requests. A Claude-4 model detector using regex pattern matching identifies models that do not support the parameter and suppresses it for those model variants. The analyze() method's completion argument construction is updated to apply this conditional logic.

Changes

Temperature Parameter Conditioning

Layer / File(s)	Summary
Claude-4 Temperature Suppression `analytics/trace_comparison_rules/src/llm_client.py`	The `re` module is imported for pattern matching. A `_CLAUDE4_RE` regex and `_model_supports_temperature()` helper are introduced to detect Claude-4 model variants and determine whether to send the temperature parameter. The `analyze()` method's completion argument building is updated to conditionally include temperature only when the helper returns true.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related issues

[Bug]: Trace comparison analytics summary fails for Opus-4.7 #41: This PR directly addresses the "temperature is deprecated" BadRequestError from the Anthropic API for Claude Opus-4.7 by detecting Claude-4 models and conditionally omitting the temperature parameter from LiteLLM requests.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main fix: conditionally skipping the temperature parameter for Claude 4.x models, which is the core change in the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/trace-comparison-opus-temperature

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

analytics/trace_comparison_rules/src/llm_client.py (1)

25-26: 💤 Low value

Consider adding a docstring to clarify the suppression logic.

The helper function is clear but would benefit from documentation explaining why Claude 4.x models don't support temperature.

📝 Suggested docstring

 def _model_supports_temperature(model: str) -> bool:
+    """
+    Check if the model supports the temperature parameter.
+    
+    Claude 4.x models (opus, sonnet, haiku) deprecated the temperature parameter.
+    
+    Args:
+        model: Model identifier (e.g., "aws/claude-opus-4-7", "Azure/gpt-4.1")
+    
+    Returns:
+        False if model is Claude 4.x (temperature not supported), True otherwise
+    """
     return not _CLAUDE4_RE.search(model)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@analytics/trace_comparison_rules/src/llm_client.py` around lines 25 - 26, Add
a docstring to the helper function _model_supports_temperature that explains the
suppression logic: describe that it returns False for Claude 4.x models by
checking against the _CLAUDE4_RE regex because Claude 4 models do not accept a
temperature parameter, and return True otherwise; place this docstring directly
above the _model_supports_temperature function so future readers know why
_CLAUDE4_RE is used to disable temperature for those models.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@analytics/trace_comparison_rules/src/llm_client.py`:
- Around line 108-110: Add automated tests covering the temperature suppression
logic in analytics/trace_comparison_rules/src/llm_client.py by creating a pytest
file (e.g., analytics/trace_comparison_rules/src/test_llm_client.py) that
exercises the _model_supports_temperature(model) helper; include parametrized
cases for Claude 4.x variants (aws/claude-opus-4-7,
aws/claude-sonnet-4-20250131, claude-haiku-4-v1, anthropic.claude-opus-4-6)
which must return False, a set of non-4.x models that must return True
(Azure/gpt-4.1, gpt-4o-2024-08-06, claude-opus-3-5, claude-3-opus), and a
case-insensitivity check (AWS/CLAUDE-OPUS-4-7 -> False); run pytest to ensure
the _model_supports_temperature matching regex and conditional logic do not
regress.

---

Nitpick comments:
In `@analytics/trace_comparison_rules/src/llm_client.py`:
- Around line 25-26: Add a docstring to the helper function
_model_supports_temperature that explains the suppression logic: describe that
it returns False for Claude 4.x models by checking against the _CLAUDE4_RE regex
because Claude 4 models do not accept a temperature parameter, and return True
otherwise; place this docstring directly above the _model_supports_temperature
function so future readers know why _CLAUDE4_RE is used to disable temperature
for those models.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f52deaae-f9d0-4f88-9cf5-9122410b13fe

📥 Commits

Reviewing files that changed from the base of the PR and between 1f57f77 and 5891e70.

📒 Files selected for processing (1)

analytics/trace_comparison_rules/src/llm_client.py

coderabbitai · 2026-06-04T15:49:52Z

+            if _model_supports_temperature(self.model):
+                completion_args["temperature"] = self.temperature
+


🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Consider adding tests for the temperature suppression logic.

This is a critical bug fix with no automated test coverage (per PR objectives). Without tests, the fix could regress if:

The regex pattern is modified incorrectly

The conditional logic is refactored

LiteLLM behavior changes

🧪 Minimal test skeleton

Create analytics/trace_comparison_rules/src/test_llm_client.py:

import pytest from llm_client import _model_supports_temperature class TestTemperatureSupport: """Test temperature parameter suppression for Claude 4.x models.""" `@pytest.mark.parametrize`("model,expected", [ # Claude 4.x models - should NOT support temperature ("aws/claude-opus-4-7", False), ("aws/claude-sonnet-4-20250131", False), ("claude-haiku-4-v1", False), ("anthropic.claude-opus-4-6", False), # Other models - should support temperature ("Azure/gpt-4.1", True), ("gpt-4o-2024-08-06", True), ("claude-opus-3-5", True), # Claude 3.x ("claude-3-opus", True), # Case insensitive ("AWS/CLAUDE-OPUS-4-7", False), ]) def test_model_supports_temperature(self, model, expected): assert _model_supports_temperature(model) == expected

This test would have caught the original bug and prevents regression.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@analytics/trace_comparison_rules/src/llm_client.py` around lines 108 - 110, Add automated tests covering the temperature suppression logic in analytics/trace_comparison_rules/src/llm_client.py by creating a pytest file (e.g., analytics/trace_comparison_rules/src/test_llm_client.py) that exercises the _model_supports_temperature(model) helper; include parametrized cases for Claude 4.x variants (aws/claude-opus-4-7, aws/claude-sonnet-4-20250131, claude-haiku-4-v1, anthropic.claude-opus-4-6) which must return False, a set of non-4.x models that must return True (Azure/gpt-4.1, gpt-4o-2024-08-06, claude-opus-3-5, claude-3-opus), and a case-insensitivity check (AWS/CLAUDE-OPUS-4-7 -> False); run pytest to ensure the _model_supports_temperature matching regex and conditional logic do not regress.

haroldship

I had a look at _is_reasoning_model() in cuga-agent/src/cuga/backend/llm/models.py and there are several OpenAI models that also don't have temperature. Would it make sense to add those here? And what about adding the Claude models there?

Sergey-Zeltyn · 2026-06-07T10:02:34Z

I had a look at _is_reasoning_model() in cuga-agent/src/cuga/backend/llm/models.py and there are several OpenAI models that also don't have temperature. Would it make sense to add those here? And what about adding the Claude models there?

I shall look at it later and create issue if needed

Sergey-Zeltyn requested a review from haroldship June 4, 2026 15:47

coderabbitai Bot reviewed Jun 4, 2026

View reviewed changes

haroldship approved these changes Jun 7, 2026

View reviewed changes

Sergey-Zeltyn merged commit f14026e into main Jun 7, 2026
3 of 4 checks passed

Sergey-Zeltyn deleted the fix/trace-comparison-opus-temperature branch June 7, 2026 10:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(analytics): skip temperature param for Claude 4.x models#47

fix(analytics): skip temperature param for Claude 4.x models#47
Sergey-Zeltyn merged 1 commit into
mainfrom
fix/trace-comparison-opus-temperature

Sergey-Zeltyn commented Jun 4, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related issues

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Uh oh!

haroldship left a comment

Uh oh!

Sergey-Zeltyn commented Jun 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		if _model_supports_temperature(self.model):
		completion_args["temperature"] = self.temperature

Conversation

Sergey-Zeltyn commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Bug Fix: skip temperature param for Claude 4.x models

Related Issue

Description

Type of Changes

Root Cause

Solution

Testing

Checklist

Uh oh!

coderabbitai Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

haroldship left a comment

Choose a reason for hiding this comment

Uh oh!

Sergey-Zeltyn commented Jun 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Sergey-Zeltyn commented Jun 4, 2026 •

edited

Loading

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading