Skip to content

fix(analytics): skip temperature param for Claude 4.x models#47

Merged
Sergey-Zeltyn merged 1 commit into
mainfrom
fix/trace-comparison-opus-temperature
Jun 7, 2026
Merged

fix(analytics): skip temperature param for Claude 4.x models#47
Sergey-Zeltyn merged 1 commit into
mainfrom
fix/trace-comparison-opus-temperature

Conversation

@Sergey-Zeltyn

@Sergey-Zeltyn Sergey-Zeltyn commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

Bug Fix: skip temperature param for Claude 4.x models

Related Issue

Closes #41

Description

The trace comparison analytics pipeline crashed when run against aws/claude-opus-4-7 with:

litellm.BadRequestError: BedrockException — `temperature` is deprecated for this model.

Type of Changes

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix that would cause existing functionality to not work as expected)

Root Cause

LLMClient.analyze() in analytics/trace_comparison_rules/src/llm_client.py always added "temperature": self.temperature to the liteLLM completion() call. Claude 4.x models (Opus/Sonnet/Haiku) deprecated the temperature parameter entirely on Bedrock and reject any request that includes it.

litellm.drop_params = True was already set, but liteLLM's internal model registry hasn't flagged temperature as unsupported for claude-opus-4-7, so the parameter was forwarded unchanged and Bedrock rejected the request.

Solution

Add a small regex-based guard _model_supports_temperature() that matches claude-(opus|sonnet|haiku)-4[-_]* (case-insensitive) and only include temperature in completion_args when the model supports it. Behavior is unchanged for GPT, Azure, and pre-4.x Claude models (the default Azure/gpt-4.1 and the prior aws/claude-opus-4-6 continue to receive temperature as before — see PR thread for discussion on whether to extend the guard further if 4-6 also rejects it in the future).

Testing

  • I have tested this fix locally (./scripts/analyze.sh --analytics trace_compare --config all_trajectories.conf with aws/claude-opus-4-7 no longer crashes)
  • I have added tests that prove my fix works
  • All new and existing tests passed (ruff lint/format clean)
  • I have verified the bug no longer occurs

Checklist

  • My code follows the code style of this project
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if needed

Claude 4.x models (opus/sonnet/haiku) deprecated the `temperature`
parameter on Bedrock and reject requests that include it. liteLLM's
`drop_params=True` did not catch this for `aws/claude-opus-4-7`, causing
the trace comparison pipeline to fail with:

  BedrockException: `temperature` is deprecated for this model.

Add a small regex-based check that skips `temperature` for any
`claude-(opus|sonnet|haiku)-4-*` model, leaving behavior unchanged for
GPT and pre-4.x Claude models.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The LLM client module now conditionally includes the temperature parameter in LiteLLM completion requests. A Claude-4 model detector using regex pattern matching identifies models that do not support the parameter and suppresses it for those model variants. The analyze() method's completion argument construction is updated to apply this conditional logic.

Changes

Temperature Parameter Conditioning

Layer / File(s) Summary
Claude-4 Temperature Suppression
analytics/trace_comparison_rules/src/llm_client.py
The re module is imported for pattern matching. A _CLAUDE4_RE regex and _model_supports_temperature() helper are introduced to detect Claude-4 model variants and determine whether to send the temperature parameter. The analyze() method's completion argument building is updated to conditionally include temperature only when the helper returns true.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related issues

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main fix: conditionally skipping the temperature parameter for Claude 4.x models, which is the core change in the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/trace-comparison-opus-temperature

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Sergey-Zeltyn Sergey-Zeltyn requested a review from haroldship June 4, 2026 15:47

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
analytics/trace_comparison_rules/src/llm_client.py (1)

25-26: 💤 Low value

Consider adding a docstring to clarify the suppression logic.

The helper function is clear but would benefit from documentation explaining why Claude 4.x models don't support temperature.

📝 Suggested docstring
 def _model_supports_temperature(model: str) -> bool:
+    """
+    Check if the model supports the temperature parameter.
+    
+    Claude 4.x models (opus, sonnet, haiku) deprecated the temperature parameter.
+    
+    Args:
+        model: Model identifier (e.g., "aws/claude-opus-4-7", "Azure/gpt-4.1")
+    
+    Returns:
+        False if model is Claude 4.x (temperature not supported), True otherwise
+    """
     return not _CLAUDE4_RE.search(model)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@analytics/trace_comparison_rules/src/llm_client.py` around lines 25 - 26, Add
a docstring to the helper function _model_supports_temperature that explains the
suppression logic: describe that it returns False for Claude 4.x models by
checking against the _CLAUDE4_RE regex because Claude 4 models do not accept a
temperature parameter, and return True otherwise; place this docstring directly
above the _model_supports_temperature function so future readers know why
_CLAUDE4_RE is used to disable temperature for those models.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@analytics/trace_comparison_rules/src/llm_client.py`:
- Around line 108-110: Add automated tests covering the temperature suppression
logic in analytics/trace_comparison_rules/src/llm_client.py by creating a pytest
file (e.g., analytics/trace_comparison_rules/src/test_llm_client.py) that
exercises the _model_supports_temperature(model) helper; include parametrized
cases for Claude 4.x variants (aws/claude-opus-4-7,
aws/claude-sonnet-4-20250131, claude-haiku-4-v1, anthropic.claude-opus-4-6)
which must return False, a set of non-4.x models that must return True
(Azure/gpt-4.1, gpt-4o-2024-08-06, claude-opus-3-5, claude-3-opus), and a
case-insensitivity check (AWS/CLAUDE-OPUS-4-7 -> False); run pytest to ensure
the _model_supports_temperature matching regex and conditional logic do not
regress.

---

Nitpick comments:
In `@analytics/trace_comparison_rules/src/llm_client.py`:
- Around line 25-26: Add a docstring to the helper function
_model_supports_temperature that explains the suppression logic: describe that
it returns False for Claude 4.x models by checking against the _CLAUDE4_RE regex
because Claude 4 models do not accept a temperature parameter, and return True
otherwise; place this docstring directly above the _model_supports_temperature
function so future readers know why _CLAUDE4_RE is used to disable temperature
for those models.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f52deaae-f9d0-4f88-9cf5-9122410b13fe

📥 Commits

Reviewing files that changed from the base of the PR and between 1f57f77 and 5891e70.

📒 Files selected for processing (1)
  • analytics/trace_comparison_rules/src/llm_client.py

Comment on lines +108 to +110
if _model_supports_temperature(self.model):
completion_args["temperature"] = self.temperature

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Consider adding tests for the temperature suppression logic.

This is a critical bug fix with no automated test coverage (per PR objectives). Without tests, the fix could regress if:

  1. The regex pattern is modified incorrectly
  2. The conditional logic is refactored
  3. LiteLLM behavior changes
🧪 Minimal test skeleton

Create analytics/trace_comparison_rules/src/test_llm_client.py:

import pytest
from llm_client import _model_supports_temperature

class TestTemperatureSupport:
    """Test temperature parameter suppression for Claude 4.x models."""
    
    `@pytest.mark.parametrize`("model,expected", [
        # Claude 4.x models - should NOT support temperature
        ("aws/claude-opus-4-7", False),
        ("aws/claude-sonnet-4-20250131", False),
        ("claude-haiku-4-v1", False),
        ("anthropic.claude-opus-4-6", False),
        
        # Other models - should support temperature  
        ("Azure/gpt-4.1", True),
        ("gpt-4o-2024-08-06", True),
        ("claude-opus-3-5", True),  # Claude 3.x
        ("claude-3-opus", True),
        
        # Case insensitive
        ("AWS/CLAUDE-OPUS-4-7", False),
    ])
    def test_model_supports_temperature(self, model, expected):
        assert _model_supports_temperature(model) == expected

This test would have caught the original bug and prevents regression.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@analytics/trace_comparison_rules/src/llm_client.py` around lines 108 - 110,
Add automated tests covering the temperature suppression logic in
analytics/trace_comparison_rules/src/llm_client.py by creating a pytest file
(e.g., analytics/trace_comparison_rules/src/test_llm_client.py) that exercises
the _model_supports_temperature(model) helper; include parametrized cases for
Claude 4.x variants (aws/claude-opus-4-7, aws/claude-sonnet-4-20250131,
claude-haiku-4-v1, anthropic.claude-opus-4-6) which must return False, a set of
non-4.x models that must return True (Azure/gpt-4.1, gpt-4o-2024-08-06,
claude-opus-3-5, claude-3-opus), and a case-insensitivity check
(AWS/CLAUDE-OPUS-4-7 -> False); run pytest to ensure the
_model_supports_temperature matching regex and conditional logic do not regress.

@haroldship haroldship left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look at _is_reasoning_model() in cuga-agent/src/cuga/backend/llm/models.py and there are several OpenAI models that also don't have temperature. Would it make sense to add those here? And what about adding the Claude models there?

@Sergey-Zeltyn

Copy link
Copy Markdown
Collaborator Author

I had a look at _is_reasoning_model() in cuga-agent/src/cuga/backend/llm/models.py and there are several OpenAI models that also don't have temperature. Would it make sense to add those here? And what about adding the Claude models there?

I shall look at it later and create issue if needed

@Sergey-Zeltyn Sergey-Zeltyn merged commit f14026e into main Jun 7, 2026
3 of 4 checks passed
@Sergey-Zeltyn Sergey-Zeltyn deleted the fix/trace-comparison-opus-temperature branch June 7, 2026 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Trace comparison analytics summary fails for Opus-4.7

2 participants