[Bug]: Azure GPT-4.1 content filter false positives silently zero out AppWorld tasks

Several AppWorld benchmark tasks trigger Azure OpenAI's responsible-AI content
filter mid-run (category `sexual`, severity `medium`), causing LiteLLM to raise `ContentPolicyViolationError`. The agent loop aborts and the task scores 0.0, which is indistinguishable in the report from a real agent failure.

The filter is **stochastic:** the same task, with the same agent and the same model, passes on some runs and fails on others.
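
Because the filter fires only on some runs, one possible mitigation is to retry a filtered task and, if it is still filtered, report it with its own status instead of 0.0. The following is a minimal sketch under stated assumptions, not the harness's actual code: `TaskOutcome`, `run_with_filter_retry`, and the `run_task` callable are hypothetical names, and the error is detected by matching strings that appear in the error text quoted in this issue rather than by importing any SDK exception class.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Substrings copied from the error text in this issue. Matching on text keeps
# the sketch free of provider SDK imports (an assumption, not a LiteLLM API).
FILTER_MARKERS = ("ContentPolicyViolationError", "ResponsibleAIPolicyViolation")


@dataclass
class TaskOutcome:
    """Hypothetical result record that separates filter hits from real scores."""
    task_id: str
    status: str             # "completed" or "content_filtered"
    score: Optional[float]  # None when the filter blocked every attempt
    attempts: int


def is_content_filter_error(exc: Exception) -> bool:
    """Return True if the exception looks like an Azure content-filter rejection."""
    text = f"{type(exc).__name__}: {exc}"
    return any(marker in text for marker in FILTER_MARKERS)


def run_with_filter_retry(
    task_id: str,
    run_task: Callable[[str], float],
    max_attempts: int = 3,
) -> TaskOutcome:
    """Run a task, retrying only when the content filter rejects the request."""
    for attempt in range(1, max_attempts + 1):
        try:
            score = run_task(task_id)
            return TaskOutcome(task_id, "completed", score, attempt)
        except Exception as exc:
            if not is_content_filter_error(exc):
                raise  # unrelated failures keep their normal handling
            # Filter hit: fall through and try again.
    # Every attempt was filtered: record that explicitly, with no score.
    return TaskOutcome(task_id, "content_filtered", None, max_attempts)
```

With a record like this, a report could exclude `content_filtered` tasks from the average or list them separately, so that a filter hit is not counted as an agent failure.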

Affected tasks observed so far (none has sexually explicit intent, so these are false positives):
- 2e9b91e_1  "Request Denise for Venmo money for Amazon cart items"
- 98d2608_1  "Email the driving license found in my file system to my partner"
- a3ba388_1  "Schedule resignation.pdf to be sent to my manager"
- b3bdcc1_1  "Buy an air purifier on Amazon using my Visa card"

**Example:**

> Error: Error code: 400 - {'error': {'message': "litellm.BadRequestError: litellm.ContentPolicyViolationError: The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766\nmodel=Azure/gpt-4.1. content_policy_fallback=None. fallbacks=None.\n\nSet 'content_policy_fallback' - https://docs.litellm.ai/docs/routing#fallbacks. Received Model Group=Azure/gpt-4.1\nAvailable Model Group Fallbacks=None", 'type': 'invalid_request_error', 'param': None, 'code': '400', 'provider_specific_fields': {'innererror': {'code': 'ResponsibleAIPolicyViolation', 'content_filter_result': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'low'}, 'sexual': {'filtered': True, 'severity': 'medium'}, 'violence': {'filtered': False, 'severity': 'safe'}}}, 'inner_error': {'code': 'ResponsibleAIPolicyViolation', 'content_filter_result': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'low'}, 'sexual': {'filtered': True, 'severity': 'medium'}, 'violence': {'filtered': False, 'severity': 'safe'}}}, 'azure_error': {'message': "The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766", 'type': None, 'param': 'prompt', 'code': 'content_filter', 'status': 400, 'innererror': {'code': 'ResponsibleAIPolicyViolation', 'content_filter_result': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'low'}, 'sexual': {'filtered': True, 'severity': 'medium'}, 'violence': {'filtered': False, 'severity': 'safe'}}}}}}}
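
The payload above repeats the same `content_filter_result` in several places. To log which category actually tripped the filter, a small helper can read it from the `provider_specific_fields` dictionary. This is a sketch, assuming that dictionary has already been extracted from the error as a plain Python `dict`; how to obtain it from the raised exception is not shown here, and `filtered_categories` is a hypothetical name.

```python
from typing import Any, Dict


def filtered_categories(provider_fields: Dict[str, Any]) -> Dict[str, str]:
    """Map each filtered category to its severity, e.g. {"sexual": "medium"}."""
    # The payload in this issue carries both spellings of the key.
    inner = provider_fields.get("innererror") or provider_fields.get("inner_error") or {}
    results = inner.get("content_filter_result", {})
    return {
        name: result.get("severity", "unknown")
        for name, result in results.items()
        if isinstance(result, dict) and result.get("filtered")
    }


# Values copied from the error shown in this issue.
fields = {
    "innererror": {
        "code": "ResponsibleAIPolicyViolation",
        "content_filter_result": {
            "hate": {"filtered": False, "severity": "safe"},
            "self_harm": {"filtered": False, "severity": "low"},
            "sexual": {"filtered": True, "severity": "medium"},
            "violence": {"filtered": False, "severity": "safe"},
        },
    }
}
print(filtered_categories(fields))  # {'sexual': 'medium'}
```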

