fix: strip image blocks for text-only providers to prevent DeepSeek crash (#1382) #1417

Open
adityachaudhary99 wants to merge 2 commits into Gitlawb:main from adityachaudhary99:fix/deepseek-image-fallback-1382

Conversation

@adityachaudhary99

Fixes #1382

Requests to DeepSeek and other text-only providers (Mistral, Llama, etc.) fail when multi-modal conversation history contains image_url content blocks.

Changes

  • Added modelSupportsVision() helper that checks model name prefixes against known vision-capable models
  • Added a stripImages option to the OpenAI shim's convertMessages(); when enabled, image_url blocks are replaced with [Image] placeholder text
  • Applied stripImages to user messages, tool result content, and assistant content paths
  • Defaults to no stripping for providers that support vision, so existing behavior is unchanged
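
The behavior described in the list above can be sketched roughly as follows. This is a minimal illustration, not the shim's actual code: the Part, Message, and ConvertOptions types and the convertParts / convertMessagesSketch names are simplified assumptions made up for this example. Only the stripImages option name and the [Image] placeholder come from the PR description.

```typescript
// Hypothetical, simplified shapes; the real shim's types are not shown in this PR.
type Part =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface Message {
  role: "user" | "assistant" | "tool";
  content: string | Part[];
}

interface ConvertOptions {
  // Off by default, so vision-capable models keep their existing behavior.
  stripImages?: boolean;
}

// Replace each image_url part with an [Image] text part when stripping is on.
function convertParts(content: string | Part[], stripImages: boolean): string | Part[] {
  if (typeof content === "string" || !stripImages) return content;
  return content.map((part): Part =>
    part.type === "image_url" ? { type: "text", text: "[Image]" } : part,
  );
}

// Apply the same flag to user, tool, and assistant messages alike.
function convertMessagesSketch(messages: Message[], options: ConvertOptions = {}): Message[] {
  const stripImages = options.stripImages ?? false;
  return messages.map(m => ({ ...m, content: convertParts(m.content, stripImages) }));
}

// Count image_url parts that would still be sent to the provider.
function countImages(messages: Message[]): number {
  return messages.reduce(
    (n, m) =>
      n + (Array.isArray(m.content) ? m.content.filter(p => p.type === "image_url").length : 0),
    0,
  );
}

const conversation: Message[] = [
  { role: "user", content: [{ type: "image_url", image_url: { url: "https://example.com/a.png" } }] },
  {
    role: "tool",
    content: [
      { type: "text", text: "ok" },
      { type: "image_url", image_url: { url: "https://example.com/b.png" } },
    ],
  },
  { role: "assistant", content: "plain text reply" },
];

const forTextOnly = convertMessagesSketch(conversation, { stripImages: true });
const forVision = convertMessagesSketch(conversation);
```

In this sketch the input messages are never mutated, so the stored history would keep its images and only the outgoing payload would change. Whether the real shim behaves the same way is not stated in the PR.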

Add vision-support detection for the OpenAI-compatible API shim. When
the resolved model does not support vision (e.g. DeepSeek, Mistral,
Llama), strip image_url content blocks and replace them with an [Image]
placeholder before sending to the API. This prevents a 400 error that
corrupts the session history, since the failed message gets re-sent on
every retry.

Implements option 2 from the issue: detect text-only provider and strip
the image with a warning placeholder.
Copilot AI review requested due to automatic review settings May 28, 2026 17:39
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds automatic image-stripping when converting messages for models that don’t support vision to avoid sending unsupported image blocks to OpenAI-shaped backends.

Changes:

  • Introduces stripImages option to convertMessages and threads it into content conversion helpers.
  • Adds modelSupportsVision() with a prefix-based allowlist and uses it to strip images based on request.resolvedModel.
  • Updates convertContentBlocks / convertToolResultContent to replace images with a text placeholder when stripping is enabled.

Comment on lines +380 to +396
const VISION_SUPPORTING_MODEL_PREFIXES = [
  'gpt-4', 'gpt-4o', 'gpt-4v',
  'gemini', 'gemini-',
  'claude-3', 'claude-3.',
  'kimi', 'kimi-',
  'qwen-vl', 'qwen2-vl',
  'grok-vision',
  'minimax-vl',
  'xiaomi-mimo-vl',
]

function modelSupportsVision(modelName: string): boolean {
  const normalized = modelName.toLowerCase().trim()
  return VISION_SUPPORTING_MODEL_PREFIXES.some(prefix =>
    normalized.startsWith(prefix.toLowerCase()),
  )
}

  treatAsLocal: isLocalProviderUrl(request.baseUrl),
})
const shimConfig = runtimeShimContext.openaiShimConfig
const stripImages = !modelSupportsVision(request.resolvedModel)
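
To make the prefix matching in the snippet above easier to reason about, here is a small self-contained sketch. The list and modelSupportsVision are copied from the diff. The redundantPrefixes helper and the sample model strings are hypothetical additions for illustration only and say nothing about what any real model supports.

```typescript
// Copied from the diff under review.
const VISION_SUPPORTING_MODEL_PREFIXES = [
  'gpt-4', 'gpt-4o', 'gpt-4v',
  'gemini', 'gemini-',
  'claude-3', 'claude-3.',
  'kimi', 'kimi-',
  'qwen-vl', 'qwen2-vl',
  'grok-vision',
  'minimax-vl',
  'xiaomi-mimo-vl',
]

function modelSupportsVision(modelName: string): boolean {
  const normalized = modelName.toLowerCase().trim()
  return VISION_SUPPORTING_MODEL_PREFIXES.some(prefix =>
    normalized.startsWith(prefix.toLowerCase()),
  )
}

// Hypothetical helper (not in the PR): list entries already covered by a
// shorter entry, since startsWith() on the shorter prefix matches them too.
function redundantPrefixes(prefixes: string[]): string[] {
  return prefixes.filter(p => prefixes.some(q => q !== p && p.startsWith(q)))
}

const redundant = redundantPrefixes(VISION_SUPPORTING_MODEL_PREFIXES)

// Sample strings chosen only to exercise the matcher.
const samples = ['deepseek-chat', ' GPT-4o-mini ', 'qwen2.5-vl-7b', 'claude-3-5-sonnet']
const decisions = samples.map(name => ({
  name,
  stripImages: !modelSupportsVision(name),
}))
```

Two properties follow directly from the code as written. First, any name that does not start with a listed prefix is treated as text-only, so unknown models default to having images stripped. Second, some entries are subsumed by shorter ones (for example 'gpt-4o' by 'gpt-4'), which the helper above surfaces.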
Comment on lines +380 to +389
const VISION_SUPPORTING_MODEL_PREFIXES = [
  'gpt-4', 'gpt-4o', 'gpt-4v',
  'gemini', 'gemini-',
  'claude-3', 'claude-3.',
  'kimi', 'kimi-',
  'qwen-vl', 'qwen2-vl',
  'grok-vision',
  'minimax-vl',
  'xiaomi-mimo-vl',
]
Development

Successfully merging this pull request may close these issues.

image_url content blocks crash DeepSeek requests — no fallback for text-only providers

2 participants