fix: strip image blocks for text-only providers to prevent DeepSeek crash (#1382) #1417
Open
adityachaudhary99 wants to merge 2 commits into
Add vision-support detection to the OpenAI-compatible API shim. When the resolved model does not support vision (e.g. DeepSeek, Mistral, Llama), strip image_url content blocks and replace each one with an [Image] placeholder before sending the request to the API. This prevents a 400 error that corrupts the session history, since the failed message is re-sent on every retry. Implements option 2 from the issue: detect a text-only provider and strip the image, leaving a warning placeholder.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds automatic image-stripping when converting messages for models that don’t support vision to avoid sending unsupported image blocks to OpenAI-shaped backends.
Changes:
- Introduces a stripImages option to convertMessages and threads it into the content conversion helpers.
- Adds modelSupportsVision() with a prefix-based allowlist and uses it to strip images based on request.resolvedModel.
- Updates convertContentBlocks / convertToolResultContent to replace images with a text placeholder when stripping is enabled.
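The option threading described in the list above might look roughly like the following. This is a minimal sketch, not the PR's code: the block and message types, the `Sketch` function names, and the data-URL encoding of images are all assumptions made for illustration; only the stripImages option name and the [Image] placeholder come from the PR description.

```typescript
// Hypothetical input block shapes; the real types in the PR may differ.
type Block =
  | { type: 'text'; text: string }
  | { type: 'image'; source: { media_type: string; data: string } }

// Content parts in the OpenAI chat-completions shape.
type OpenAIPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } }

interface Message { role: 'user' | 'assistant'; content: Block[] }
interface ConvertOptions { stripImages?: boolean }

// Converts one message's blocks; images become a placeholder when stripping.
function convertContentBlocksSketch(blocks: Block[], stripImages: boolean): OpenAIPart[] {
  return blocks.map((block): OpenAIPart => {
    if (block.type === 'text') return { type: 'text', text: block.text }
    if (stripImages) return { type: 'text', text: '[Image]' }
    // Assumption: images are forwarded as base64 data URLs when the model supports vision.
    const url = `data:${block.source.media_type};base64,${block.source.data}`
    return { type: 'image_url', image_url: { url } }
  })
}

// Reads the option once and passes it down to the per-message helper.
function convertMessagesSketch(messages: Message[], options: ConvertOptions = {}) {
  const stripImages = options.stripImages ?? false
  return messages.map(message => ({
    role: message.role,
    content: convertContentBlocksSketch(message.content, stripImages),
  }))
}

const history: Message[] = [{
  role: 'user',
  content: [
    { type: 'text', text: 'What is in this screenshot?' },
    { type: 'image', source: { media_type: 'image/png', data: 'AAAA' } },
  ],
}]

const forTextOnlyModel = convertMessagesSketch(history, { stripImages: true })
const forVisionModel = convertMessagesSketch(history)
console.log(JSON.stringify(forTextOnlyModel))
```

Defaulting stripImages to false keeps existing callers unchanged, so only the shim call site that knows the resolved model has to opt in.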
Comment on lines +380 to +396
const VISION_SUPPORTING_MODEL_PREFIXES = [
  'gpt-4', 'gpt-4o', 'gpt-4v',
  'gemini', 'gemini-',
  'claude-3', 'claude-3.',
  'kimi', 'kimi-',
  'qwen-vl', 'qwen2-vl',
  'grok-vision',
  'minimax-vl',
  'xiaomi-mimo-vl',
]

function modelSupportsVision(modelName: string): boolean {
  const normalized = modelName.toLowerCase().trim()
  return VISION_SUPPORTING_MODEL_PREFIXES.some(prefix =>
    normalized.startsWith(prefix.toLowerCase()),
  )
}

  treatAsLocal: isLocalProviderUrl(request.baseUrl),
})
const shimConfig = runtimeShimContext.openaiShimConfig
const stripImages = !modelSupportsVision(request.resolvedModel)
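To see how a prefix allowlist of this kind behaves on concrete model names, here is a small self-contained sketch. The shortened prefix list and the `Sketch` names are illustrative assumptions, and the sample model names are chosen only to exercise the matching logic.

```typescript
// Illustrative subset of the allowlist; not the PR's full list.
const VISION_PREFIXES_SKETCH = ['gpt-4', 'gemini', 'claude-3', 'qwen-vl']

// Same matching strategy as in the diff: lowercase, trim, then prefix match.
function supportsVisionSketch(modelName: string): boolean {
  const normalized = modelName.toLowerCase().trim()
  return VISION_PREFIXES_SKETCH.some(prefix => normalized.startsWith(prefix))
}

const samples = ['gpt-4o-mini', '  GPT-4-Turbo ', 'deepseek-chat', 'openai/gpt-4o']
for (const name of samples) {
  // stripImages is simply the negation of the vision check.
  const stripImages = !supportsVisionSketch(name)
  console.log(JSON.stringify(name), 'stripImages =', stripImages)
}
```

Two properties follow directly from startsWith: a shorter prefix such as 'gpt-4' already covers longer names like 'gpt-4o-mini', and a name carrying a provider prefix such as 'openai/gpt-4o' does not match, so it would be treated as text-only under this scheme.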
Fixes #1382
DeepSeek and other text-only providers (Mistral, Llama, etc.) crash when they receive image_url content blocks from multimodal conversation history.
Changes