fix: raise ValueError for unsupported MIME types in file_data URI path#4
fix: raise ValueError for unsupported MIME types in file_data URI path#4Raman369AI wants to merge 3 commits intomainfrom
Conversation
The file_data.file_uri path silently fell back to application/octet-stream when no MIME type could be determined, then passed it to LiteLLM which raised a cryptic internal ValueError. The inline_data path already had fail-fast behavior for unsupported types but the file_data path did not. This change removes the _DEFAULT_MIME_TYPE fallback and raises ValueError early with an actionable message for two cases: when no MIME type can be determined from the URI, display_name, or explicit field, and when the resolved type is application/octet-stream regardless of whether it was set by the caller or arrived via a library default. Both cases cause the same downstream failure. The logic order is also corrected so that providers which always produce a text fallback (anthropic, non-Gemini Vertex AI) and OpenAI/Azure HTTP media URLs are handled before the MIME type guard, keeping those paths unaffected. Tests are updated to assert the new ValueError and a new test covers the explicit application/octet-stream case.
There was a problem hiding this comment.
Code Review
This pull request updates the ADK to version 1.28.0, introducing significant security, concurrency, and stability enhancements. Key improvements include origin-check middleware for HTTP and WebSockets, strict agent name validation to prevent arbitrary imports, and the restriction of builder endpoints to web-enabled environments. The update also resolves concurrency races in session state management, adds regional support for Discovery Engine, ensures correct Pydantic required field parsing, provides consistent function call ID generation, and pins the LiteLLM dependency to a secure version. Feedback was provided regarding the risk of silent data loss when configuring models to ignore extra fields, suggesting a more explicit approach to unknown data.
|
|
||
| model_config = ConfigDict( | ||
| extra='forbid', | ||
| extra='ignore', |
There was a problem hiding this comment.
Changing extra='forbid' to extra='ignore' in model_config can lead to silent data loss if external data sources introduce new fields that are not explicitly handled by the model. While it can improve compatibility, it's important to be aware of the trade-offs. Consider if extra='allow' or a more explicit handling of unknown fields would be more appropriate depending on the expected data evolution.
Adds test_content_to_message_param_user_message_file_uri_explicit_octet_stream to confirm that an upstream caller passing mime_type='application/octet-stream' raises a clear ValueError, covering both branches of the combined guard. Fixes: google#5022
Relates to google#5022 and upstream PR google#5023
Problem
When a
Partwithfile_data.file_urihas no determinable MIME type, the library fell back toapplication/octet-streamvia_DEFAULT_MIME_TYPE. This value propagated to LiteLLM which raised a cryptic internalValueErrorwith no guidance for the user.The same failure occurred when the caller explicitly set
mime_type = "application/octet-stream"on the Part. Both cases reach the same failure point.There was also an inconsistency between the two content paths:
inline_datapath raisesValueErrorimmediately for unsupported MIME typesfile_datapath silently used a fallback and failed later with a cryptic messageGcsArtifactServicegenerates URIs likegs://bucket/artifact/0with no extension and no MIME type, making ADK's own artifact system the primary trigger for this fallback.Fix
Removes
_DEFAULT_MIME_TYPEand raisesValueErrorearly with an actionable message when the resolved MIME type is either unknown orapplication/octet-stream. This aligns thefile_datapath with the existing fail-fast behavior of theinline_datapath.The logic order is also corrected so providers that always produce a text fallback (anthropic, non-Gemini Vertex AI) and OpenAI/Azure HTTP media URLs are handled before the MIME type guard, keeping those paths unaffected.
Changes
src/google/adk/models/lite_llm.py: remove_DEFAULT_MIME_TYPE, restructure file_uri handling block, raiseValueErrorfor missing or generic MIME typestests/unittests/models/test_litellm.py: update two existing tests to assert the newValueError, add one new test covering explicitapplication/octet-streamTesting
Format verified with
pyink. mypycomm -13simulation: zero new errors.