Skip to content

fix: raise ValueError for unsupported MIME types in file_data URI path#4

Open
Raman369AI wants to merge 3 commits intomainfrom
fix/file-uri-unknown-mime-type-error
Open

fix: raise ValueError for unsupported MIME types in file_data URI path#4
Raman369AI wants to merge 3 commits intomainfrom
fix/file-uri-unknown-mime-type-error

Conversation

@Raman369AI
Copy link
Copy Markdown
Owner

Relates to google#5022 and upstream PR google#5023

Problem

When a Part with file_data.file_uri has no determinable MIME type, the library fell back to application/octet-stream via _DEFAULT_MIME_TYPE. This value propagated to LiteLLM which raised a cryptic internal ValueError with no guidance for the user.

The same failure occurred when the caller explicitly set mime_type = "application/octet-stream" on the Part. Both cases reach the same failure point.

There was also an inconsistency between the two content paths:

  • The inline_data path raises ValueError immediately for unsupported MIME types
  • The file_data path silently used a fallback and failed later with a cryptic message

GcsArtifactService generates URIs like gs://bucket/artifact/0 with no extension and no MIME type, making ADK's own artifact system the primary trigger for this fallback.

Fix

Removes _DEFAULT_MIME_TYPE and raises ValueError early with an actionable message when the resolved MIME type is either unknown or application/octet-stream. This aligns the file_data path with the existing fail-fast behavior of the inline_data path.

The logic order is also corrected so providers that always produce a text fallback (anthropic, non-Gemini Vertex AI) and OpenAI/Azure HTTP media URLs are handled before the MIME type guard, keeping those paths unaffected.

Changes

  • src/google/adk/models/lite_llm.py: remove _DEFAULT_MIME_TYPE, restructure file_uri handling block, raise ValueError for missing or generic MIME types
  • tests/unittests/models/test_litellm.py: update two existing tests to assert the new ValueError, add one new test covering explicit application/octet-stream

Testing

pytest tests/unittests/models/test_litellm.py
241 passed, 5 errors (pre-existing, missing pytest-mock fixture)

Format verified with pyink. mypy comm -13 simulation: zero new errors.

The file_data.file_uri path silently fell back to application/octet-stream
when no MIME type could be determined, then passed it to LiteLLM which raised
a cryptic internal ValueError. The inline_data path already had fail-fast
behavior for unsupported types but the file_data path did not.

This change removes the _DEFAULT_MIME_TYPE fallback and raises ValueError early
with an actionable message for two cases: when no MIME type can be determined
from the URI, display_name, or explicit field, and when the resolved type is
application/octet-stream regardless of whether it was set by the caller or
arrived via a library default. Both cases cause the same downstream failure.

The logic order is also corrected so that providers which always produce a
text fallback (anthropic, non-Gemini Vertex AI) and OpenAI/Azure HTTP media
URLs are handled before the MIME type guard, keeping those paths unaffected.

Tests are updated to assert the new ValueError and a new test covers the
explicit application/octet-stream case.
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the ADK to version 1.28.0, introducing significant security, concurrency, and stability enhancements. Key improvements include origin-check middleware for HTTP and WebSockets, strict agent name validation to prevent arbitrary imports, and the restriction of builder endpoints to web-enabled environments. The update also resolves concurrency races in session state management, adds regional support for Discovery Engine, ensures correct Pydantic required field parsing, provides consistent function call ID generation, and pins the LiteLLM dependency to a secure version. Feedback was provided regarding the risk of silent data loss when configuring models to ignore extra fields, suggesting a more explicit approach to unknown data.


model_config = ConfigDict(
extra='forbid',
extra='ignore',
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Changing extra='forbid' to extra='ignore' in model_config can lead to silent data loss if external data sources introduce new fields that are not explicitly handled by the model. While it can improve compatibility, it's important to be aware of the trade-offs. Consider if extra='allow' or a more explicit handling of unknown fields would be more appropriate depending on the expected data evolution.

Raman369AI and others added 2 commits April 24, 2026 23:57
Adds test_content_to_message_param_user_message_file_uri_explicit_octet_stream
to confirm that an upstream caller passing mime_type='application/octet-stream'
raises a clear ValueError, covering both branches of the combined guard.

Fixes: google#5022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant