1.9.x to main #140

Merged
vjbytes102 merged 3 commits into main from 1.9.x
May 2, 2026

Conversation

@akashbrklynhlth akashbrklynhlth commented May 2, 2026

Description

[Describe the purpose and scope of this pull request.]

Changes Made

[List the changes made in this pull request.]

Checklist

  • followed the coding style guidelines.
  • tested all changes.
  • updated the documentation to reflect these changes.
  • assigned reviewers to this pull request.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed content type handling for Parquet file uploads. The system now correctly identifies Parquet files with the appropriate MIME type instead of using a generic binary classification, ensuring better compatibility with file processing and storage systems when uploading via presigned URLs.
  • Chores

    • Version bumped to 1.9.6.

coderabbitai Bot commented May 2, 2026

Walkthrough

Version bumped to 1.9.6 and explicit MIME type registration for .parquet files (application/vnd.apache.parquet) added to ensure correct content type inference when uploading parquet files via presigned URLs.

Changes

Version and MIME Type Registration

| Layer / File(s) | Summary |
|---|---|
| Version Update: `willisapi_client/__version__.py` | Version constant updated from "1.9.5" to "1.9.6". |
| MIME Type Registration: `willisapi_client/services/metadata/upload.py` | Registered .parquet → application/vnd.apache.parquet in Python's mimetypes module at module import time to ensure mimetypes.guess_type() returns the correct content type for parquet file uploads instead of falling back to application/octet-stream. |
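
The registration described above can be sketched roughly as follows. This is a minimal illustration using only the standard-library mimetypes module, not the actual contents of upload.py; the helper name `guess_content_type` and the example file paths are assumptions made for the sketch.

```python
import mimetypes

# Register the Parquet MIME type once, for example at module import time.
mimetypes.add_type("application/vnd.apache.parquet", ".parquet")


def guess_content_type(file_path: str) -> str:
    """Hypothetical helper: inferred MIME type, or a generic binary type."""
    content_type, _encoding = mimetypes.guess_type(file_path)
    # guess_type() returns None for unknown extensions, so fall back explicitly.
    return content_type or "application/octet-stream"


print(guess_content_type("recordings/session.parquet"))
print(guess_content_type("recordings/session.zzunknown"))
```

Because mimetypes keeps a process-wide table, registering at import time means every later guess_type() call in the same process sees the Parquet mapping.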

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

  • support for parquet #139: Also modifies willisapi_client/services/metadata/upload.py to register the .parquet MIME type; likely a duplicate or related implementation of the same fix.

Suggested reviewers

  • AbhishekAligh
  • vjbytes102

Poem

📦 A version bump and MIME type song—
Parquet files now know right from wrong.
No more octet-stream, the path is clear,
Content types precise, celebration here! 🎉

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The pull request description contains only the template placeholders with no actual content filled in; all required sections remain empty. | Fill in the Description and Changes Made sections with specific details about the parquet MIME type registration and version update, and complete the checklist items. |
| Title check | ❓ Inconclusive | The title '1.9.x to main' is vague and does not clearly convey what the actual changes accomplish; it only describes the branch merge direction rather than the substantive changes. | Revise the title to reflect the main changes, such as 'Add Parquet MIME type support and bump to version 1.9.6', to clearly communicate the actual work being merged. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@akashbrklynhlth akashbrklynhlth self-assigned this May 2, 2026

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
willisapi_client/services/metadata/upload.py (1)

58-75: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

requests.put called without a timeout — process can hang indefinitely.

Both the upload (line 65) and processed_upload (line 182) paths issue an unbounded requests.put to the S3 presigned URL. If S3 is slow or unresponsive, the calling process blocks forever with no way to recover or report an error.

Add an explicit timeout (connect + read) to both call sites:

⏱️ Proposed fix — add timeout to both S3 PUT calls
# upload() — lines ~65-75
                         response = requests.put(
                             presigned,
                             data=f,
                             headers={
                                 "x-amz-checksum-sha256": payload.get("checksum"),
                                 "x-amz-sdk-checksum-algorithm": "SHA256",
                                 "Content-Type": content_type,
                             },
+                            timeout=(10, 300),  # (connect_timeout, read_timeout) in seconds
                         )
# processed_upload() — lines ~182-190
                             response = requests.put(
                                 presigned,
                                 data=f,
                                 headers={
                                     "x-amz-checksum-sha256": checksum,
                                     "x-amz-sdk-checksum-algorithm": "SHA256",
                                     "Content-Type": content_type,
                                 },
+                                timeout=(10, 300),
                             )

Tune the read timeout (300s above) to reflect your expected maximum upload size. The connect timeout (10s) should be conservative.
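
One way to keep the timeout consistent across both call sites is to build the PUT arguments in one place. The sketch below is an assumption-laden illustration, not code from the repository: `S3_PUT_TIMEOUT` and `build_put_kwargs` are hypothetical names, and the block deliberately avoids importing requests so it runs with the standard library alone.

```python
# Shared setting so both presigned PUT call sites use one timeout policy.
# The values mirror the suggestion above; tune them for your uploads.
S3_PUT_TIMEOUT = (10, 300)  # (connect_timeout, read_timeout) in seconds


def build_put_kwargs(checksum: str, content_type: str,
                     timeout: tuple = S3_PUT_TIMEOUT) -> dict:
    """Hypothetical helper: keyword arguments for a presigned S3 PUT."""
    return {
        "headers": {
            "x-amz-checksum-sha256": checksum,
            "x-amz-sdk-checksum-algorithm": "SHA256",
            "Content-Type": content_type,
        },
        "timeout": timeout,
    }


kwargs = build_put_kwargs("example-checksum", "application/vnd.apache.parquet")
print(kwargs["timeout"])
# At a call site (requires the third-party requests package):
#     response = requests.put(presigned, data=f, **kwargs)
# A stalled connection then raises requests.exceptions.Timeout
# instead of blocking indefinitely.
```

Catching requests.exceptions.Timeout around the call would let the existing error handling report the failure rather than hang.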

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@willisapi_client/services/metadata/upload.py` around lines 58 - 75, The
requests.put calls that upload files to the presigned S3 URL currently have no
timeout and can hang; update both call sites (the PUT in the upload flow around
the block that computes content_type and opens row.file_path, and the PUT in the
processed_upload path) to pass an explicit timeout tuple (e.g. timeout=(10,
300)) to requests.put so you get a connect+read timeout and can handle/report
failures via the existing response/error handling; ensure the same timeout
strategy is used for both presigned upload calls (referencing the presigned
variable and the response variable in each location).
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ea2d9d1b-c667-4999-8a22-283b685e29d0

📥 Commits

Reviewing files that changed from the base of the PR and between f15e1ab and 8eb7aff.

📒 Files selected for processing (2)
  • willisapi_client/__version__.py
  • willisapi_client/services/metadata/upload.py

@vjbytes102 vjbytes102 changed the title from "1.9.x" to "1.9.x to main" May 2, 2026
@vjbytes102 vjbytes102 merged commit 3740f56 into main May 2, 2026
3 checks passed
3 participants