Add file size validation to prevent oversized uploads #97
Darkshadow0409 wants to merge 2 commits into indictechcom:master
I added file size validation to reject oversized uploads and reduce unnecessary resource usage. This improves the performance and overall reliability of the upload process.
Pull request overview
Adds server-side validation to reject oversized downloaded files before they are processed/uploaded, aiming to reduce wasted work and improve reliability of the upload flow.
Changes:
- Added a 100MB max file-size check after download, deleting the temp file and returning a validation error when exceeded.
- Reshaped the /api/task_status/<task_id> JSON response into a { success, state, data, errors } structure.
  response = {
+     "success": True,
      "task_id": task_id,
-     "status": task.status,
-     "result": task.result if task.successful() else None,
- }
+     "state": task.status,
+     "data": {},
+     "errors": []
+ }

- # If task failed, include error information
- if task.failed():
-     response["error"] = str(task.result)
+ if task.successful():
+     response["data"] = task.result
+
+ elif task.failed():
+     response["success"] = False
+     response["errors"] = [str(task.result)]
The /api/task_status/<task_id> response schema was changed from { task_id, status, result, error } to { success, task_id, state, data, errors }, but the frontend poller still destructures status, result, and error from this endpoint (see frontend/src/pages/Upload.js around its polling logic). This will break async uploads because the UI will never see SUCCESS/FAILURE or the task result.
To avoid a breaking change, either keep the original keys (or return both old and new keys during a transition) and/or update the frontend in the same PR to consume the new schema.
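One way the transition option could look is sketched below: a payload builder that emits the new keys alongside the legacy ones, so the existing poller keeps working while the frontend migrates. The helper name build_task_status and its parameters are hypothetical and not part of this PR; the "SUCCESS" and "FAILURE" state strings are the ones referred to above.

```python
from typing import Any, Dict, Optional


def build_task_status(task_id: str, status: str, result: Any = None,
                      error: Optional[str] = None) -> Dict[str, Any]:
    """Build a task-status payload carrying both the legacy and the new keys.

    Hypothetical helper for illustration; not code from this PR.
    """
    succeeded = status == "SUCCESS"
    failed = status == "FAILURE"

    payload: Dict[str, Any] = {
        # New schema introduced by this PR.
        "success": not failed,
        "state": status,
        "data": result if succeeded else {},
        "errors": [error] if failed and error else [],
        # Legacy keys that the frontend poller still destructures.
        "task_id": task_id,
        "status": status,
        "result": result if succeeded else None,
    }
    # The original handler only set "error" when the task had failed.
    if failed:
        payload["error"] = error
    return payload
```

In the handler this would presumably be called with task.status, task.result, and str(task.result) as the error when task.failed() is true. The legacy keys can be dropped once the frontend reads only the new schema.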
+ # Add max file size validation (100MB limit)
+ MAX_FILE_SIZE = 100 * 1024 * 1024  # 100 MB
+
+ if file_size > MAX_FILE_SIZE:
+     if os.path.exists(file_path):
+         os.remove(file_path)
+
+     return jsonify({
+         "success": False,
+         "data": {},
+         "errors": ["File size exceeds maximum allowed limit (100MB)"]
+     }), 400
The max-size check happens only after download_image() completes and the file is written locally. download_image() currently fetches the entire remote file into memory via r.content before writing it, so an oversized file can still cause high memory/disk usage (the main resource-risk this PR is trying to mitigate) even though later processing is blocked.
Consider enforcing the size limit during the download (e.g., streaming the response in chunks and aborting once the limit is exceeded, or rejecting early based on a Content-Length header when present).
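A minimal sketch of enforcing the limit during the download, assuming the response body can be consumed as an iterable of byte chunks. The names save_with_limit and FileTooLargeError are hypothetical; the function is written against a plain iterable so it does not depend on how download_image() is actually implemented.

```python
import os
from typing import Iterable, Optional

MAX_FILE_SIZE = 100 * 1024 * 1024  # 100 MB, the limit used in this PR


class FileTooLargeError(Exception):
    """Raised when a download exceeds the limit (hypothetical name)."""


def save_with_limit(chunks: Iterable[bytes], dest_path: str,
                    declared_length: Optional[int] = None,
                    max_bytes: int = MAX_FILE_SIZE) -> int:
    """Write chunks to dest_path, aborting once max_bytes is exceeded.

    Returns the number of bytes written on success.
    """
    # Reject early when the server declared a size above the limit.
    if declared_length is not None and declared_length > max_bytes:
        raise FileTooLargeError(f"declared size {declared_length} > {max_bytes}")

    written = 0
    try:
        with open(dest_path, "wb") as fh:
            for chunk in chunks:
                written += len(chunk)
                # Content-Length may be absent or wrong, so count bytes too.
                if written > max_bytes:
                    raise FileTooLargeError(f"download exceeded {max_bytes} bytes")
                fh.write(chunk)
    except FileTooLargeError:
        # The file is closed by now; remove the partial download.
        if os.path.exists(dest_path):
            os.remove(dest_path)
        raise
    return written
```

If download_image() uses the requests library (r.content suggests it does, but that is an assumption), the chunks could come from requests.get(url, stream=True) followed by r.iter_content(chunk_size=8192), with declared_length parsed from r.headers.get("Content-Length") when present. That way at most one chunk beyond the limit is held in memory, and nothing oversized stays on disk.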
+ # Add max file size validation (100MB limit)
+ MAX_FILE_SIZE = 100 * 1024 * 1024  # 100 MB
MAX_FILE_SIZE is defined inside the request handler, which makes it harder to reuse/configure consistently (e.g., alongside the existing 50MB sync/async threshold) and requires editing code to change the limit. Prefer defining this as a module-level constant or in config.yaml / app.config, and referencing it from here.
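As a rough illustration, the limit could live in a module-level default that a configuration mapping may override. The helper get_max_file_size and the "MAX_FILE_SIZE" config key are assumptions for this sketch, not existing names in the project.

```python
from typing import Any, Mapping

# Module-level default, used when the configuration does not override it.
DEFAULT_MAX_FILE_SIZE = 100 * 1024 * 1024  # 100 MB


def get_max_file_size(config: Mapping[str, Any]) -> int:
    """Return the upload size limit in bytes from a config mapping.

    The "MAX_FILE_SIZE" key is a hypothetical setting name.
    """
    size = int(config.get("MAX_FILE_SIZE", DEFAULT_MAX_FILE_SIZE))
    if size <= 0:
        raise ValueError("MAX_FILE_SIZE must be a positive number of bytes")
    return size
```

Because Flask's app.config is a dict subclass, the handler could call get_max_file_size(app.config) and compare file_size against the result, so changing the limit would not require a code edit.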
+         "success": False,
+         "data": {},
+         "errors": ["File size exceeds maximum allowed limit (100MB)"]
+     }), 400
For an oversized upload, HTTP 413 Payload Too Large is a more specific status code than 400 Bad Request, and some clients handle it specially. Consider returning 413 here while keeping the same JSON error body.
-     }), 400
+     }), 413
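For reference, a small sketch of building the same error body together with the 413 status from the standard library's named constant. The helper oversized_file_response is hypothetical; in the handler, the body would still be wrapped with jsonify as in the diff.

```python
from http import HTTPStatus
from typing import Any, Dict, Tuple


def oversized_file_response(limit_bytes: int) -> Tuple[Dict[str, Any], int]:
    """Return the (body, status) pair for a rejected oversized file.

    Hypothetical helper; the body mirrors the JSON shape used in this PR.
    """
    limit_mb = limit_bytes // (1024 * 1024)
    body = {
        "success": False,
        "data": {},
        "errors": [f"File size exceeds maximum allowed limit ({limit_mb}MB)"],
    }
    # HTTPStatus.REQUEST_ENTITY_TOO_LARGE is the stdlib name for 413.
    return body, int(HTTPStatus.REQUEST_ENTITY_TOO_LARGE)
```

Deriving the message from the limit keeps the text in step with the configured value instead of hard-coding "100MB" in two places.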
Problem
Currently, there is no upper limit on file size before processing uploads, which can lead to unnecessary resource usage and performance issues.
Solution
This PR introduces a maximum file size validation (100MB) to prevent oversized uploads.
Changes made:
Benefits