Add file size validation to prevent oversized uploads #97

Open
Darkshadow0409 wants to merge 2 commits into indictechcom:master from Darkshadow0409:pr4-file-validation
Conversation

@Darkshadow0409

Problem

Uploads are currently processed with no upper limit on file size, so a very large file can consume resources unnecessarily and degrade performance.

Solution

This PR introduces a maximum file size validation (100MB) to prevent oversized uploads.

Changes made:

  • Added file size validation after download
  • Defined a maximum allowed size (100MB)
  • Automatically removes temporary files if size exceeds limit
  • Returns a structured error response

Benefits

  • Prevents unnecessary resource consumption
  • Improves system performance and stability
  • Enhances user feedback with clear validation errors

Copilot AI review requested due to automatic review settings March 31, 2026 17:44
@Darkshadow0409
Author

I added file size validation to prevent handling oversized uploads and reduce unnecessary resource usage.

This ensures better performance and improves overall reliability of the upload process.

Copilot AI left a comment

Pull request overview

Adds server-side validation to reject oversized downloaded files before they are processed/uploaded, aiming to reduce wasted work and improve reliability of the upload flow.

Changes:

  • Added a 100MB max file-size check after download, deleting the temp file and returning a validation error when exceeded.
  • Reshaped /api/task_status/<task_id> JSON response into a { success, state, data, errors } structure.

Comment thread app.py
Comment on lines 339 to +352
  response = {
+     "success": True,
      "task_id": task_id,
-     "status": task.status,
-     "result": task.result if task.successful() else None,
- }
+     "state": task.status,
+     "data": {},
+     "errors": []
+ }

- # If task failed, include error information
- if task.failed():
-     response["error"] = str(task.result)
+ if task.successful():
+     response["data"] = task.result
+
+ elif task.failed():
+     response["success"] = False
+     response["errors"] = [str(task.result)]
Copilot AI Mar 31, 2026

The /api/task_status/<task_id> response schema was changed from { task_id, status, result, error } to { success, task_id, state, data, errors }, but the frontend poller still destructures status, result, and error from this endpoint (see frontend/src/pages/Upload.js around its polling logic). This will break async uploads because the UI will never see SUCCESS/FAILURE or the task result.

To avoid a breaking change, either keep the original keys (or return both old and new keys during a transition) and/or update the frontend in the same PR to consume the new schema.
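
One way to handle the transition described here is to emit both key sets from the same handler until the frontend is updated. The sketch below is only an illustration: `build_task_status` is a hypothetical helper that does not appear in the PR, and it assumes the task object exposes `status`, `result`, `successful()`, and `failed()` as in the diff under review. The `SimpleNamespace` objects are stand-ins for real task results.

```python
from types import SimpleNamespace


def build_task_status(task_id, task):
    """Return a status payload carrying both the old and the new keys.

    Hypothetical helper, not part of the PR. `task` is assumed to expose
    `status`, `result`, `successful()` and `failed()`.
    """
    succeeded = task.successful()
    failed = task.failed()

    response = {
        # New schema introduced by this PR
        "success": not failed,
        "task_id": task_id,
        "state": task.status,
        "data": task.result if succeeded else {},
        "errors": [str(task.result)] if failed else [],
        # Old schema, kept so the existing frontend poller keeps working
        "status": task.status,
        "result": task.result if succeeded else None,
    }
    if failed:
        response["error"] = str(task.result)
    return response


# Stand-in task objects, for illustration only
done = SimpleNamespace(status="SUCCESS", result={"url": "x"},
                       successful=lambda: True, failed=lambda: False)
broken = SimpleNamespace(status="FAILURE", result=ValueError("boom"),
                         successful=lambda: False, failed=lambda: True)
```

Once the frontend reads `state`, `data`, and `errors`, the old keys could be dropped in a follow-up change.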

Comment thread app.py
Comment on lines +93 to +104
# Add max file size validation (100MB limit)
MAX_FILE_SIZE = 100 * 1024 * 1024  # 100 MB

if file_size > MAX_FILE_SIZE:
    if os.path.exists(file_path):
        os.remove(file_path)

    return jsonify({
        "success": False,
        "data": {},
        "errors": ["File size exceeds maximum allowed limit (100MB)"]
    }), 400
Copilot AI Mar 31, 2026

The max-size check happens only after download_image() completes and the file is written locally. download_image() currently fetches the entire remote file into memory via r.content before writing it, so an oversized file can still cause high memory/disk usage (the main resource-risk this PR is trying to mitigate) even though later processing is blocked.

Consider enforcing the size limit during the download (e.g., streaming the response in chunks and aborting once the limit is exceeded, or rejecting early based on a Content-Length header when present).
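
A minimal sketch of both ideas follows, under stated assumptions. The names `FileTooLargeError`, `content_length_exceeds`, and `save_with_limit` are hypothetical and not part of the PR or of `download_image()`. The writer takes any iterable of byte chunks; with the `requests` library that could be `response.iter_content(chunk_size=8192)` on a response fetched with `stream=True`, but how `download_image()` would actually be changed is left open here.

```python
import os


class FileTooLargeError(Exception):
    """Raised when a download exceeds the configured size limit."""


def content_length_exceeds(headers, max_bytes):
    """Early check: True only if a parseable Content-Length is over the limit."""
    raw = headers.get("Content-Length")
    if raw is None:
        return False  # header absent; fall back to counting bytes
    try:
        return int(raw) > max_bytes
    except ValueError:
        return False  # malformed header; fall back to counting bytes


def save_with_limit(chunks, dest_path, max_bytes):
    """Write chunks to dest_path, aborting once max_bytes is exceeded.

    `chunks` is any iterable of bytes objects, so only one chunk is held
    in memory at a time instead of the whole response body.
    """
    written = 0
    try:
        with open(dest_path, "wb") as fh:
            for chunk in chunks:
                written += len(chunk)
                if written > max_bytes:
                    raise FileTooLargeError(
                        f"download exceeded {max_bytes} bytes"
                    )
                fh.write(chunk)
    except FileTooLargeError:
        # Remove the partial file so nothing oversized is left on disk
        if os.path.exists(dest_path):
            os.remove(dest_path)
        raise
    return written
```

The header check is only a shortcut: `Content-Length` can be missing or wrong, so the byte count during streaming remains the authoritative limit.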

Comment thread app.py
Comment on lines +93 to +95
# Add max file size validation (100MB limit)
MAX_FILE_SIZE = 100 * 1024 * 1024 # 100 MB

Copilot AI Mar 31, 2026

MAX_FILE_SIZE is defined inside the request handler, which makes it harder to reuse/configure consistently (e.g., alongside the existing 50MB sync/async threshold) and requires editing code to change the limit. Prefer defining this as a module-level constant or in config.yaml / app.config, and referencing it from here.
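
As a rough sketch of this suggestion, the limit could live at module level with an optional override. Everything named here is an assumption for illustration: `DEFAULT_MAX_FILE_SIZE`, `get_max_file_size`, and the `MAX_FILE_SIZE` config/environment key are hypothetical, and the `config` argument stands in for a mapping such as Flask's `app.config`.

```python
import os

# Default upper bound for downloaded files, in bytes (100 MB as in this PR)
DEFAULT_MAX_FILE_SIZE = 100 * 1024 * 1024


def get_max_file_size(config, environ=os.environ):
    """Resolve the limit from config, then the environment, then the default.

    `config` is any mapping, e.g. Flask's `app.config`. The key name
    MAX_FILE_SIZE is a hypothetical choice for this sketch.
    """
    if "MAX_FILE_SIZE" in config:
        return int(config["MAX_FILE_SIZE"])
    if "MAX_FILE_SIZE" in environ:
        return int(environ["MAX_FILE_SIZE"])
    return DEFAULT_MAX_FILE_SIZE
```

The request handler would then compare `file_size` against `get_max_file_size(app.config)` rather than a literal defined inline, which also makes it easier to keep the limit consistent with the existing 50MB sync/async threshold.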

Comment thread app.py
"success": False,
"data": {},
"errors": ["File size exceeds maximum allowed limit (100MB)"]
}), 400
Copilot AI Mar 31, 2026

For an oversized upload, HTTP 413 Payload Too Large is a more specific status code than 400 Bad Request, and some clients handle it specially. Consider returning 413 here while keeping the same JSON error body.

Suggested change
- }), 400
+ }), 413
