Improve backend error handling and input validation in upload API#94
Improve backend error handling and input validation in upload API#94Darkshadow0409 wants to merge 1 commit intoindictechcom:masterfrom
Conversation
…irco Task T415715
|
I have worked on improving basic error handling in the upload API by adding input validation and preventing unsafe assumptions in the code. I focused on keeping the changes minimal and aligned with the microtask scope, while improving reliability and consistency of API responses. |
There was a problem hiding this comment.
Pull request overview
This PR improves robustness of the /api/upload Flask endpoint by adding input validation and introducing a small helper to return structured JSON error responses instead of crashing on malformed input.
Changes:
- Added an
error_response()helper to standardize JSON error payloads. - Added basic validation for missing/invalid
srcUrland missingtrfilenamein/api/upload. - Added URL regex-match guarding to prevent unsafe indexing when parsing source project/lang.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| data = request.get_json() | ||
| src_url = urllib.parse.unquote(data.get('srcUrl')) |
There was a problem hiding this comment.
request.get_json() can return None (invalid/missing JSON), and urllib.parse.unquote(data.get('srcUrl')) will raise when data is None or when srcUrl is missing (because unquote(None) errors). Validate data is a dict and that srcUrl exists before calling unquote, or pass a safe default / use get_json(silent=True) and handle None with a structured error response.
| tr_project = data.get('trproject') | ||
| tr_lang = data.get('trlang') | ||
| tr_filename = data.get('trfilename') | ||
|
|
||
| #adding safe data access | ||
| if not tr_filename: | ||
| return error_response("Missing target filename", 400) | ||
| tr_filename = urllib.parse.unquote(tr_filename) | ||
| tr_endpoint = "https://" + tr_lang + "." + tr_project + ".org/w/api.php" | ||
|
|
There was a problem hiding this comment.
tr_lang and tr_project are used to build tr_endpoint via string concatenation, but they may be None if the request omits them, causing a TypeError before the later None not in (...) guard runs. Add explicit validation for required target fields (e.g., trlang, trproject) before constructing tr_endpoint, and return a consistent JSON error when missing/invalid.
| return jsonify({ | ||
| "success": False, | ||
| "data": {}, | ||
| "errors": [message] |
There was a problem hiding this comment.
New helper returns payload with an errors key, but other endpoints in this file sometimes return error (singular). If the goal is standardized error responses, align on one key across the API (or document/encode backward compatibility) to avoid confusing clients.
| "errors": [message] | |
| "error": message |
| #addinig url validation | ||
| if not match: |
There was a problem hiding this comment.
Typos in newly added comments: "addinig" should be "adding" (and consider capitalizing comments for readability).
Related Task
https://phabricator.wikimedia.org/T415715
Problem
The current backend error handling in the /api/upload endpoint is limited and assumes valid input in several places. This can lead to crashes (e.g., IndexError when regex does not match) and inconsistent error responses.
Solution
This PR improves error handling by adding basic input validation and preventing unsafe operations.
Changes made:
Benefits
Note
This PR focuses on minimal and safe improvements aligned with microtask requirements, without introducing major refactoring.