
feat(api): add batch document upload endpoint #665

Merged
param20h merged 2 commits into param20h:dev from nancysangani:feature/batch-document-upload
Jun 23, 2026

@nancysangani

Contributor

Closes #435

📝 What does this PR do?

Adds POST /api/v1/documents/upload/batch so users can upload multiple files in a single request. Each file is validated, saved, and dispatched to Celery independently — failures for individual files are recorded in a failed[] list without blocking the rest of the batch.
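The per-file isolation described above can be sketched roughly as follows. This is an illustration, not the code from this PR: `process_batch`, `validate_extension`, and `save_and_dispatch` are hypothetical names, and the extension allow-list is an assumption.

```python
from typing import Callable, Iterable


def process_batch(
    filenames: Iterable[str],
    validate: Callable[[str], None],
    save_and_dispatch: Callable[[str], str],
) -> tuple[list[str], list[dict[str, str]]]:
    """Handle each file on its own; one failure never aborts the batch."""
    accepted: list[str] = []
    failed: list[dict[str, str]] = []
    for name in filenames:
        try:
            validate(name)  # raises on a bad file
            accepted.append(save_and_dispatch(name))
        except Exception as exc:  # record the failure and keep going
            failed.append({"filename": name, "error": str(exc)})
    return accepted, failed


def validate_extension(name: str) -> None:
    # Assumed allow-list for illustration; the PR does not list the real one.
    if not name.lower().endswith((".pdf", ".txt")):
        raise ValueError("unsupported file type")


accepted, failed = process_batch(
    ["a.pdf", "notes.exe", "b.pdf"],
    validate_extension,
    save_and_dispatch=lambda name: f"doc:{name}",  # stand-in for DB save + task dispatch
)
print(accepted)
print(failed)
```

In this sketch the bad file ends up in `failed` while both PDFs are still accepted, which is the behavior the description attributes to the `failed[]` list.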

Changes:

  • backend/app/schemas.py: new BatchUploadResponse schema
  • backend/app/routes/documents.py: new /upload/batch endpoint with per-file validation, DB persistence, Celery dispatch, and in-process fallback
  • backend/tests/test_batch_upload.py: 10 unit tests covering happy paths, partial failures, full failure, auth, chunk validation, Celery fallback, and DB persistence
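The "Celery dispatch, and in-process fallback" behavior might look something like the sketch below. It assumes the fallback is triggered by an exception from the queue client; `dispatch_with_fallback`, `enqueue`, and `run_locally` are hypothetical names, and only the `local_` task-ID prefix comes from the PR's test list.

```python
import uuid
from typing import Callable


def dispatch_with_fallback(
    document_id: str,
    enqueue: Callable[[str], str],
    run_locally: Callable[[str], None],
) -> str:
    """Return a task ID, falling back to in-process work if queuing fails."""
    try:
        # In a real app, `enqueue` might wrap a Celery task's .delay(...) call
        # and return the resulting task ID (an assumption, not shown in the PR).
        return enqueue(document_id)
    except Exception:
        # Queue unavailable: hand the work to an in-process mechanism and
        # mark the ID with the "local_" prefix so callers can tell the difference.
        task_id = f"local_{uuid.uuid4().hex}"
        run_locally(document_id)
        return task_id


def broken_enqueue(document_id: str) -> str:
    raise ConnectionError("broker unreachable")


processed_locally: list[str] = []
fallback_id = dispatch_with_fallback("doc-1", broken_enqueue, processed_locally.append)
queued_id = dispatch_with_fallback("doc-2", lambda d: f"celery-{d}", processed_locally.append)
```

Here `run_locally` stands in for whatever in-process mechanism the endpoint actually uses; the test name `test_batch_upload_celery_fallback_uses_background_task` suggests a background task, but the PR page does not show the implementation.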

🗂️ Type of Change

  • ✨ New feature
  • 🧪 Tests

🧪 How was this tested?

10 tests in backend/tests/test_batch_upload.py:

| Test | Asserts |
| --- | --- |
| test_batch_upload_single_file | 202, 1 document created |
| test_batch_upload_multiple_files | 202, 3 documents created |
| test_batch_upload_rejects_bad_extension | bad file in failed[], good file accepted |
| test_batch_upload_all_files_fail_returns_400 | 400 when nothing succeeds |
| test_batch_upload_requires_auth | 401/403 without token |
| test_batch_upload_invalid_chunk_size | 400 for chunk_size=50 |
| test_batch_upload_invalid_chunk_overlap | 400 for overlap >= chunk_size |
| test_batch_upload_celery_fallback_uses_background_task | task_id starts with local_ |
| test_batch_upload_document_persisted_in_db | DB row has correct fields |
| test_batch_upload_chunk_settings_stored | custom chunk settings saved |
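The chunk-setting and status-code rules these tests exercise could be expressed along these lines. This is a sketch under stated assumptions: `MIN_CHUNK_SIZE = 100` is a guess (the PR only shows that `chunk_size=50` is rejected), the negative-overlap check is an addition, and both function names are hypothetical.

```python
MIN_CHUNK_SIZE = 100  # assumed lower bound; the PR only shows that 50 is rejected


def validate_chunk_settings(chunk_size: int, chunk_overlap: int) -> list[str]:
    """Return a list of problems; an empty list means the settings are usable."""
    errors: list[str] = []
    if chunk_size < MIN_CHUNK_SIZE:
        errors.append(f"chunk_size must be at least {MIN_CHUNK_SIZE}")
    # Rejecting negative overlap is an assumption; the tests only cover
    # the overlap >= chunk_size case.
    if chunk_overlap < 0 or chunk_overlap >= chunk_size:
        errors.append("chunk_overlap must be non-negative and smaller than chunk_size")
    return errors


def batch_status_code(accepted_count: int, settings_errors: list[str]) -> int:
    """202 if at least one file was accepted, 400 for bad settings or total failure."""
    if settings_errors or accepted_count == 0:
        return 400
    return 202
```

For example, `validate_chunk_settings(50, 10)` and `validate_chunk_settings(500, 500)` both report an error, while `batch_status_code(2, [])` gives 202 and `batch_status_code(0, [])` gives 400.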

✅ Self-Review Checklist

  • Branch based on dev, not main
  • No secrets / API keys added
  • No changes to main or HuggingFace deployment config
  • Follows existing code style (same pattern as /upload)
  • New endpoint placed before /urlupload to keep route specificity correct
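The last checklist item concerns route ordering. The PR page does not show which routes would otherwise collide, so the sketch below uses a toy first-match router with a hypothetical parameterized route, `/upload/{source}`, to illustrate why a specific path is usually registered before a more general one. It is not the framework's actual matching code.

```python
import re
from typing import Optional


def compile_route(template: str) -> re.Pattern:
    # "{name}" matches exactly one path segment. Templates are not escaped,
    # which is fine for the plain paths used in this sketch.
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    return re.compile(f"^{pattern}$")


def first_match(routes: list[str], path: str) -> Optional[str]:
    """Return the first registered template that matches, like a first-match router."""
    for template in routes:
        if compile_route(template).match(path):
            return template
    return None


specific_first = ["/upload/batch", "/upload/{source}"]
generic_first = ["/upload/{source}", "/upload/batch"]

print(first_match(specific_first, "/upload/batch"))
print(first_match(generic_first, "/upload/batch"))
```

With the specific route registered first, `/upload/batch` resolves to itself; with the order reversed, the parameterized route captures the request and the batch handler is never reached.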

@nancysangani nancysangani requested a review from param20h as a code owner June 22, 2026 08:50
@nancysangani

Contributor Author

Hi @param20h please review the PR when you get a chance. Thanks!

@param20h param20h merged commit acf3915 into param20h:dev Jun 23, 2026
7 checks passed
@github-actions github-actions (bot) added the labels gssoc (GirlScript Summer of Code 2026 issue/PR), gssoc:approved (Approved for GSSoC base points, +50 pts), level:intermediate (+35 pts), mentor:param20h (Mentor for this PR), and type:backend (Backend API) on Jun 23, 2026
@github-actions


🎉 Congratulations on getting your Pull Request merged! 🎉

Thank you for contributing to PDF-Assistant-RAG as part of GSSoC '26! 🚀

Keep up the great work! ✨

@param20h param20h added the type:feature (+10 pts) label and removed the type:backend (Backend API) label on Jun 23, 2026

Labels

gssoc:approved (Approved for GSSoC base points, +50 pts), gssoc (GirlScript Summer of Code 2026 issue/PR), level:intermediate (+35 pts), mentor:param20h (Mentor for this PR), type:feature (+10 pts)

Development

Successfully merging this pull request may close these issues.

feat(api): Add batch document upload endpoint

2 participants