Skip to content

fix(rag): retrieve session details inside lock to prevent KeyError crash during PDF upload response construction#571

Merged
FireFistisDead merged 1 commit into
FireFistisDead:masterfrom
Sandeep6135:fix/unlocked-session-lookup-crash
Jun 18, 2026
Merged

fix(rag): retrieve session details inside lock to prevent KeyError crash during PDF upload response construction#571
FireFistisDead merged 1 commit into
FireFistisDead:masterfrom
Sandeep6135:fix/unlocked-session-lookup-crash

Conversation

@Sandeep6135

@Sandeep6135 Sandeep6135 commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Pull Request: Resolve Uncaught KeyError and Worker Crash during PDF Upload Ingestion

Fixes #564

📌 Classification & Priority

Metric Value
Change Type bug-fix
Severity Level high / critical
Target Service rag-service
Concurrency Impact exceptional-crash-prevention
Fix Strategy Thread Lock Scope Alignment / Safe Dictionary Get

📖 Summary

Important

This PR fixes a critical unhandled exception defect in the process_pdf endpoint that crashed the worker process due to a KeyError when constructing the response payload for a processed PDF under high concurrency or memory eviction limits.

🔴 Problem

During PDF upload processing, the response construction accessed the global sessions dictionary at line 3326:
"session_secret": sessions[session_id].get("session_secret")
This access was performed outside the protection of the sessions_lock. Under high concurrency or cache eviction limits (MAX_ACTIVE_SESSIONS), the session could get evicted from memory by a concurrent thread immediately after the lock was released. This resulted in an uncaught KeyError: session_id which crashed the request handler.

🟢 Solution

Refactored the response construction block:

  1. Aligned variables by retrieving both documents and session_secret inside the with sessions_lock: block.
  2. Utilized safe dictionary get() fetches (sessions.get(session_id)) to handle potentially missing/evicted sessions gracefully instead of throwing a KeyError if the session gets garbage collected right before response serialization.

🧪 Steps to Reproduce

  1. Ingest multiple files concurrently to hit the maximum active session cap (MAX_ACTIVE_SESSIONS).
  2. Induce a session eviction trigger during the final phase of PDF upload serialization.
  3. Observe the server crash with KeyError on line 3326 and failure to return an API response.

🔍 Expected Behaviour

The API responds with the processed document metadata without crashing or throwing unhandled collection key errors.

❌ Actual Behaviour (Before Fix)

The Python worker process encountered a KeyError and crashed during response dictionary compilation, failing the upload request.


🛠️ Code Diff Walkthrough

rag-service/main.py

@@ -3314,7 +3314,9 @@ async def process_pdf(
 
 
     with sessions_lock:
-        documents = list(sessions[session_id].get("documents", []))
+        session = sessions.get(session_id)
+        session_secret = session.get("session_secret") if session else None
+        documents = list(session.get("documents", [])) if session else []
     update_processing_progress(
         session_id,
         "Completed",
@@ -3323,7 +3325,7 @@ async def process_pdf(
     return {
         "message": "PDF processed successfully",
         "session_id": session_id,
-        "session_secret": sessions[session_id].get("session_secret"),
+        "session_secret": session_secret,
         "document": uploaded_document,
         "documents": documents,
     }

Copilot AI review requested due to automatic review settings June 18, 2026 17:40
@vercel

vercel Bot commented Jun 18, 2026

Copy link
Copy Markdown

@Sandeep6135 is attempting to deploy a commit to the firefistisdead's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Warning

Review limit reached

@Sandeep6135, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 45 minutes and 23 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 26a7127e-ca1d-4907-860e-61fef2dba8a4

📥 Commits

Reviewing files that changed from the base of the PR and between 5590b87 and b0d7c9f.

📒 Files selected for processing (1)
  • rag-service/main.py
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added backend Express or API gateway work bug Something isn't working enhancement New feature or request fix A targeted fix or cleanup level:critical rag-service FastAPI / model service work labels Jun 18, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets a concurrency bug in the RAG service’s /process-pdf handler where building the success response could crash with a KeyError if the session entry was evicted from the global sessions map between unlocking and response serialization.

Changes:

  • Retrieves the session object via sessions.get(session_id) under sessions_lock instead of indexing sessions[session_id].
  • Captures session_secret and documents while holding sessions_lock and reuses session_secret in the response payload.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread rag-service/main.py
Comment on lines 3316 to +3319
with sessions_lock:
documents = list(sessions[session_id].get("documents", []))
session = sessions.get(session_id)
session_secret = session.get("session_secret") if session else None
documents = list(session.get("documents", [])) if session else []
Comment thread rag-service/main.py
Comment on lines 3316 to +3319
with sessions_lock:
documents = list(sessions[session_id].get("documents", []))
session = sessions.get(session_id)
session_secret = session.get("session_secret") if session else None
documents = list(session.get("documents", [])) if session else []
@FireFistisDead FireFistisDead added quality:exceptional gssoc:approved level:beginner and removed fix A targeted fix or cleanup rag-service FastAPI / model service work labels Jun 18, 2026
@FireFistisDead FireFistisDead merged commit 9d6530d into FireFistisDead:master Jun 18, 2026
10 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend Express or API gateway work bug Something isn't working enhancement New feature or request gssoc:approved level:beginner level:critical quality:exceptional

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Uncaught KeyError and Worker Crash during PDF Upload Response Ingestion

3 participants