fix(rag): ensure touched session registry updates are persisted during flush #569
Pull request overview
This PR fixes a persistence bug in the RAG service’s session lifecycle where “touch” updates (e.g., last_accessed / expires_at) for read-only session access were not being flushed to session_registry.json, causing active sessions to expire prematurely after restarts.
Changes:
- Updates _flush_dirty_sessions() to only early-return when both _dirty_sessions and _dirty_registry_sessions are empty.
- Drains _dirty_registry_sessions independently and persists touched-session metadata to session_registry.json even when there are no per-session metadata writes.
    "documents": list(meta.get("documents", [])),
    "session_dir": meta.get("session_dir"),
    "hashed_session_secret": meta.get("hashed_session_secret") or _hash_secret(meta.get("session_secret", "")),

    with sessions_lock:
-       if not _dirty_sessions:
+       if not _dirty_sessions and not _dirty_registry_sessions:
            return
        dirty = set(_dirty_sessions)
        _dirty_sessions.clear()
        dirty_registry = set(_dirty_registry_sessions)
        _dirty_registry_sessions.clear()
Fixes #562
Pull Request: Resolve Session Expiration Registry Touch Persistence Bypass
📌 Classification & Priority
bug-fix, high/critical, rag-service, exceptional-state-integrity
📖 Summary
Important
This PR resolves a critical session lifecycle bug where metadata touch updates (e.g. last_accessed and expires_at updates) for read-only sessions were silently ignored during background flushes, resulting in premature cache expiration and deletion of active sessions.
🔴 Problem
When a session is accessed (e.g., via /ask or /validate-session-write), it is touched to update its last_accessed time in memory. If the access is read-only (it does not modify chat history or flashcards), the session ID is added only to _dirty_registry_sessions and not to _dirty_sessions.
However, _flush_dirty_sessions() had an early exit: if not _dirty_sessions: return. If no new chat updates occurred, the function returned immediately, completely bypassing _dirty_registry_sessions. Thus, touch metadata updates were never persisted to session_registry.json. On server restart, active sessions reverted to their old timestamps, triggering premature garbage collection and deletion.
🟢 Solution
Refactored _flush_dirty_sessions to drain and process _dirty_registry_sessions independently of _dirty_sessions.
🧪 Steps to Reproduce
[…] _dirty_registry_sessions in memory.
[…] session_registry.json and observe that the last_accessed and expires_at fields contain the old, stale timestamps from before the read requests.
🔍 Expected Behaviour
Every read request should touch the session and correctly write the updated last_accessed and expires_at timestamps to session_registry.json on disk during the background flush loop.
❌ Actual Behaviour (Before Fix)
Touch timestamps were not persisted to session_registry.json unless a write request occurred simultaneously, causing read-only active sessions to expire and get deleted upon server restarts.
🛠️ Code Diff Walkthrough
rag-service/main.py