Improve resilience of data fetching with HTTP retry and WebSocket reconnection #188
Open
notshriyansh wants to merge 1 commit intoNetflix:masterfrom
- Adds exponential backoff retry (max 3 attempts)
- Improves resilience when backend is unavailable
- Lays groundwork for standalone mode support
This PR makes the Metaflow UI's data-fetching layer more resilient when HTTP requests or WebSocket connections fail.
Changes -
HTTP Resilience -
Added retry mechanism with exponential backoff (up to 3 attempts) in useResource
Preserved pagination flow and existing request behavior
Improved handling of transient backend/network failures
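As a rough illustration of the retry behavior described above, here is a minimal sketch of exponential backoff around an async request. It is not the code from this PR's diff: `retryWithBackoff`, `backoffDelayMs`, `RetryOptions`, and the delay values are hypothetical names and numbers chosen for the example, and how `useResource` wires retries into its pagination flow is not shown.

```typescript
// Hypothetical helper names; assumptions for illustration, not taken from the PR.
type RetryOptions = {
  maxAttempts: number; // total tries, including the first one
  baseDelayMs: number; // wait before the first retry
  maxDelayMs: number; // upper bound on any single wait
  sleep?: (ms: number) => Promise<void>; // injectable so tests need no real timers
};

// Wait before retry number `retryIndex` (0-based): base * 2^retryIndex, capped.
function backoffDelayMs(retryIndex: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * Math.pow(2, retryIndex));
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });

// Runs `task` until it resolves or `maxAttempts` tries have failed,
// then rethrows the last error so the caller can show an error state.
async function retryWithBackoff<T>(task: () => Promise<T>, options: RetryOptions): Promise<T> {
  const sleep = options.sleep !== undefined ? options.sleep : defaultSleep;
  let lastError: unknown;
  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === options.maxAttempts - 1;
      if (isLastAttempt) break;
      await sleep(backoffDelayMs(attempt, options.baseDelayMs, options.maxDelayMs));
    }
  }
  throw lastError;
}
```

With `maxAttempts: 3` and `baseDelayMs: 500`, a request that keeps failing would be tried three times, waiting 500 ms and then 1000 ms between tries. A real integration would probably retry only failures that look transient (network errors, 5xx responses) and let 4xx responses fail immediately, but that policy is an assumption and is not stated in this description.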
WebSocket Improvements -
Enabled reconnection attempts for WebSocket connections
Added better error logging for connection failures
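The reconnection and logging behavior could look roughly like the hand-rolled sketch below. This is an assumption-laden illustration, not the PR's implementation (which may simply enable retries in an existing reconnecting WebSocket wrapper): `connectWithReconnect`, `SocketLike`, `ReconnectOptions`, and the backoff between attempts are all hypothetical.

```typescript
// Minimal socket shape this sketch relies on; intended to be compatible
// with the browser WebSocket, but declared locally to stay self-contained.
interface SocketLike {
  onopen: ((event: any) => void) | null;
  onclose: ((event: any) => void) | null;
  onerror: ((event: any) => void) | null;
  close(): void;
}

// Hypothetical options; names and defaults are assumptions for illustration.
type ReconnectOptions = {
  maxRetries: number; // reconnection attempts allowed after a close
  baseDelayMs: number;
  maxDelayMs: number;
  schedule?: (callback: () => void, delayMs: number) => void; // injectable timer
  log?: (message: string) => void; // injectable logger
};

function connectWithReconnect(
  createSocket: () => SocketLike,
  options: ReconnectOptions
): { stop: () => void } {
  const schedule =
    options.schedule !== undefined
      ? options.schedule
      : (callback: () => void, delayMs: number): void => {
          setTimeout(callback, delayMs);
        };
  const log =
    options.log !== undefined ? options.log : (message: string): void => console.error(message);
  let retries = 0;
  let stopped = false;
  let current: SocketLike | null = null;

  const open = (): void => {
    if (stopped) return;
    const socket = createSocket();
    current = socket;
    // A successful connection resets the retry budget.
    socket.onopen = () => {
      retries = 0;
    };
    socket.onerror = () => {
      log("WebSocket error");
    };
    socket.onclose = () => {
      if (stopped) return;
      if (retries >= options.maxRetries) {
        log(`WebSocket closed; giving up after ${retries} reconnection attempts`);
        return;
      }
      const delayMs = Math.min(options.maxDelayMs, options.baseDelayMs * Math.pow(2, retries));
      retries += 1;
      log(`WebSocket closed; reconnecting in ${delayMs} ms (attempt ${retries})`);
      schedule(open, delayMs);
    };
  };

  open();
  return {
    // Stops further reconnection attempts and closes the current socket.
    stop: () => {
      stopped = true;
      if (current) current.close();
    },
  };
}
```

In a browser, `createSocket` would typically be something like `() => new WebSocket(url)`. Passing the socket factory, timer, and logger as parameters keeps the reconnection policy testable without a live backend.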
Motivation -
While testing locally, I noticed the UI can become unresponsive when backend services are unavailable or misconfigured. Right now, failed requests just error out with no recovery attempt.
These changes should make the UI more robust, which matters especially for a future standalone mode, where backend availability might be less reliable.
Scope -
Currently, failed HTTP requests or WebSocket connection issues leave the UI in a degraded state with no retry or recovery mechanism for transient failures. This PR adds retry and reconnection handling to improve resilience in those cases.