feat: add retry with exponential backoff for Loki chunk fetches #30

Merged
noexecstack merged 1 commit into main from feat/loki-retry-backoff on Apr 2, 2026

@noexecstack
Owner

Summary

  • Add retry with exponential backoff (1s, 2s, 4s) for Loki chunk fetches that fail due to timeouts or connection resets (default: 3 retries, configurable via --loki-retries)
  • Add a server-side filter on the string "verdict": to all Loki queries, so that non-flow cilium-agent log lines are dropped by Loki before transfer
  • Print a warning summary after the Loki fetch when retries or chunk failures occurred, with hints to adjust --loki-timeout or --loki-chunk
  • Flag LokiResult.partial when any chunk failed after exhausting retries
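
The retry behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `fetch_with_retry`, `fetch_all`, and every `LokiResult` field other than `partial` are hypothetical names, and the sketch assumes timeouts and connection resets surface as Python's built-in `TimeoutError` and `ConnectionError`.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class LokiResult:
    # Hypothetical shape: only the `partial` flag is named in the PR.
    lines: List[str] = field(default_factory=list)
    partial: bool = False
    retries_used: int = 0
    failed_chunks: int = 0


def fetch_with_retry(
    fetch: Callable[[], List[str]],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[List[str]], int]:
    """Call fetch(); on a timeout or connection error, retry up to
    `retries` times, sleeping base_delay * 2**attempt between tries.

    Returns (lines, retries_used); lines is None if every try failed.
    """
    for attempt in range(retries + 1):
        try:
            return fetch(), attempt
        except (TimeoutError, ConnectionError):
            if attempt == retries:
                # Retries exhausted: report failure to the caller.
                return None, attempt
            sleep(base_delay * 2**attempt)  # 1s, 2s, 4s with the defaults
    return None, retries  # only reachable if retries is negative


def fetch_all(
    chunk_fetchers: List[Callable[[], List[str]]],
    retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> LokiResult:
    """Fetch every chunk and mark the result partial if any chunk failed."""
    result = LokiResult()
    for fetch in chunk_fetchers:
        lines, used = fetch_with_retry(fetch, retries=retries, sleep=sleep)
        result.retries_used += used
        if lines is None:
            result.failed_chunks += 1
            result.partial = True
        else:
            result.lines.extend(lines)
    return result
```

With `retries=0` the loop runs exactly once and never sleeps, which matches the intent of `--loki-retries 0`. Passing `sleep` as a parameter is a design choice of this sketch so that tests can record the delays instead of waiting for them.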

Test plan

  • All 140 existing tests pass
  • Ruff lint and format checks pass
  • Manual test with a data-heavy namespace (e.g. loki) to verify retries recover from transient timeouts
  • Verify --loki-retries 0 disables retries

Loki queries for data-heavy namespaces frequently time out, causing
chunks to silently return partial results. Add retry logic (default 3
retries with 1s/2s/4s backoff) to _loki_fetch_chunk, a --loki-retries
CLI flag, a server-side verdict filter to reduce data transfer, and a
post-fetch warning summary when retries or failures occurred.
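
As a rough illustration of the server-side verdict filter mentioned above, the sketch below appends a LogQL line filter to a stream selector. It assumes the filter is a plain "line contains" match (`|=`) on the literal text `"verdict":`; the helper name `with_verdict_filter` and the example selector are hypothetical and are not taken from this PR.

```python
def with_verdict_filter(selector: str) -> str:
    """Append a LogQL line-contains filter for the literal text "verdict":.

    Backtick-quoted LogQL strings are raw, so the inner double quotes
    need no escaping.
    """
    return selector + ' |= `"verdict":`'


# Hypothetical selector; the labels actually queried by the tool may differ.
query = with_verdict_filter('{namespace="kube-system", container="cilium-agent"}')
```

Because the match runs inside Loki, log lines without a verdict field are never sent to the client.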

@noexecstack merged commit eac3460 into main on Apr 2, 2026
5 checks passed
@noexecstack deleted the feat/loki-retry-backoff branch on April 2, 2026 11:15