fix: add jitter to retry backoff delays #4

@0xneobyte

Description

Problem

Retries use pure exponential backoff with fixed delays: 1s, 2s, 4s. When multiple clients encounter an error simultaneously (e.g. a brief backend outage), they all retry at exactly the same intervals. This creates a thundering herd: all clients hit the server at once on each retry wave, which slows recovery.

Proposed Behaviour

Add random jitter to retry delays to spread retries across a time window.

Current:  delay = min(1000 * 2^attempt, 10000)             (ms, attempt starts at 0)
Proposed: delay = random(0, min(1000 * 2^attempt, 10000))  (ms, uniform over the window)

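This is the "full jitter" strategy: each delay is drawn uniformly between 0 and the capped backoff value, so retries spread across the whole window instead of landing at the same instant. The expected delay is half of the current fixed delay, which keeps average latency reasonable.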
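A minimal sketch of the proposed delay calculation, assuming a zero-based attempt counter (so the windows are 1s, 2s, 4s as described above). The function name `backoff_delay_ms`, the constants, and the optional `rng` parameter are hypothetical and not taken from `client.py`; the real change would live inside `_make_request`.

```python
import random
from typing import Optional

BASE_DELAY_MS = 1000    # base delay from the formula above
MAX_DELAY_MS = 10000    # 10s cap from the formula above


def backoff_delay_ms(attempt: int, rng: Optional[random.Random] = None) -> float:
    """Return a full-jitter delay in milliseconds for a zero-based attempt.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # The window is the delay the current (non-jittered) code would use.
    window = min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS)
    # Draw uniformly from [0, window]; fall back to the module-level RNG.
    source = rng if rng is not None else random
    return source.uniform(0, window)
```

Since `time.sleep` takes seconds, a caller would sleep for `backoff_delay_ms(attempt) / 1000`. Accepting an optional `random.Random` instance is one way to let tests seed the generator without patching globals.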

Files to Modify

File | Change
src/brainus_ai/client.py | Add jitter to sleep duration in _make_request

Acceptance Criteria

  • Retry delays are randomised within the exponential backoff window
  • Maximum possible delay is unchanged (10s cap)
  • Minimum possible delay is 0
  • Retry count behaviour is unchanged
  • Tests updated to account for non-deterministic delay
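One possible way to test a non-deterministic delay is to assert on bounds rather than exact values, and to seed the generator where reproducibility matters. The sketch below is an assumption about how such tests could look: `jittered_delay_ms` is a hypothetical stand-in for the calculation in `_make_request`, not the project's actual code or test suite.

```python
import random


def jittered_delay_ms(attempt: int, rng: random.Random) -> float:
    # Hypothetical stand-in for the delay calculation in _make_request.
    return rng.uniform(0, min(1000 * 2 ** attempt, 10000))


def test_delay_stays_within_backoff_window():
    rng = random.Random(1234)  # fixed seed keeps the test reproducible
    for attempt in range(8):
        window = min(1000 * 2 ** attempt, 10000)
        samples = [jittered_delay_ms(attempt, rng) for _ in range(500)]
        # Covers the 0 minimum and the unchanged 10s cap.
        assert all(0 <= s <= window for s in samples)
        # Jitter should actually vary the delay, not return a constant.
        assert len(set(samples)) > 1


def test_same_seed_gives_same_delays():
    first = [jittered_delay_ms(n, random.Random(7)) for n in range(3)]
    second = [jittered_delay_ms(n, random.Random(7)) for n in range(3)]
    assert first == second
```

Tests that currently assert exact sleep durations would likely need to switch to range checks like these, or patch the random source with a fixed value.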
