Discussion: `attempt_number` semantics diverge from Hookdeck API #662

## Current outpost behavior

- `attempt_number` is 0-indexed (starts at `0`)
- Automated retries increment sequentially (`0, 1, 2, ...`)
- Manual retries always produce `attempt_number=0` because the retry path has no context on how many prior attempts exist for the event+destination pair. A manual retry after automated retries (`0, 1, 2`) therefore produces a duplicate `attempt_number=0`

## Hookdeck API behavior

- `attempt_number` is 1-indexed (starts at `1`)
- Both automated and manual retries increment the number

## Questions to resolve

### 1. Indexing
Should we align with Hookdeck's 1-indexed convention?

### 2. Manual retry incrementing and retry schedule interaction
Should manual retries look up the prior attempt count and continue incrementing? If so, how does this affect the retry schedule? E.g. if the schedule allows 5 retries over 24 hours and a user manually retries 4 times (all failing), does that consume most of the schedule, leaving only 1 automated retry? If manual retries should not count toward the schedule limit, we effectively need two separate concepts: the automated retry counter (used by the scheduler to determine remaining retries) and the total attempt number (including manual retries).
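To make the two-counter option concrete, here is a minimal sketch. The `AttemptCounters` type, the `Trigger` values, and the 1-indexed numbering are all hypothetical and chosen only for illustration; none of them exist in the codebase as written here.

```go
package main

import "fmt"

// Trigger says what caused a delivery attempt (hypothetical type).
type Trigger int

const (
	Initial Trigger = iota
	AutomatedRetry
	ManualRetry
)

// AttemptCounters tracks one event+destination pair (hypothetical type).
type AttemptCounters struct {
	Total            int // every attempt, manual or automated
	AutomatedRetries int // automated retries only; the scheduler would read this
}

// Record registers one attempt and returns its attempt number.
// Numbering is 1-indexed here purely as an assumption.
func (c *AttemptCounters) Record(t Trigger) int {
	c.Total++
	if t == AutomatedRetry {
		c.AutomatedRetries++
	}
	return c.Total
}

// RemainingAutomated returns how many automated retries are left under limit.
func (c *AttemptCounters) RemainingAutomated(limit int) int {
	if r := limit - c.AutomatedRetries; r > 0 {
		return r
	}
	return 0
}

func main() {
	var c AttemptCounters
	c.Record(Initial)
	c.Record(AutomatedRetry)
	for i := 0; i < 4; i++ {
		c.Record(ManualRetry) // does not touch the scheduler's counter
	}
	fmt.Println("total attempts:", c.Total)
	fmt.Println("automated retries left:", c.RemainingAutomated(5))
}
```

With this split, four manual retries raise the total attempt number but leave the automated retry budget untouched.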

### 3. Persistence vs derivation
Currently `attempt_number` is persisted on the `Attempt` record. Should we instead derive it at read time (e.g. by ordering attempts for an event+destination by time and assigning a sequence number)? Deriving avoids needing writers to agree on the correct value, but adds read-time complexity.

### 4. `RetryTask.attempt_number` reliability
Currently the retry message carries the next `attempt_number` (set by `RetryTaskFromDeliveryTask` as `task.Attempt + 1`). Should the retrymq handler instead calculate it at execution time (e.g. by counting existing attempts)? The carried value is unreliable for two reasons: manual retries bypass `RetryTask` entirely, and a manual and an automated retry can both be enqueued to deliverymq at the same time, which makes the carried value stale. The race is an extreme edge case, but it means the value on `RetryTask` cannot be trusted.
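As a sketch of computing the number at execution time, the handler could ask a store for the existing attempt count. The `AttemptCounter` interface, `memoryStore`, and `NextAttemptNumber` below are hypothetical names for illustration and do not reflect the actual logstore API.

```go
package main

import "fmt"

// AttemptCounter is a hypothetical read-only view over stored attempts.
type AttemptCounter interface {
	CountAttempts(eventID, destinationID string) (int, error)
}

// memoryStore is an in-memory stand-in used only for this example.
type memoryStore struct {
	counts map[string]int
}

func (m *memoryStore) CountAttempts(eventID, destinationID string) (int, error) {
	return m.counts[eventID+"/"+destinationID], nil
}

// NextAttemptNumber derives the number for the attempt about to run from the
// count of attempts already stored, instead of trusting a value carried on
// the queue message. base is 0 or 1 depending on the indexing decision.
func NextAttemptNumber(store AttemptCounter, eventID, destinationID string, base int) (int, error) {
	n, err := store.CountAttempts(eventID, destinationID)
	if err != nil {
		return 0, fmt.Errorf("count attempts: %w", err)
	}
	return base + n, nil
}

func main() {
	// Three attempts (automated or manual) already exist for this pair.
	store := &memoryStore{counts: map[string]int{"evt_1/dst_1": 3}}
	next, err := NextAttemptNumber(store, "evt_1", "dst_1", 1)
	if err != nil {
		panic(err)
	}
	fmt.Println("next attempt_number:", next)
}
```

Counting and then writing is not atomic, so two handlers running concurrently for the same pair could still compute the same number; this only removes the stale-message problem, not the concurrency one.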

### 5. Delivery queue exclusivity (related)
Should we enforce that only one delivery task for the same event+destination can be in the queue at a time? E.g. if an automated retry is pending in deliverymq, reject a manual retry; conversely, if a manual delivery task is queued, block automated retries from being enqueued. This would prevent duplicate concurrent deliveries and simplify the `attempt_number` correctness problem. Without it, we may need to calculate `attempt_number` in the deliverymq handler by querying the logstore for prior attempts, which adds a dependency from deliverymq to logstore.
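The exclusivity rule can be sketched as a keyed guard that is acquired before enqueueing and released once the delivery task finishes. The `InFlight` type is hypothetical and in-process only; a real deployment with multiple workers would presumably need a shared store for the same check.

```go
package main

import (
	"fmt"
	"sync"
)

// InFlight tracks which event+destination pairs currently have a delivery
// task queued (hypothetical, single-process illustration).
type InFlight struct {
	mu   sync.Mutex
	keys map[string]struct{}
}

func NewInFlight() *InFlight {
	return &InFlight{keys: make(map[string]struct{})}
}

func pairKey(eventID, destinationID string) string {
	return eventID + "/" + destinationID
}

// TryAcquire reports whether the caller may enqueue a delivery task.
// It returns false if one is already queued for the same pair.
func (f *InFlight) TryAcquire(eventID, destinationID string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	k := pairKey(eventID, destinationID)
	if _, busy := f.keys[k]; busy {
		return false
	}
	f.keys[k] = struct{}{}
	return true
}

// Release marks the pair as free again once its delivery task has completed.
func (f *InFlight) Release(eventID, destinationID string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	delete(f.keys, pairKey(eventID, destinationID))
}

func main() {
	guard := NewInFlight()
	fmt.Println(guard.TryAcquire("evt_1", "dst_1")) // automated retry is enqueued
	fmt.Println(guard.TryAcquire("evt_1", "dst_1")) // manual retry is rejected
	guard.Release("evt_1", "dst_1")                 // delivery task finished
	fmt.Println(guard.TryAcquire("evt_1", "dst_1")) // a new retry is allowed
}
```

With at most one task per pair in flight, the attempt number can be fixed at enqueue time without a concurrent writer invalidating it.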
