feat: add ackable message support with OnSuccess and Manual ack modes#109

Open
silvioramalho wants to merge 9 commits into master from feat/add-kafka-ackable-message

Conversation

@silvioramalho
Collaborator

Motivation

In the current Kafka consumer implementation (KafkaAckMode.Eager), the Confluent client stores each offset for auto-commit as soon as consumer.Consume() returns the message, before business processing completes, and commits it in the background. If a pod restarts while a message is still being processed, its offset may already have been committed, and the message is silently lost.

This PR introduces opt-in acknowledgement modes that allow upstream pipelines (e.g., iris-shared) to commit offsets only after confirmed processing, eliminating this silent loss window.


What Changed

New types (Take.Elephant.Kafka)

| File | Description |
| --- | --- |
| KafkaAckMode.cs | Enum: Eager (default, legacy), OnSuccess, Manual |
| KafkaAckableMessage<T>.cs | Message envelope carrying payload, headers, topic/partition/offset metadata, and an AcknowledgeAsync() delegate. Idempotent (double-ack is a no-op). |
| IKafkaAckableReceiverQueue<T>.cs | Interface extending IKafkaReceiverQueue<T> with DequeueAckableAsync and DequeueAckableOrDefaultAsync |
| PartitionCommitTracker.cs | Thread-safe per-partition high-water-mark tracker. Ensures offsets are committed only when contiguous: no gaps, no out-of-order commits. |
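
The idempotent acknowledgement described for KafkaAckableMessage<T> can be sketched as follows. This is a minimal illustration, not the PR's source: AckableEnvelope<T> and its constructor are hypothetical names, and the sketch assumes the commit logic is passed in as a Func<Task>.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for KafkaAckableMessage<T>; names are illustrative only.
public sealed class AckableEnvelope<T>
{
    private readonly Func<Task> _onAck;
    private int _acked; // 0 = pending, 1 = acknowledged

    public AckableEnvelope(T payload, Func<Task> onAck)
    {
        Payload = payload;
        _onAck = onAck ?? throw new ArgumentNullException(nameof(onAck));
    }

    public T Payload { get; }

    public Task AcknowledgeAsync()
    {
        // Only the first caller flips 0 -> 1 and runs the delegate;
        // every later call is a no-op, even under concurrent callers.
        if (Interlocked.Exchange(ref _acked, 1) == 1)
            return Task.CompletedTask;

        return _onAck();
    }
}
```

One consequence of this design: if the delegate throws on the first call, the envelope is still marked as acknowledged and a retry does nothing. Whether the real type behaves this way is not stated in the description.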

Modified (KafkaReceiverQueue<T>)

  • Now implements IKafkaAckableReceiverQueue<T>
  • Constructor accepts optional KafkaAckMode ackMode = KafkaAckMode.Eager (fully backward-compatible)
  • When ackMode != Eager: sets EnableAutoCommit=false and EnableAutoOffsetStore=false; uses _ackableChannel (capacity 1) and PartitionCommitTracker internally
  • Existing DequeueAsync / DequeueOrDefaultAsync / DequeueWithHeadersAsync remain unchanged for Eager consumers
  • KafkaHeadersConverter.ToReadOnlyDictionary added as internal alias for ackable message construction
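
The capacity-1 channel listed above limits how far the consume loop can run ahead of the code that dequeues messages. A minimal sketch of that backpressure, assuming _ackableChannel is a bounded System.Threading.Channels channel (the description does not show how it is constructed; AckableChannelSketch is a hypothetical name):

```csharp
using System.Threading.Channels;

public static class AckableChannelSketch
{
    // Assumption: the queue's internal channel is created roughly like this.
    public static Channel<long> Create() =>
        Channel.CreateBounded<long>(new BoundedChannelOptions(1)
        {
            // A full channel makes WriteAsync wait and TryWrite return false,
            // so at most one consumed message is buffered at a time.
            FullMode = BoundedChannelFullMode.Wait
        });
}
```

With this configuration a second write is refused until the first item has been read, which keeps the number of consumed-but-undelivered messages at one.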

Tests (Take.Elephant.Tests)

| File | Coverage |
| --- | --- |
| KafkaReceiverQueueAckModeFacts.cs | 10+ facts: Eager preserves legacy behavior; OnSuccess does not commit before ack; OnSuccess commits after ack; Manual without ack does not commit; Manual with ack commits; filter rejection acks; poison success acks; error before ack does not commit; double-ack is idempotent |
| PartitionCommitTrackerFacts.cs | 10 facts: single offset, in-order advancement, out-of-order gap blocking, gap resolution when predecessor acks, multiple partitions isolated, double-ack idempotency |

Ack Mode Semantics

Eager   → offset committed by Confluent auto-commit (before processing) [default, no change]
OnSuccess → offset committed only after AcknowledgeAsync() is called by the pipeline
Manual  → offset committed only when the application explicitly calls AcknowledgeAsync()

PartitionCommitTracker — no offset gap rule

Track(offset=5), Track(offset=6), Track(offset=7)

Acknowledge(7) → HWM stuck at 4 (5 and 6 not acked) → returns null → no commit
Acknowledge(5) → HWM=5 → 6 still pending → commit offset 6
Acknowledge(6) → HWM=7 → commit offset 8   ← batched advance
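
The trace above can be reproduced with a small tracker. This is a sketch under stated assumptions, not the PR's PartitionCommitTracker: ContiguousOffsetTracker is a hypothetical name, and the returned value is assumed to be the offset to commit (the next offset to consume, HWM + 1).

```csharp
using System.Collections.Generic;

// Hypothetical sketch of a contiguous-ack tracker for a single partition.
public sealed class ContiguousOffsetTracker
{
    private readonly object _lock = new object();
    private readonly SortedSet<long> _acked = new SortedSet<long>();
    private long _highWaterMark = long.MinValue; // highest contiguously acked offset

    // Bootstraps the high-water mark from the first consumed offset.
    public void Track(long offset)
    {
        lock (_lock)
        {
            if (_highWaterMark == long.MinValue)
                _highWaterMark = offset - 1;
        }
    }

    // Returns the offset to commit, or null when a gap blocks progress
    // or the offset was already acknowledged.
    public long? Acknowledge(long offset)
    {
        lock (_lock)
        {
            if (offset <= _highWaterMark || !_acked.Add(offset))
                return null;

            var advanced = false;
            // Consume every acked offset that directly follows the mark.
            while (_acked.Remove(_highWaterMark + 1))
            {
                _highWaterMark++;
                advanced = true;
            }

            return advanced ? _highWaterMark + 1 : (long?)null;
        }
    }
}
```

Calling Track(5), Track(6), Track(7) and then Acknowledge(7), Acknowledge(5), Acknowledge(6) on this sketch yields null, 6, and 8 in that order, matching the trace.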

Backward Compatibility

  • Default KafkaAckMode.Eager preserves all existing behavior with zero code changes required in consumers.
  • KafkaReceiverQueue(ConsumerConfig, topic, serializer) signature unchanged (new ackMode parameter is optional with default Eager).
  • KafkaReceiverQueue(IConsumer, serializer, topic) signature unchanged (same).

Risks

| Risk | Mitigation |
| --- | --- |
| Duplicate processing after pod restart | Expected and correct. Consumers using OnSuccess/Manual must be idempotent. |
| Offset stuck if AcknowledgeAsync is never called | Partition HWM does not advance → Kafka redelivers on reconnect. No silent loss. |
| Out-of-order acks with parallel consumers | PartitionCommitTracker holds back commits until contiguous. Safe but may delay commit window. |
| Breaking change to IKafkaReceiverQueue<T> implementors | KafkaReceiverQueue now implements the wider IKafkaAckableReceiverQueue<T>. Cast to narrower interface still works. |

Testing

dotnet test src/Take.Elephant.Tests/Take.Elephant.Tests.csproj --filter "Category=Kafka"

silvioramalho changed the title from "feat(kafka): add ackable message support with OnSuccess and Manual ack modes" to "feat: add ackable message support with OnSuccess and Manual ack modes" Apr 29, 2026
silvioramalho marked this pull request as ready for review April 29, 2026 16:01
Copilot AI review requested due to automatic review settings April 29, 2026 16:01

Copilot AI left a comment

Pull request overview

Adds opt-in Kafka acknowledgement modes to prevent offsets from being committed before message processing completes, closing the “silent loss on restart” window present with eager auto-commit.

Changes:

  • Introduces KafkaAckMode (Eager, OnSuccess, Manual) and KafkaAckableMessage<T> for explicit post-processing acknowledgement.
  • Updates KafkaReceiverQueue<T> to support ackable dequeue operations and manual/controlled commits using a per-partition PartitionCommitTracker.
  • Adds test coverage for ack modes and contiguous-per-partition commit tracking behavior.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/Take.Elephant.Kafka/KafkaAckMode.cs | Adds acknowledgement mode enum and associated docs. |
| src/Take.Elephant.Kafka/KafkaAckableMessage.cs | Adds ackable message envelope with idempotent AcknowledgeAsync. |
| src/Take.Elephant.Kafka/IKafkaAckableReceiverQueue.cs | Adds receiver interface for ackable dequeue APIs. |
| src/Take.Elephant.Kafka/PartitionCommitTracker.cs | Adds per-partition contiguous-ack tracking to compute safe commit offsets. |
| src/Take.Elephant.Kafka/KafkaReceiverQueue.cs | Implements ack modes, disables auto-commit for non-eager, exposes ackable dequeues, and commits offsets after ack. |
| src/Take.Elephant.Kafka/KafkaHeadersConverter.cs | Adds an internal alias used when constructing ackable messages. |
| src/Take.Elephant.Tests/Kafka/PartitionCommitTrackerFacts.cs | Adds unit tests for contiguous advancement / gap-holding commit tracking. |
| src/Take.Elephant.Tests/Kafka/KafkaReceiverQueueAckModeFacts.cs | Adds unit tests verifying ack-mode behavior and idempotent ack semantics. |

Comment thread src/Take.Elephant.Kafka/PartitionCommitTracker.cs Outdated
Comment thread src/Take.Elephant.Kafka/PartitionCommitTracker.cs
Comment thread src/Take.Elephant.Kafka/KafkaReceiverQueue.cs
Comment thread src/Take.Elephant.Kafka/KafkaReceiverQueue.cs
Comment thread src/Take.Elephant.Kafka/KafkaAckMode.cs Outdated
…t discontinuities

Agent-Logs-Url: https://github.com/takenet/elephant/sessions/dd241f8a-83b8-43af-ba16-57aaa6c7ea57

Co-authored-by: silvioramalho <20154605+silvioramalho@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 29, 2026 22:59

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.


Comment thread src/Take.Elephant.Kafka/PartitionCommitTracker.cs
Comment thread src/Take.Elephant.Kafka/KafkaAckableMessage.cs
Comment thread src/Take.Elephant.Kafka/KafkaReceiverQueue.cs Outdated
Comment thread src/Take.Elephant.Kafka/KafkaReceiverQueue.cs Outdated
Comment thread src/Take.Elephant.Kafka/KafkaReceiverQueue.cs Outdated
Comment thread src/Take.Elephant.Kafka/KafkaReceiverQueue.cs
Copilot AI review requested due to automatic review settings April 30, 2026 15:12

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.


Comment thread src/Take.Elephant.Kafka/KafkaReceiverQueue.cs Outdated
Comment on lines +267 to +278
    try
    {
        var tpo = new TopicPartitionOffset(tp, new Offset(commitOffset.Value));
        _consumer.StoreOffset(tpo);
        _consumer.Commit(new[] { tpo });
    }
    catch (KafkaException)
    {
        // The partition was revoked or rebalanced before the commit
        // could be persisted. Kafka will redeliver these messages to
        // the new owner — safe to discard this exception.
    }

Copilot AI Apr 30, 2026

The ack delegate only catches KafkaException around StoreOffset/Commit. If AcknowledgeAsync is called after CloseAsync/Dispose (or during shutdown), these calls can also throw ObjectDisposedException/InvalidOperationException and would currently bubble up to the caller, potentially failing the pipeline even though shutdown is in progress. Consider broadening the catch to include disposal/invalid-state exceptions (or checking _closed) and treating them similarly to the rebalance case.
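
One way to read this suggestion is an exception filter that treats disposal and invalid-state errors like the rebalance case. The sketch below is illustrative only: FakeKafkaException stands in for Confluent.Kafka's KafkaException so the example compiles without the package, and SafeCommit.TryCommit is a hypothetical helper, not part of the PR.

```csharp
using System;

// Hypothetical stand-in for Confluent.Kafka.KafkaException.
public class FakeKafkaException : Exception { }

public static class SafeCommit
{
    // Returns true if the commit action ran to completion, false if it
    // failed with an exception that is considered benign during
    // rebalance or shutdown. Any other exception still propagates.
    public static bool TryCommit(Action commit)
    {
        try
        {
            commit();
            return true;
        }
        catch (Exception ex) when (
            ex is FakeKafkaException ||
            ex is ObjectDisposedException ||   // derives from InvalidOperationException
            ex is InvalidOperationException)
        {
            return false;
        }
    }
}
```

Catching InvalidOperationException this broadly can also hide unrelated bugs, so checking a closed flag before committing, as the comment also proposes, may be the narrower option.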

Comment on lines +44 to +55
    /// <summary>
    /// Records that <paramref name="offset"/> was consumed. Must be called before
    /// the message is dispatched downstream. Bootstraps the HWM on the first call
    /// so that contiguity is relative to this run's starting offset.
    /// </summary>
    public void Track(long offset)
    {
        lock (_lock)
        {
            if (_lastCommitted == long.MinValue)
                _lastCommitted = offset - 1;
        }

Copilot AI Apr 30, 2026

Track() only bootstraps _lastCommitted on the first call and doesn’t handle offset discontinuities within the same assignment. If consumption ever jumps forward (e.g., due to a seek, or any scenario where the next delivered offset is not lastCommitted+1), Acknowledge() will wait forever for the missing offsets and commits will be stuck. Consider tracking the last seen consumed offset and resetting _lastCommitted/_acked when a discontinuity is detected (or exposing an explicit Reset(startOffset) that KafkaReceiverQueue can call when the starting position changes).
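
The first option in this comment (track the last seen offset and reset on a jump) might look like the sketch below. It is an assumption-laden illustration, not the fix that was applied: ResettingOffsetTracker and the _lastSeen field are hypothetical, and the reset deliberately abandons any older offsets that were still awaiting acknowledgement.

```csharp
using System.Collections.Generic;

// Hypothetical tracker that restarts contiguity when consumption jumps.
public sealed class ResettingOffsetTracker
{
    private readonly object _lock = new object();
    private readonly SortedSet<long> _acked = new SortedSet<long>();
    private long _highWaterMark = long.MinValue;
    private long _lastSeen = long.MinValue;

    public void Track(long offset)
    {
        lock (_lock)
        {
            var first = _lastSeen == long.MinValue;
            if (first || offset != _lastSeen + 1)
            {
                // First offset or a discontinuity (for example after a seek):
                // contiguity is measured from the new position onward.
                _highWaterMark = offset - 1;
                _acked.Clear();
            }
            _lastSeen = offset;
        }
    }

    // Returns the offset to commit (HWM + 1), or null if nothing advanced.
    public long? Acknowledge(long offset)
    {
        lock (_lock)
        {
            if (offset <= _highWaterMark || !_acked.Add(offset))
                return null;

            var advanced = false;
            while (_acked.Remove(_highWaterMark + 1))
            {
                _highWaterMark++;
                advanced = true;
            }
            return advanced ? _highWaterMark + 1 : (long?)null;
        }
    }
}
```

After Track(5), Track(6), Track(10), acknowledging 10 returns 11 instead of waiting for offsets 7 to 9, which were never delivered. The trade-off is that late acks for 5 and 6 are then ignored.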

Comment thread src/Take.Elephant.Kafka/PartitionCommitTracker.cs Outdated
Comment on lines 43 to 49
  public KafkaReceiverQueue(
      ConsumerConfig consumerConfig,
      string topic,
      ISerializer<T> serializer,
-     IDeserializer<string> deserializer = null)
-     : this(
-         new ConsumerBuilder<Ignore, string>(consumerConfig)
-             .SetValueDeserializer(deserializer ?? new StringDeserializer())
-             .Build(),
-         serializer,
-         topic)
+     IDeserializer<string> deserializer = null,
+     KafkaAckMode ackMode = KafkaAckMode.Eager)
  {

Copilot AI Apr 30, 2026

Ack mode is only configurable via the ConsumerConfig constructor overload; the (bootstrapServers, topic, groupId, ...) convenience constructor still hard-codes the old signature and can’t pass a non-Eager ackMode through. To keep the API consistent, consider adding an optional KafkaAckMode parameter to the convenience constructor and forwarding it to this overload.

Comment thread src/Take.Elephant.Kafka/KafkaReceiverQueue.cs Outdated
Comment thread src/Take.Elephant.Kafka/KafkaReceiverQueue.cs Outdated
Copilot AI review requested due to automatic review settings April 30, 2026 15:53

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.


Comment on lines +231 to +241
    public async Task<KafkaAckableMessage<T>> DequeueAckableOrDefaultAsync(CancellationToken cancellationToken = default)
    {
        if (_ackMode == KafkaAckMode.Eager)
            throw new InvalidOperationException(
                $"DequeueAckableOrDefaultAsync requires KafkaAckMode.OnSuccess or Manual. Current mode: {_ackMode}.");

        if (_ackableChannel.Reader.TryRead(out var entry))
            return BuildAckableMessage(entry);

        return null;
    }

Copilot AI Apr 30, 2026

DequeueAckableOrDefaultAsync does not start the consumer task (unlike DequeueOrDefaultAsync / DequeueWithHeadersOrDefaultAsync). If callers rely on the queue to auto-start on first dequeue, this method will keep returning null because no background Consume loop is running. It is also marked async but has no awaits. Consider calling StartConsumerTaskIfNotAsync(cancellationToken) before TryRead (or removing async and returning a completed Task).

Comment thread src/Take.Elephant.Kafka/KafkaAckableMessage.cs Outdated
Copilot AI review requested due to automatic review settings April 30, 2026 21:20

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.


Comment on lines +17 to +22
/// <summary>
/// Offset is stored only after the caller explicitly invokes
/// <see cref="KafkaAckableMessage{T}.AcknowledgeAsync"/> following
/// successful processing. If the POD restarts before ack, the message
/// will be redelivered.
/// </summary>

Copilot AI Apr 30, 2026

In the OnSuccess summary the doc says “Offset is stored” after AcknowledgeAsync, but the implementation commits offsets directly (auto offset store is disabled and Commit(...) is used). Consider changing this wording to “committed” (and keep “stored” only when referring to Confluent’s local offset store) to avoid confusion about the delivery guarantees.

Comment on lines +105 to +107
    public KafkaReceiverQueue(IConsumer<Ignore, string> consumer, ISerializer<T> serializer, string topic,
        KafkaAckMode ackMode = KafkaAckMode.Eager)
    {

Copilot AI Apr 30, 2026

The injected-IConsumer constructor used to exist without an ackMode parameter; changing its signature (even with a default value) is also a binary breaking change for downstream assemblies compiled against the previous ctor. Consider restoring the original 3-parameter constructor as an overload that calls this one with KafkaAckMode.Eager.
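
The reason behind this comment is that C# compiles optional-parameter defaults into each call site, so an assembly built against the old 3-parameter constructor still looks for that exact signature at run time. A minimal sketch of the suggested overload pattern, using hypothetical stand-in types (ReceiverQueueSketch, AckMode, and object-typed parameters are placeholders, not the library's real types):

```csharp
// Hypothetical stand-in for KafkaAckMode.
public enum AckMode { Eager, OnSuccess, Manual }

public class ReceiverQueueSketch
{
    // Original 3-parameter shape kept as a real overload, so callers
    // compiled against the previous version still find this signature.
    public ReceiverQueueSketch(object consumer, object serializer, string topic)
        : this(consumer, serializer, topic, AckMode.Eager)
    {
    }

    // New overload; the ack mode is required here instead of optional.
    public ReceiverQueueSketch(object consumer, object serializer, string topic, AckMode ackMode)
    {
        Topic = topic;
        Mode = ackMode;
    }

    public string Topic { get; }

    public AckMode Mode { get; }
}
```

Source callers that pass three arguments bind to the first constructor and get Eager; callers that want another mode pass it explicitly.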
