Skip to content

KAFKA-20742: Start offset for newly initialized share partitions of known topics should be 0#22714

Merged
apoorvmittal10 merged 4 commits into
apache:trunkfrom
sjhajharia:KAFKA-20742
Jul 1, 2026
Merged

KAFKA-20742: Start offset for newly initialized share partitions of known topics should be 0#22714
apoorvmittal10 merged 4 commits into
apache:trunkfrom
sjhajharia:KAFKA-20742

Conversation

@sjhajharia

@sjhajharia sjhajharia commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

Summary

When a share group heartbeat detects subscribed-but-uninitialized
topic-partitions, the group coordinator builds an
InitializeShareGroupStateParameters request. Previously it sent
startOffset = -1 (PartitionFactory.UNINITIALIZED_START_OFFSET) for
every partition.

-1 is a sentinel meaning "not yet decided":
SharePartition.startOffsetDuringInitialization() resolves the real
offset from the group's share.auto.offset.reset strategy (default
LATEST). This is correct for a brand-new topic subscription, but
incorrect when new partitions are added to a topic the group already
knows about
. Resolving those new partitions via LATEST can skip
records produced before initialization completes, causing data loss.

This change makes newly initialized partitions of an already-known
topic
(present in the group's initializedTopics or
initializingTopics) start at offset 0, guaranteeing no records are
missed. Partitions of a topic seen for the first time keep -1 so that
share.auto.offset.reset still applies to fresh subscriptions.

Change

In GroupMetadataManager:

  • maybeCreateInitializeShareGroupStateRequest computes the set of
    already-known topics (initializedTopics ∪ initializingTopics) from the
    current persisted metadata. This is read before the heartbeat's records
    are replayed, so a brand-new topic is correctly absent.
  • buildInitializeShareGroupStateRequest selects the start offset per
    topic: 0 for an already-known topic, -1 for a first-sighting topic.

Testing

  • Added testShareGroupHeartbeatInitializeMixedNewAndExpandedTopics: a
    single heartbeat that both adds a brand-new topic and expands an
    already-initialized topic produces one request where the new topic uses
    -1 and the expanded topic's new partitions use 0 — verifying the
    decision is per-topic, not per-request.

Reviewers: Sushant Mahajan smahajan@confluent.io, Apoorv Mittal
apoorvmittal10@gmail.com

@github-actions github-actions Bot added triage PRs from the community group-coordinator labels Jul 1, 2026

@smjn smjn left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Lets add an integ test to verify the change since it is in fundamental path. The scenario could be something like:

  • create a topic with 1 partition and produce 10 records
  • init share group with LATEST (do it explicitly)
  • share consumer subscribes and polls - should not receive anything
  • alter partitions (add one more)
  • produce to new partition 10 records
  • poll and assert on the new partitions 10 records

@smjn smjn added KIP-932 Queues for Kafka ci-approved and removed triage PRs from the community labels Jul 1, 2026
// time, present in neither map) use -1 so that the group's share.auto.offset.reset strategy
// applies. This is read before the records above are replayed, so a brand-new topic is
// correctly absent here.
ShareGroupStatePartitionMetadataInfo info = shareGroupStatePartitionMetadata.get(groupId);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good since if and when a request with new partitions on existing topic comes, it implies that previous requests have already produced the group coord records which have been updated in the info map.

}
return receivedValues.size() >= expectedValues.size();
}, 30000L, 500L, () -> "did not receive all records from the newly added partition");
assertEquals(expectedValues, receivedValues);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this cause flakiness if there are duplicate records?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! Yes, I noticed that given the at-least-once semantics of Share Group, we may have flakes here. I have updated the same to assert on the size of Set.

@smjn smjn left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change, LGTM

@apoorvmittal10 apoorvmittal10 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, minor nit. Let me know if you still prefer the current code.

Comment on lines +3143 to +3148
ShareGroupStatePartitionMetadataInfo info = shareGroupStatePartitionMetadata.get(groupId);
Set<Uuid> alreadyKnownTopics = new HashSet<>();
if (info != null) {
alreadyKnownTopics.addAll(info.initializedTopics().keySet());
alreadyKnownTopics.addAll(info.initializingTopics().keySet());
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Just to avoid unneccessary HashSet.

Set<Uuid> alreadyKnownTopics = null;
if (shareGroupStatePartitionMetadata.containsKey(groupId)) {
      ShareGroupStatePartitionMetadataInfo info = shareGroupStatePartitionMetadata.get(groupId);
      alreadyKnownTopics = new HashSet<>();
      alreadyKnownTopics.addAll(info.initializedTopics().keySet());
      alreadyKnownTopics.addAll(info.initializingTopics().keySet());
}

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review. I have updated the PR accordingly.

alreadyKnownTopics.addAll(info.initializingTopics().keySet());
}

return Optional.of(buildInitializeShareGroupStateRequest(groupId, groupEpoch, topicPartitionChangeMap, alreadyKnownTopics));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: and here if above change looks good

Suggested change
return Optional.of(buildInitializeShareGroupStateRequest(groupId, groupEpoch, topicPartitionChangeMap, alreadyKnownTopics));
return Optional.of(buildInitializeShareGroupStateRequest(groupId, groupEpoch, topicPartitionChangeMap, alreadyKnownTopics != null ? alreadyKnownTopics : Set.of()));

@apoorvmittal10 apoorvmittal10 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@apoorvmittal10 apoorvmittal10 merged commit 97bbffa into apache:trunk Jul 1, 2026
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants