fix: Backfill truncates PR/Issue labels, reviews, and label-event history #69

Open
jonathanchang31 wants to merge 2 commits into entrius:test from jonathanchang31:fix/Backfill-GraphQL-queries
@jonathanchang31

Summary

  • Fixed a backfill data-loss bug where PR/Issue nested GraphQL data was truncated by fixed page sizes.
  • Backfill now retrieves full datasets for:
    • PR labels
    • PR reviews
    • PR label/unlabel timeline events
    • Issue labels
    • Issue label/unlabel timeline events

Related Issue

Closes: #68

Change Type (select all)

  • Bug fix
  • Refactor (query/fetch flow restructuring)
  • New feature
  • Documentation

Real Behavior Proof

  • Before:

    • PR query contained labels(first: 10), reviews(first: 10), timelineItems(first: 30).
    • Issue query contained labels(first: 10), timelineItems(first: 30).
    • Only the first page of each nested connection was persisted; nodes beyond those limits were dropped.
  • After:

    • Backfill loops call pagination helpers and iterate until pageInfo.hasNextPage is false.
    • Persisted labels/reviews/timeline events now include all pages for each PR/Issue.
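
The "After" behavior is standard cursor pagination, which could be sketched roughly as follows. This is an illustration only, not the code in this PR: `PR_LABELS_QUERY`, `paginate`, and `make_fake_fetch` are hypothetical names, the query is an assumption based on GitHub's public GraphQL schema, and the fake fetcher stands in for a real API call so the example runs offline.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical query shape (assumption): the PR description does not show the
# real query. The fixed `first: 10` is replaced by a page size plus an
# `after: $cursor` argument, and `pageInfo` is requested for each page.
PR_LABELS_QUERY = """
query($owner: String!, $name: String!, $number: Int!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    pullRequest(number: $number) {
      labels(first: 100, after: $cursor) {
        nodes { name }
        pageInfo { hasNextPage endCursor }
      }
    }
  }
}
"""

Page = Dict[str, Any]


def paginate(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Any]:
    """Yield every node of a connection, following endCursor until
    pageInfo.hasNextPage is false."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["nodes"]
        info = page["pageInfo"]
        if not info["hasNextPage"]:
            break
        cursor = info["endCursor"]


def make_fake_fetch(items: List[Any], page_size: int) -> Callable[[Optional[str]], Page]:
    """Stand-in for a GraphQL call: serves `items` in pages of `page_size`,
    using the next start index (as a string) as the cursor."""
    def fetch_page(cursor: Optional[str]) -> Page:
        start = int(cursor) if cursor else 0
        end = start + page_size
        return {
            "nodes": items[start:end],
            "pageInfo": {"hasNextPage": end < len(items), "endCursor": str(end)},
        }
    return fetch_page


# 25 labels served 10 at a time: a single fixed-size request would keep only
# the first 10, while the loop collects all three pages.
labels = [f"label-{i}" for i in range(25)]
all_labels = list(paginate(make_fake_fetch(labels, 10)))
```

In a real backfill, `fetch_page` would send the query with the current cursor and return the `labels` (or `reviews`, or `timelineItems`) connection from the response; the loop itself stays the same for each nested connection.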

Impact

  • Prevents undercounting in mirror data consumed by scoring/validation flows.
  • Improves historical accuracy for repos with high review/label activity.
  • Aligns backfill completeness with expected mirror reliability.

@xiao-xiao-mao (bot) added the bug label on May 13, 2026
Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

[Bug] Backfill truncates PR/Issue labels, reviews, and label-event history due to fixed GraphQL page sizes
