Flaky test report: committed-code failures on 2026-05-08 #259

@andrross

Summary

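6 test failures were observed against committed code (Timer/main or Post Merge Action builds) in the 24 hours ending 2026-05-08T10:00 UTC. These failures span 6 builds (76138, 76157, 76213, 76226, 76233, 76234) and affect 5 distinct tests, each in a different test class; IndexingIT.testIndexingWithSegRep failed in two of the builds.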

Summary Table (sorted by total unique builds affected historically)

| Test | Builds Affected (all time) | First Seen | Recent Build(s) | Pattern |
| --- | --- | --- | --- | --- |
| IndicesRequestCacheIT.testDeleteAndCreateSameIndexShardOnSameNode | 260 | 2024-05-07 | 76213 | Worsening (resurgence since Mar 2026) |
| IndexingIT.testIndexingWithSegRep | 255 | 2024-03-25 | 76233, 76157 | Stable/chronic |
| ShardIndexingPressureSettingsIT.testShardIndexingPressureEnforcedEnabledDisabledSetting | 155 | 2024-03-26 | 76226 | Worsening (10 builds in May 2026 so far) |
| FlightMetricsTests.testComprehensiveMetrics | 71 | 2025-07-25 | 76138 | Stable/chronic |
| NodeJoinLeftIT.testClusterStabilityWhenJoinRequestHappensDuringNodeLeftTask | 12 | 2025-07-16 | 76234 | Low-frequency, stable |

Detailed Findings

1. NodeJoinLeftIT.testClusterStabilityWhenJoinRequestHappensDuringNodeLeftTask

  • Build: 76234 (Timer/main)
  • Seed: C95701CDA4A1334E
  • Error: java.lang.AssertionError: Expected: is <false> but: was <true> at NodeJoinLeftIT.java:202
  • Reproduced locally: No (passed with seed)
  • First seen: 2025-07-16
  • Total builds affected: 12
  • Monthly pattern: 2 (Jul 2025), 1 (Aug), 1 (Sep), 2 (Nov), 1 (Jan 2026), 3 (Feb), 1 (Apr), 1 (May)
  • Assessment: Low-frequency flake, not seed-deterministic. Likely a timing-dependent race in cluster coordination. Stable rate (0-3 builds/month; no failures in Oct 2025, Dec 2025, or Mar 2026).

2. IndexingIT.testIndexingWithSegRep

  • Builds: 76233 (Post Merge Action), 76157 (Timer/main)
  • Seeds: A89A8A8681E994E7, 2D7E279615E59D84
  • Error: java.lang.AssertionError: expected:<0> but was:<1> in waitForSearchableDocs (IndexingIT.java:121)
  • Reproduced locally: Could not attempt (requires JAVA17_HOME for rolling-upgrade BWC build)
  • First seen: 2024-03-25
  • Total builds affected: 255
  • Monthly pattern: Consistently 4-29 builds/month since inception. Recent months: 18 (Mar 2026), 5 (Apr), 8 (May so far)
  • Assessment: Chronic, high-frequency flake in rolling-upgrade segment replication test. The test waits for searchable docs after segment replication but the timing is unreliable across version boundaries. This is one of the most prolific flaky tests in the project.
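
The wait-then-assert pattern described above can be illustrated with a generic polling helper. This is a minimal sketch, not the code in IndexingIT: `WaitForCountSketch`, `waitForCount`, and the simulated `visibleDocs` counter are hypothetical names introduced here. It shows why such a check is timing-sensitive: the result depends on whether the observed count reaches the expected value before the deadline.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class WaitForCountSketch {
    /**
     * Polls {@code actual} until it equals {@code expected} or the timeout elapses.
     * Returns true if the expected value was observed in time, false otherwise.
     */
    static boolean waitForCount(LongSupplier actual, long expected,
                                Duration timeout, Duration interval) throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (actual.getAsLong() == expected) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // deadline passed without seeing the expected value
            }
            Thread.sleep(interval.toMillis());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical stand-in for a doc count that becomes visible asynchronously.
        AtomicLong visibleDocs = new AtomicLong(0);
        Thread refresher = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            visibleDocs.set(1);
        });
        refresher.start();
        boolean reached = waitForCount(visibleDocs::get, 1, Duration.ofSeconds(2), Duration.ofMillis(10));
        refresher.join();
        System.out.println("reached expected count: " + reached);
    }
}
```

If the asynchronous update lands after the deadline, the same helper returns false even though the system would have converged shortly afterwards, which is the general shape of a timing-dependent flake.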

3. ShardIndexingPressureSettingsIT.testShardIndexingPressureEnforcedEnabledDisabledSetting

  • Build: 76226 (Timer/main)
  • Seed: 176A0962FEBF5002
  • Error: java.lang.Exception: Test abandoned because suite timeout was reached. (>= 1200000 msec)
  • Reproduced locally: No (passed with seed in 18s)
  • First seen: 2024-03-26
  • Total builds affected: 155
  • Monthly pattern: Consistently present since 2024. Recent uptick: 6 (Apr 2026), 10 (May 2026 so far, only 8 days in)
  • Assessment: Worsening. The test hits the 20-minute suite timeout in CI but completes in 18s locally. The uptick coincides with the April 2026 runner migration to m7a.8xlarge; one hypothesis is that faster CPUs cause more contention in the indexing pressure framework, leading to longer waits or deadlock-like conditions under CI load. The May 2026 count (10 builds in the first 8 days) already exceeds April's total of 6.
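
To compare the April and May counts on equal terms, they can be normalized to failing builds per day. The sketch below uses only the counts reported above (6 builds in April 2026, 10 builds in the first 8 days of May 2026); the class and method names are illustrative, and the 30-day window for April is the calendar length of the month.

```java
import java.util.Locale;

public class FlakeRate {
    /** Failing builds per day over an observation window of the given length. */
    static double perDay(int failingBuilds, int days) {
        if (days <= 0) {
            throw new IllegalArgumentException("days must be positive");
        }
        return (double) failingBuilds / days;
    }

    public static void main(String[] args) {
        double april = perDay(6, 30); // 6 builds in April 2026 (30 days)
        double may = perDay(10, 8);   // 10 builds in the first 8 days of May 2026
        System.out.println(String.format(Locale.ROOT,
            "April: %.2f/day, May: %.2f/day, ratio: %.2fx", april, may, may / april));
    }
}
```

On these figures the May rate is about 1.25 failing builds per day against about 0.2 in April, roughly a sixfold increase, though an 8-day window is short and the rate may regress over the rest of the month.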

4. IndicesRequestCacheIT.testDeleteAndCreateSameIndexShardOnSameNode

  • Build: 76213 (Timer/main)
  • Seed: BE63084457E5F542
  • Error: java.lang.Exception: Test abandoned because suite timeout was reached. (>= 1200000 msec)
  • Reproduced locally: No (passed with seed in 68s)
  • First seen: 2024-05-07
  • Total builds affected: 260
  • Monthly pattern: Massive spike in May-Jun 2024 (120+119 builds), then nearly dormant until Mar 2026 (3), Apr 2026 (12), May 2026 (3)
  • Assessment: Worsening (resurgence). This test was largely fixed after the Jun 2024 spike but has returned since March 2026. The failure mode is a suite timeout, which suggests the test hangs or runs excessively long under certain conditions. The April 2026 peak (12 builds) coincides with the CI runner change, although the first recurrences in March predate it.

5. FlightMetricsTests.testComprehensiveMetrics

  • Build: 76138 (Post Merge Action)
  • Seed: 2E69A50FFD89D2D0
  • Error: BindTransportException[Failed to bind to [/0:0:0:0:0:0:0:1%lo, /127.0.0.1]:PortsRange{portRange='25401'}] — Address already in use
  • Reproduced locally: No (passed with seed)
  • First seen: 2025-07-25
  • Total builds affected: 71
  • Monthly pattern: 4-11 builds/month consistently since Jul 2025. Recent: 11 (Apr 2026), 6 (May so far)
  • Assessment: Stable/chronic port-binding flake. The test binds to a fixed port (25401) and fails when another process or a prior test run still holds the port. This is a classic environmental flake — not seed-deterministic, depends on CI host state.
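
One common way to avoid fixed-port collisions in tests is to bind to port 0 and let the operating system pick a free port. The sketch below demonstrates the idea with plain `java.net.ServerSocket`; it is not the FlightMetricsTests setup, and whether the Flight transport under test can be configured with an ephemeral port or a port range is an assumption that would need checking. `EphemeralPortSketch` and `bindToFreePort` are hypothetical names.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class EphemeralPortSketch {
    /** Binds a loopback socket to port 0 so the OS assigns a currently free port. */
    static ServerSocket bindToFreePort() throws IOException {
        return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
    }

    public static void main(String[] args) throws IOException {
        // Two sockets opened at the same time cannot collide, unlike two binds to 25401.
        try (ServerSocket first = bindToFreePort(); ServerSocket second = bindToFreePort()) {
            System.out.println("first=" + first.getLocalPort() + " second=" + second.getLocalPort());
        }
    }
}
```

The assigned port is only known after binding, so anything that needs to connect has to read it from `getLocalPort()` rather than from a constant.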

Reproduction Summary

| Test | Seed Deterministic? | Notes |
| --- | --- | --- |
| NodeJoinLeftIT | No | Passed locally with CI seed |
| IndexingIT | N/A | Requires JAVA17_HOME for BWC build |
| ShardIndexingPressureSettingsIT | No | Suite timeout in CI; passes in 18s locally |
| IndicesRequestCacheIT | No | Suite timeout in CI; passes in 68s locally |
| FlightMetricsTests | No | Port binding conflict; passes locally |

None of the four tests that could be run locally failed when re-run with their CI seeds, which is expected for timing- and environment-dependent flakes. IndexingIT could not be attempted locally.
