HDFS-17780. IBR retry bypasses configured interval on RPC failure by balodesecurity · Pull Request #8312 · apache/hadoop

balodesecurity · 2026-03-08T08:38:24Z

Problem

When the blockReceivedAndDeleted RPC fails in IncrementalBlockReportManager.sendIBRs(), lastIBR is left stale because the timestamp is updated only on success. As a result, sendImmediately() returns true on every subsequent heartbeat, and the DataNode retries the IBR on every heartbeat cycle instead of waiting for the configured dfs.blockreport.incremental.intervalMsec.

Under high NameNode load this creates a positive feedback loop: failures trigger immediate retries, which increase NameNode contention, which causes more failures.
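To give a rough sense of scale, here is a small simulation of the retry pattern described above. It is a toy model, not the Hadoop code: the class and method names are hypothetical, and the 3-second heartbeat, 60-second IBR interval, and 10-minute outage are illustrative values chosen for the arithmetic. It assumes a report is always pending and that every RPC fails during the outage.

```java
class IbrRetrySimulation {
    /**
     * Counts IBR RPC attempts during an outage in which every RPC fails.
     * Assumes one heartbeat every heartbeatMs and a report that is always pending.
     */
    public static int countAttempts(long outageMs, long heartbeatMs, long intervalMs,
                                    boolean updateLastIbrOnFailure) {
        long lastIBR = -intervalMs; // allow the first attempt at time 0
        int attempts = 0;
        for (long now = 0; now < outageMs; now += heartbeatMs) {
            // Same shape as the interval check: has a full interval elapsed since lastIBR?
            boolean sendImmediately = now - intervalMs >= lastIBR;
            if (sendImmediately) {
                attempts++; // the RPC is sent and fails
                if (updateLastIbrOnFailure) {
                    lastIBR = now; // behavior with the fix applied
                }
                // Without the fix lastIBR stays stale, so the next heartbeat retries.
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // lastIBR updated only on success: one attempt per heartbeat
        System.out.println(countAttempts(600_000, 3_000, 60_000, false)); // prints 200
        // lastIBR updated on failure too: one attempt per interval
        System.out.println(countAttempts(600_000, 3_000, 60_000, true));  // prints 10
    }
}
```

Under these assumed values, a single DataNode sends 200 failing RPCs instead of 10 over the outage, and that factor applies to every DataNode reporting to the same NameNode.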

Fix

Move lastIBR = startTime out of the if (success) branch so that it runs unconditionally in the finally block. Whether the RPC succeeds or fails, lastIBR is updated, so the configured interval is always respected between attempts.
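The shape of the change can be sketched as follows. This is a simplified stand-in rather than the actual IncrementalBlockReportManager: the ReportSender interface, the manually advanced clock, and the single readyToSend flag are assumptions made so the example runs on its own. Only the names lastIBR, startTime, sendIBRs, and sendImmediately come from the description above.

```java
import java.io.IOException;

class IbrManagerSketch {
    /** Hypothetical stand-in for the NameNode RPC; not the Hadoop interface. */
    interface ReportSender {
        void send() throws IOException;
    }

    private final long intervalMs;
    private long nowMs;                 // injected clock, advanced manually
    private long lastIBR;
    private boolean readyToSend = true; // assumption: a report is pending from the start

    IbrManagerSketch(long intervalMs) {
        this.intervalMs = intervalMs;
        this.lastIBR = -intervalMs;     // so the first report may go out at time 0
    }

    void advanceClock(long deltaMs) {
        nowMs += deltaMs;
    }

    boolean sendImmediately() {
        return readyToSend && nowMs - intervalMs >= lastIBR;
    }

    void sendIBRs(ReportSender sender) throws IOException {
        boolean success = false;
        final long startTime = nowMs;
        try {
            sender.send();
            success = true;
        } finally {
            // The fix: record the attempt time on both outcomes,
            // not only inside an if (success) branch.
            lastIBR = startTime;
            // A failed report stays pending and is retried after the interval.
            readyToSend = !success;
        }
    }
}
```

With this structure, a failed send leaves the report pending but sendImmediately() stays false until a full interval has passed since the failed attempt.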

Testing

Added TestIncrementalBlockReports#testFailedIBRRespectsInterval: creates an IncrementalBlockReportManager with a 60-second interval, injects a pending block, mocks the NameNode to throw IOException, calls sendIBRs(), and asserts sendImmediately() returns false immediately after the failure.
Test passes locally.
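The steps of the test can be approximated in plain Java without JUnit or Mockito. Everything below is hypothetical scaffolding: ToyManager is not the real manager, NameNodeStub replaces the mocked NameNode, and the updateOnFailure flag exists only so the same check can be run against the pre-fix and post-fix behavior. In this toy, "a report is pending" is modeled as a non-empty queue.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

class FailedIbrIntervalCheck {
    /** Hypothetical stand-in for the NameNode protocol; the real test mocks it. */
    interface NameNodeStub {
        void blockReceivedAndDeleted(String blockId) throws IOException;
    }

    /** Toy manager; updateOnFailure=false reproduces the pre-fix behavior. */
    static class ToyManager {
        final long intervalMs;
        final boolean updateOnFailure;
        final Deque<String> pending = new ArrayDeque<>();
        long nowMs = 0;
        long lastIBR;

        ToyManager(long intervalMs, boolean updateOnFailure) {
            this.intervalMs = intervalMs;
            this.updateOnFailure = updateOnFailure;
            this.lastIBR = -intervalMs; // first report may go out at time 0
        }

        boolean sendImmediately() {
            return !pending.isEmpty() && nowMs - intervalMs >= lastIBR;
        }

        void sendIBRs(NameNodeStub nn) throws IOException {
            if (pending.isEmpty()) {
                return; // nothing to report
            }
            boolean success = false;
            final long startTime = nowMs;
            try {
                nn.blockReceivedAndDeleted(pending.peek());
                success = true;
            } finally {
                if (success) {
                    pending.poll(); // reported, drop it from the queue
                }
                if (success || updateOnFailure) {
                    lastIBR = startTime;
                }
            }
        }
    }

    /** Returns what sendImmediately() reports right after a failed send. */
    public static boolean retriesImmediatelyAfterFailure(boolean updateOnFailure) {
        ToyManager manager = new ToyManager(60_000, updateOnFailure);
        manager.pending.add("blk_1"); // inject a pending block
        try {
            manager.sendIBRs(b -> { throw new IOException("simulated NameNode failure"); });
        } catch (IOException expected) {
            // the failure is the scenario under test
        }
        return manager.sendImmediately();
    }

    public static void main(String[] args) {
        System.out.println(retriesImmediatelyAfterFailure(false)); // prints true
        System.out.println(retriesImmediatelyAfterFailure(true));  // prints false
    }
}
```

Running the check against both settings shows the property a regression test should have: it fails on the old behavior and passes once lastIBR is updated on failure.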

When blockReceivedAndDeleted RPC fails, lastIBR was only updated on success. This left lastIBR stale so sendImmediately() returned true on every subsequent heartbeat, creating a retry storm that amplifies NameNode contention instead of backing off. Fix: update lastIBR = startTime unconditionally in the finally block, regardless of whether the RPC succeeded or failed, so the configured dfs.blockreport.incremental.intervalMsec is always respected between attempts. Adds TestIncrementalBlockReports#testFailedIBRRespectsInterval: a pure-mock unit test that injects a failing NN RPC and asserts that sendImmediately() returns false immediately after the failure.

hadoop-yetus · 2026-03-08T16:46:59Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	13m 2s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	41m 11s		trunk passed
+1 💚	compile	1m 40s		trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	compile	1m 50s		trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	checkstyle	1m 51s		trunk passed
+1 💚	mvnsite	1m 56s		trunk passed
+1 💚	javadoc	1m 32s		trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	javadoc	1m 32s		trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	spotbugs	4m 11s		trunk passed
+1 💚	shadedclient	31m 12s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 19s		the patch passed
+1 💚	compile	1m 15s		the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	javac	1m 15s		the patch passed
+1 💚	compile	1m 16s		the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	javac	1m 16s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	1m 16s		the patch passed
+1 💚	mvnsite	1m 26s		the patch passed
+1 💚	javadoc	0m 58s		the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	javadoc	1m 1s		the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	spotbugs	3m 49s		the patch passed
+1 💚	shadedclient	30m 14s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
-1 ❌	unit	346m 3s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 46s		The patch does not generate ASF License warnings.
		487m 16s

Reason	Tests
Failed junit tests	hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes

Subsystem	Report/Notes
Docker	ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8312/1/artifact/out/Dockerfile
GITHUB PR	#8312
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux a9fc4226d7bb 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `6f2453f`
Default Java	Ubuntu-17.0.18+8-Ubuntu-124.04.1
Multi-JDK versions	/usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8312/1/testReport/
Max. process+thread count	3478 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8312/1/console
versions	git=2.43.0 maven=3.9.11 spotbugs=4.9.7
Powered by	Apache Yetus 0.14.1 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus · 2026-03-19T17:37:15Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 35s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+0 🆗	detsecrets	0m 1s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	42m 30s		trunk passed
+1 💚	compile	1m 45s		trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	compile	1m 42s		trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	checkstyle	1m 45s		trunk passed
+1 💚	mvnsite	1m 49s		trunk passed
+1 💚	javadoc	1m 23s		trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	javadoc	1m 24s		trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	spotbugs	4m 14s		trunk passed
+1 💚	shadedclient	31m 41s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 23s		the patch passed
+1 💚	compile	1m 17s		the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	javac	1m 17s		the patch passed
+1 💚	compile	1m 22s		the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	javac	1m 22s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	1m 21s		the patch passed
+1 💚	mvnsite	1m 31s		the patch passed
+1 💚	javadoc	0m 58s		the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	javadoc	1m 0s		the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	spotbugs	3m 59s		the patch passed
+1 💚	shadedclient	32m 27s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
-1 ❌	unit	220m 46s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 47s		The patch does not generate ASF License warnings.
		358m 42s

Reason	Tests
Failed junit tests	hadoop.hdfs.server.datanode.fsdataset.impl.TestFsVolumeList

Subsystem	Report/Notes
Docker	ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8312/2/artifact/out/Dockerfile
GITHUB PR	#8312
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux 56af1970f416 5.15.0-173-generic #183-Ubuntu SMP Fri Mar 6 13:29:34 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `8de084e`
Default Java	Ubuntu-17.0.18+8-Ubuntu-124.04.1
Multi-JDK versions	/usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8312/2/testReport/
Max. process+thread count	3476 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8312/2/console
versions	git=2.43.0 maven=3.9.11 spotbugs=4.9.7
Powered by	Apache Yetus 0.14.1 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus · 2026-03-20T17:31:07Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 43s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	40m 18s		trunk passed
+1 💚	compile	1m 45s		trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	compile	1m 52s		trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	checkstyle	1m 48s		trunk passed
+1 💚	mvnsite	1m 55s		trunk passed
+1 💚	javadoc	1m 29s		trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	javadoc	1m 34s		trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	spotbugs	4m 9s		trunk passed
+1 💚	shadedclient	31m 20s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 19s		the patch passed
+1 💚	compile	1m 15s		the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	javac	1m 15s		the patch passed
+1 💚	compile	1m 16s		the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	javac	1m 16s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	1m 15s		the patch passed
+1 💚	mvnsite	1m 25s		the patch passed
+1 💚	javadoc	0m 58s		the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04
+1 💚	javadoc	1m 1s		the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1
+1 💚	spotbugs	3m 48s		the patch passed
+1 💚	shadedclient	29m 58s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	218m 9s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 49s		The patch does not generate ASF License warnings.
		346m 6s

Subsystem	Report/Notes
Docker	ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8312/3/artifact/out/Dockerfile
GITHUB PR	#8312
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux 35312e0b3ae5 5.15.0-173-generic #183-Ubuntu SMP Fri Mar 6 13:29:34 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `ee86421`
Default Java	Ubuntu-17.0.18+8-Ubuntu-124.04.1
Multi-JDK versions	/usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8312/3/testReport/
Max. process+thread count	3382 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8312/3/console
versions	git=2.43.0 maven=3.9.11 spotbugs=4.9.7
Powered by	Apache Yetus 0.14.1 https://yetus.apache.org

This message was automatically generated.

github-actions bot added HDFS trunk labels Mar 8, 2026

retrigger CI

8de084e

retrigger CI

ee86421

balodesecurity wants to merge 3 commits into apache:trunk from balodesecurity:HDFS-17780

3 participants
