HADOOP-19847. Move logAllocatedBlock out of lock in FSNamesystem.getAdditionalBlock to reduce latency #8359
CapMoon wants to merge 1 commit into apache:trunk
I think this makes sense. cc @cnauroth, who is the author of this part of the code.
@CapMoon thanks for the report; please first create an HDFS-XXX issue, thanks.
🎊 +1 overall
This message was automatically generated.
"please first create HDFS-XXX issue, thanks" My bad. Will migrate to #8360 and https://issues.apache.org/jira/browse/HDFS-17896 @pan3793 @haiyang1987 @cnauroth
edwardcapriolo left a comment
Non-binding, but I believe you could do away with the atomic ref. Some favor an atomic ref over volatile. Could add a test for sure.
  try {
    newBlock = FSDirWriteFileOp.storeAllocatedBlock(ns, src,
-       HdfsConstants.GRANDFATHER_INODE_ID, "clientName", null, targets);
+       HdfsConstants.GRANDFATHER_INODE_ID, "clientName", null, targets, new AtomicReference<>());
The test should use something like a Mockito spy or verify to show that the logging happens as a side effect, and possibly to cover the cases where it doesn't happen (error, etc.).
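A minimal sketch of the kind of check suggested above, written with a hand-rolled spy (a subclass that records calls) rather than Mockito so that it stays self-contained. `Allocator`, `SpyAllocator`, and the fixed block id are hypothetical stand-ins, not the Hadoop classes; only the method name `logAllocatedBlock` is borrowed from the PR.

```java
import java.util.ArrayList;
import java.util.List;

class LoggingSideEffectSketch {
  // Hypothetical unit under test: allocates an id, then logs it.
  static class Allocator {
    long allocate(String src) {
      if (src == null || src.isEmpty()) {
        // Error path: fails before anything is logged.
        throw new IllegalArgumentException("empty path");
      }
      long id = 42L; // fixed id keeps the sketch deterministic
      logAllocatedBlock(src, id);
      return id;
    }

    void logAllocatedBlock(String src, long id) {
      // Real logging would go here.
    }
  }

  // Hand-rolled spy: overrides the logging hook and records each call.
  static class SpyAllocator extends Allocator {
    final List<String> calls = new ArrayList<>();

    @Override
    void logAllocatedBlock(String src, long id) {
      calls.add(src + ":" + id);
    }
  }
}
```

With Mockito available, the same two assertions (logged once on success, never on failure) would typically be expressed with `spy(...)` and `verify(...)` instead of the recording subclass.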
He just wants to use an atomic ref to carry another return value? Or maybe it can be changed to:
- LocatedBlock storeAllocatedBlock(args ...)
+ Pair<LocatedBlock, BlockInfo> storeAllocatedBlock(args ...)
I think either is fine.
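To make the two options concrete, here is a small sketch that contrasts an `AtomicReference` out-parameter with returning both values together. The `LocatedBlock`, `BlockInfo`, and `Allocation` types below are hypothetical minimal stand-ins defined only for this example, and the method signatures are simplified; they do not reflect the real `FSDirWriteFileOp.storeAllocatedBlock` signature.

```java
import java.util.concurrent.atomic.AtomicReference;

class ReturnStyleSketch {
  // Hypothetical minimal types; the real Hadoop classes are far richer.
  static final class LocatedBlock {
    final long id;
    LocatedBlock(long id) { this.id = id; }
  }

  static final class BlockInfo {
    final long id;
    BlockInfo(long id) { this.id = id; }
  }

  // Hypothetical pair-like holder for the second option.
  static final class Allocation {
    final LocatedBlock located;
    final BlockInfo info;
    Allocation(LocatedBlock located, BlockInfo info) {
      this.located = located;
      this.info = info;
    }
  }

  // Option 1: keep the return type, carry the extra value in an out-parameter.
  static LocatedBlock storeWithOutParam(long id, AtomicReference<BlockInfo> out) {
    out.set(new BlockInfo(id));
    return new LocatedBlock(id);
  }

  // Option 2: change the return type so both values come back together.
  static Allocation storeWithPair(long id) {
    return new Allocation(new LocatedBlock(id), new BlockInfo(id));
  }
}
```

The out-parameter keeps existing callers' return handling unchanged but forces every call site to pass a holder; the pair-style return makes the second value explicit at the cost of changing the signature.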
Description of PR
The logAllocatedBlock method in FSNamesystem.getAdditionalBlock is currently called while holding the global lock. Flame graph analysis shows this logging path (via SLF4J/Log4j appenders) contributes non-trivial latency, blocking other NameNode operations.
Since logAllocatedBlock is only for audit/diagnostic logging and does not modify shared state, we can safely move it after releasing the global lock to reduce lock hold time and improve write throughput.
This change preserves all existing logging behavior while eliminating unnecessary lock contention from I/O-bound logging operations.
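The general pattern can be sketched as follows: capture the values needed for the log line while holding the lock, release the lock, then log. This is an illustration only, not the patch itself; `AllocatorSketch`, `globalLock`, `logSink`, and `loggedUnderLock` are hypothetical names, and a `ReentrantReadWriteLock` stands in for the NameNode's global lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a namesystem guarded by one global lock.
class AllocatorSketch {
  private final ReentrantReadWriteLock globalLock = new ReentrantReadWriteLock();
  private long nextBlockId = 1;

  // Stand-in for a slow log appender; records messages for inspection.
  final List<String> logSink = new ArrayList<>();
  // Set to true if a log call ever runs while this thread holds the write lock.
  volatile boolean loggedUnderLock = false;

  long allocateBlock(String src) {
    long blockId;
    globalLock.writeLock().lock();
    try {
      // Only the state mutation happens under the lock.
      blockId = nextBlockId++;
    } finally {
      globalLock.writeLock().unlock();
    }
    // The message uses values captured under the lock but is emitted after
    // release, so appender I/O does not extend the lock hold time.
    log("allocate blk_" + blockId + " for " + src);
    return blockId;
  }

  void log(String message) {
    if (globalLock.isWriteLockedByCurrentThread()) {
      loggedUnderLock = true;
    }
    synchronized (logSink) {
      logSink.add(message);
    }
  }
}
```

One consequence of this pattern is that log lines from concurrent callers are no longer guaranteed to appear in the same order in which the lock was acquired.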
How was this patch tested?
N/A
