Add streaming merge-join for remote sync sources #3461
Open
pbanakar-microsoft wants to merge 3 commits into
- New syncMergeJoin.go: O(1) memory streaming merge-join for Blob/S3/BlobFS sources that guarantee lexicographic listing order
- Disable memory/file/goroutine throttling for merge-join path to avoid ReadMemStats STW bottleneck
- Set inner EnumerationParallelism=1 (outer crawl provides parallelism)
- Default merge-join parallelism: 500 (configurable via AZCOPY_MERGE_JOIN_PARALLELISM)
- CrawlWithStats: expose live ActiveWorkers/QueuedDirs counters
- Fix syncComparator: both-zero change times no longer flag metadata as changed
- Add diagnostic [STEP]/[SLOW-STEP] logging for performance analysis
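For readers unfamiliar with the technique, here is a minimal sketch of a streaming merge-join over two listings that arrive in ascending lexicographic order. It is not the code in syncMergeJoin.go: the names `Action`, `mergeJoin`, `feed`, and `plan` are hypothetical, keys are plain strings compared bytewise, and the labels `source-only`, `dest-only`, and `both` stand in for whatever the real comparator decides. It only illustrates why memory stays constant: each side holds a single pending key.

```go
package main

import "fmt"

// Action is a hypothetical result type: one decision per distinct key.
type Action struct {
	Key string
	Op  string // "source-only", "dest-only", or "both"
}

// mergeJoin reads two channels that each yield keys in ascending order
// and emits one Action per distinct key. Only one pending key per side
// is held at any time, so memory use does not grow with listing size.
func mergeJoin(src, dst <-chan string, emit func(Action)) {
	s, sOK := <-src
	d, dOK := <-dst
	for sOK || dOK {
		switch {
		case !dOK || (sOK && s < d):
			// Key exists only at the source.
			emit(Action{Key: s, Op: "source-only"})
			s, sOK = <-src
		case !sOK || d < s:
			// Key exists only at the destination.
			emit(Action{Key: d, Op: "dest-only"})
			d, dOK = <-dst
		default:
			// Same key on both sides: hand off to a comparator.
			emit(Action{Key: s, Op: "both"})
			s, sOK = <-src
			d, dOK = <-dst
		}
	}
}

// feed returns a closed, pre-filled channel (illustration helper).
func feed(keys ...string) <-chan string {
	ch := make(chan string, len(keys))
	for _, k := range keys {
		ch <- k
	}
	close(ch)
	return ch
}

// plan runs mergeJoin over two sorted slices and collects the decisions.
func plan(src, dst []string) []string {
	var out []string
	mergeJoin(feed(src...), feed(dst...), func(a Action) {
		out = append(out, a.Op+" "+a.Key)
	})
	return out
}

func main() {
	for _, line := range plan([]string{"a", "b", "d"}, []string{"b", "c"}) {
		fmt.Println(line)
	}
}
```

The sketch depends on both inputs being sorted under the same ordering, which is why the commit restricts the path to sources that guarantee lexicographic listing order.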
- Add isSelfReferentialDirSentinel() to detect BlobFS/GCP directory sentinels with empty relativePath
- Schedule ACL copy transfers for sentinels instead of silently skipping
- Prevent re-enqueueing sentinel dirs (avoids infinite loops)
- Reduce mergeJoinChannelBufferSize from 10K to 1K (430 KB vs 8 MB per channel)
- Track originalRelativePath before buildChildPath rewrites it
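The sentinel handling in this commit can be pictured with a small sketch. It makes several assumptions: a listing of a directory may include an entry for the directory itself, that entry is marked as a directory and has an empty relative path, and the crawler queues subdirectories for later listing. The `entry` type, `isSelfSentinel`, and `crawl` are hypothetical stand-ins, not the PR's `isSelfReferentialDirSentinel()` or its crawler.

```go
package main

import "fmt"

// entry is a hypothetical, simplified listing result.
type entry struct {
	relativePath string // path relative to the directory being listed
	isDir        bool
}

// isSelfSentinel reports whether e stands for the listed directory itself
// (assumed here to mean: a directory entry with an empty relative path).
func isSelfSentinel(e entry) bool {
	return e.isDir && e.relativePath == ""
}

// crawl walks the in-memory listing breadth-first from root. It returns the
// directories it listed and the directories whose sentinel it recorded
// (for example, as targets for an ACL copy).
func crawl(listing map[string][]entry, root string) (visited, aclTargets []string) {
	queue := []string{root}
	for len(queue) > 0 {
		dir := queue[0]
		queue = queue[1:]
		visited = append(visited, dir)
		for _, e := range listing[dir] {
			switch {
			case isSelfSentinel(e):
				// Record the sentinel, but never enqueue dir again:
				// doing so would list the same directory forever.
				aclTargets = append(aclTargets, dir)
			case e.isDir:
				queue = append(queue, dir+e.relativePath+"/")
			}
		}
	}
	return visited, aclTargets
}

func main() {
	listing := map[string][]entry{
		"":   {{"", true}, {"a", true}, {"f.txt", false}},
		"a/": {{"", true}, {"g.txt", false}},
	}
	visited, acl := crawl(listing, "")
	fmt.Println(visited, acl)
}
```

Without the `isSelfSentinel` case, the empty-path entry would fall into the generic directory branch and queue a path that is not a real child of the directory, which is the kind of re-enqueueing the commit message says it prevents.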
Description
Feature / Bug Fix: (Brief description of the feature or issue being addressed)
Related Links:
Issues
Team thread
Documents
[Email Subject]
Type of Change
How Has This Been Tested?
Thank you for your contribution to AzCopy!