Summary
This PR adds support for Huawei Ascend NPU devices with the Megatron-Core backend, enabling the ROLL framework to run reinforcement learning training on NPU hardware.
Key Changes
1. Platform Detection Priority
File: roll/platforms/__init__.py
Changes: Reordered platform detection to check NPU before CUDA.
Reason: NPU devices were incorrectly falling back to the CUDA platform. Prioritizing NPU detection ensures NpuPlatform is properly initialized when torch_npu is available.
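The corrected ordering can be sketched as follows. This is an illustrative sketch, not the actual ROLL API: the function names and return values here are hypothetical.

```python
def npu_available():
    # torch_npu is Huawei's PyTorch extension for Ascend NPUs; its
    # importability signals that an NPU software stack is present.
    try:
        import torch_npu  # noqa: F401
        return True
    except ImportError:
        return False

def cuda_available():
    try:
        import torch
        return torch.cuda.is_available()
    except ImportError:
        return False

def detect_platform():
    # NPU is probed before CUDA so NPU hosts are not misclassified
    # as CUDA machines (the incorrect fallback this change fixes).
    if npu_available():
        return "npu"
    if cuda_available():
        return "cuda"
    return "cpu"
```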
2. Device-Agnostic Operations
File: roll/pipeline/base_worker.py
Changes:
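The exact edits are not listed above, but device-agnostic worker code typically resolves the backend by name instead of hard-coding torch.cuda. A hedged sketch, with illustrative helper names:

```python
def current_device_type():
    """Pick the accelerator backend name, preferring NPU over CUDA."""
    try:
        import torch
    except ImportError:
        return "cpu"
    for name in ("npu", "cuda"):  # NPU first, matching the platform order
        backend = getattr(torch, name, None)
        if backend is not None and backend.is_available():
            return name
    return "cpu"

def empty_cache():
    """Release cached allocator memory on whichever backend is active."""
    device_type = current_device_type()
    if device_type == "cpu":
        return
    import torch
    # torch.npu exists once torch_npu has been imported; torch.cuda always does.
    getattr(torch, device_type).empty_cache()
```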
3. MindSpeed Integration
File: mcore_adapter/src/mcore_adapter/training_args.py
Changes: Added an optional import of mindspeed.megatron_adaptor.
Reason: MindSpeed is Huawei's library of NPU-specific Megatron optimizations. The adaptor patches Megatron-Core for NPU compatibility, and wrapping the import in try-except preserves GPU compatibility.
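The optional-import pattern amounts to the following; the wrapper function name is illustrative, while the import side effect is the one described above:

```python
def maybe_enable_mindspeed():
    """Apply MindSpeed's Megatron-Core patches when the package is present.

    Importing mindspeed.megatron_adaptor has the side effect of patching
    Megatron-Core for Ascend NPUs; on GPU-only machines the package is
    simply absent and Megatron-Core runs unpatched.
    """
    try:
        import mindspeed.megatron_adaptor  # noqa: F401
        return True
    except ImportError:
        return False
```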
4. NPU Attention Mask Format
File: roll/distributed/strategy/megatron_strategy.py
Changes: Added an NPU-specific transformation of attention masks to 4D format.
Reason: NPU requires 4D attention masks [B, 1, S, S] instead of the standard 2D [B, S]. This hardware-specific requirement ensures correct attention computation on NPU.
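The shape transformation can be illustrated in plain Python. ROLL's actual code operates on torch tensors and the exact masking convention in megatron_strategy.py may differ; in this sketch True means "masked out", 1 marks a real token, and the causal constraint is folded in:

```python
def to_npu_attention_mask(mask_2d):
    """Expand a 2D padding mask [B, S] into the 4D [B, 1, S, S] layout.

    Pure-Python sketch of the shape change: each (query, key) pair is
    masked unless both positions are real tokens and the key does not
    lie in the future (causal attention).
    """
    batch = []
    for row in mask_2d:
        s = len(row)
        grid = [[not (row[q] and row[k] and k <= q) for k in range(s)]
                for q in range(s)]
        batch.append([grid])  # insert the singleton head dimension -> [1, S, S]
    return batch
```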
5. Optimizer Compatibility
File: roll/third_party/megatron/optimizer.py
Changes: Added support for the no_weight_decay_cond, scale_lr_cond, and lr_mult parameters.
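A simplified sketch of how these three arguments typically interact when building optimizer parameter groups, modeled on Megatron-LM's convention (not the exact ROLL implementation): no_weight_decay_cond and scale_lr_cond are predicates over (name, param), and lr_mult rescales the learning rate for parameters matching the latter.

```python
def build_param_groups(named_params, no_weight_decay_cond=None,
                       scale_lr_cond=None, lr_mult=1.0):
    """Partition parameters by (weight-decay, lr-scaling) behavior."""
    groups = {}
    for name, param in named_params:
        if no_weight_decay_cond is not None:
            no_wd = bool(no_weight_decay_cond(name, param))
        else:
            # Common default: biases and 1-D tensors skip weight decay.
            no_wd = name.endswith(".bias") or getattr(param, "ndim", 2) == 1
        scale = bool(scale_lr_cond(name, param)) if scale_lr_cond else False
        groups.setdefault((no_wd, scale), []).append(param)
    return [
        {"params": params,
         "use_weight_decay": not no_wd,
         "lr_mult": lr_mult if scale else 1.0}
        for (no_wd, scale), params in groups.items()
    ]
```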
6. Example Configurations
Files:
Reason: Provides ready-to-use NPU training examples demonstrating proper device mapping and strategy configuration for both DPO and RLVR pipelines.
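As an illustration only (the example file list is elided above, and these keys are not copied from the PR's files), an NPU worker section in a ROLL pipeline YAML would typically pair a Megatron strategy with an explicit device mapping along these lines:

```yaml
# Hypothetical fragment; consult the PR's example files for the real keys.
actor_train:
  strategy_args:
    strategy_name: megatron_train   # Megatron-Core backend (MindSpeed on NPU)
  device_mapping: list(range(0, 8)) # assign this worker to NPU devices 0-7
```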
Impact
Benefits:
Requirements