[BugFix][Runtime] Sanitize empty Ascend RT visible devices #1

Merged: moonandlife merged 2 commits into main from ws/fix-actions-26145186169-26145186140 on May 23, 2026

@moonandlife
Contributor

What this PR does / why we need it?

This hardens the Ascend runtime environment builder used by hust-ascend-manager runtime check.

When ASCEND_RT_VISIBLE_DEVICES is present but empty or whitespace-only, torch_npu can report npu_available=false and device_count=0 even when host NPUs are healthy. This PR:

  • normalizes visible-device environment values
  • ignores empty ASCEND_RT_VISIBLE_DEVICES
  • falls back to ASCEND_VISIBLE_DEVICES when a valid runtime-visible set is not present
  • adds regression tests covering empty and explicit visibility values
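The normalization and fallback described above could be sketched roughly as follows. This is an illustration only, not the code merged in this PR: `normalize_visible_devices` and `build_runtime_env` are hypothetical names, the comma-separated device list format is assumed, and dropping the variable entirely when neither source is usable is an assumption about the desired behavior.

```python
from typing import Dict, Mapping, Optional

RT_VAR = "ASCEND_RT_VISIBLE_DEVICES"
FALLBACK_VAR = "ASCEND_VISIBLE_DEVICES"


def normalize_visible_devices(raw: Optional[str]) -> Optional[str]:
    """Hypothetical helper: clean a comma-separated device list.

    Returns None when the value is unset, empty, or whitespace-only.
    """
    if raw is None:
        return None
    # Strip whitespace around each entry and discard empty entries.
    parts = [part.strip() for part in raw.split(",")]
    parts = [part for part in parts if part]
    return ",".join(parts) if parts else None


def build_runtime_env(parent: Mapping[str, str]) -> Dict[str, str]:
    """Hypothetical builder: copy the parent env and sanitize visibility."""
    env = dict(parent)
    visible = normalize_visible_devices(env.get(RT_VAR))
    if visible is None:
        # Fall back to the other visibility variable when the runtime one is unusable.
        visible = normalize_visible_devices(env.get(FALLBACK_VAR))
    if visible is None:
        # Assumption: with no usable value, remove the variable so the runtime
        # sees it as unset rather than as an empty device list.
        env.pop(RT_VAR, None)
    else:
        env[RT_VAR] = visible
    return env
```

With this sketch, a parent environment containing `ASCEND_RT_VISIBLE_DEVICES="  "` and `ASCEND_VISIBLE_DEVICES="0,1"` would yield a child environment where `ASCEND_RT_VISIBLE_DEVICES` is `"0,1"`.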

This is related to vLLM-HUST/vllm-hust PR 40, where benchmark preflight failed with torch_npu_import_ok=true but device_count=0.

Does this PR introduce any user-facing change?

Yes. CI and scripted runtime checks no longer misreport zero visible NPUs when the parent environment exports an empty or whitespace-only ASCEND_RT_VISIBLE_DEVICES.

How was this patch tested?

  • /root/miniconda3/envs/vllm-hust-dev/bin/python -m pytest tests/test_doctor.py tests/test_runtime.py -q
  • Reproduced the failure locally with ASCEND_RT_VISIBLE_DEVICES='' ... hust_ascend_manager.cli runtime check ...
  • Verified the same reproduction reports visible NPUs after this fix
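For context on the reproduction, a child process sees a variable exported as an empty string differently from one that is unset, which is why an empty ASCEND_RT_VISIBLE_DEVICES in the parent can change what the runtime reports. The sketch below uses only the Python standard library and assumes a POSIX system. It does not invoke hust_ascend_manager or torch_npu, and `child_view` is a hypothetical helper.

```python
import os
import subprocess
import sys

VAR = "ASCEND_RT_VISIBLE_DEVICES"
# One-line probe run in the child: print how the variable appears there.
PROBE = f"import os; print(repr(os.environ.get({VAR!r})))"


def child_view(env):
    """Run the probe in a child process with the given environment."""
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Start from the current environment with the variable removed.
base = {k: v for k, v in os.environ.items() if k != VAR}

unset_view = child_view(base)              # variable absent in the child
empty_view = child_view({**base, VAR: ""})  # variable present but empty

print("unset:", unset_view)
print("empty:", empty_view)
```

The two printed values differ, so code that only checks whether the variable is present will treat the empty export as an explicit (and empty) device selection.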

Duplicate-work check

No overlapping open PR was found in vLLM-HUST/ascend-runtime-manager for this runtime visibility sanitization.

AI assistance

AI assistance was used via GitHub Copilot.

moonandlife and others added 2 commits May 21, 2026 13:07
- ignore empty ASCEND_RT_VISIBLE_DEVICES values during runtime env construction
- fall back to ASCEND_VISIBLE_DEVICES when runtime visibility is unset
- add regression tests covering empty and explicit visibility values

Co-authored-by: GitHub Copilot <copilot@github.com>
Signed-off-by: moonandlife <moonandlife@qq.com>
Document the ASCEND_RT_VISIBLE_DEVICES normalization and non-standard runtime version detection added on the PR 1 branch.

Signed-off-by: moonandlife <moonandlife@qq.com>
@moonandlife moonandlife merged commit 5c42c9d into main May 23, 2026
1 check passed
@moonandlife moonandlife deleted the ws/fix-actions-26145186169-26145186140 branch May 23, 2026 03:16