Add MACA environment check documentation #18
Code Review
This pull request introduces a smoke test script (tools/run_flash_mla_smoke.py) and updates the README.md to document its usage. The smoke test runs a small correctness case of FlashMLA against the PyTorch reference implementation. Feedback on the changes suggests adding defensive checks for CUDA availability and device index validity before initializing the device to prevent low-level errors and provide clearer error messages.
    device = torch.device(args.device)
    if device.type != "cuda":
        raise ValueError("FlashMLA smoke test requires a CUDA-compatible MACA device.")
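Before initializing the CUDA device, consider checking torch.cuda.is_available() first and verifying that the requested device index is within the range of available devices (device.index < torch.cuda.device_count()). If the environment is not configured correctly (for example, the MACA driver is not loaded, or PyTorch was built without CUDA support), calling torch.cuda.set_device directly raises a low-level error that is hard to interpret. Adding these defensive checks produces a friendlier error message.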
    if not torch.cuda.is_available():
        raise RuntimeError("CUDA is not available. Please check your MACA driver and PyTorch installation.")
    device = torch.device(args.device)
    if device.type != "cuda":
        raise ValueError("FlashMLA smoke test requires a CUDA-compatible MACA device.")
    if device.index is not None and device.index >= torch.cuda.device_count():
        raise ValueError(f"Device index {device.index} is out of range. Total available devices: {torch.cuda.device_count()}")
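The same checks can also be factored into a small helper that is testable on a machine without a GPU. The sketch below is only an illustration and is not part of the PR: the name `check_device` and its parameters are hypothetical, the device string is parsed with a plain regular expression rather than `torch.device`, and the caller is assumed to pass in the results of `torch.cuda.is_available()` and `torch.cuda.device_count()`.

```python
import re
from typing import Optional

# Hypothetical helper, not taken from the PR. Accepts strings such as
# "cuda" or "cuda:1"; this is a simplified stand-in for torch.device parsing.
_DEVICE_RE = re.compile(r"(?P<type>[a-z]+)(?::(?P<index>\d+))?")


def check_device(device: str, cuda_available: bool, device_count: int) -> Optional[int]:
    """Validate a device string and return its index (None if omitted).

    `cuda_available` and `device_count` are assumed to come from
    torch.cuda.is_available() and torch.cuda.device_count().
    """
    match = _DEVICE_RE.fullmatch(device)
    if match is None:
        raise ValueError(f"Unrecognized device string: {device!r}")
    if match.group("type") != "cuda":
        raise ValueError("FlashMLA smoke test requires a CUDA-compatible MACA device.")
    if not cuda_available:
        raise RuntimeError(
            "CUDA is not available. Please check your MACA driver and PyTorch installation."
        )
    index_text = match.group("index")
    if index_text is None:
        return None  # no explicit index, so there is nothing to range-check
    index = int(index_text)
    if index >= device_count:
        raise ValueError(
            f"Device index {index} is out of range. Total available devices: {device_count}"
        )
    return index
```

In a script this might be called as `check_device(args.device, torch.cuda.is_available(), torch.cuda.device_count())`. Unlike the suggestion above, this sketch checks the device type before availability, so that a request such as `cpu` is reported as an unsupported device type rather than as a missing driver. That ordering is a judgment call, not something the review prescribes.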
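This PR adds the check steps and common-issue notes for FlashMLA in the MetaX environment, so that new users can follow one consistent procedure to confirm their setup before building.
This change targets the parts of MetaX GPU adaptation that most often affect the stability of development, builds, or validation. It moves problems that previously required manual troubleshooting forward into the toolchain, pre-run checks, or benchmark scripts. The implementation stays compatible with the existing default behavior and gives a more direct diagnostic only when it detects a clear configuration, input, or environment anomaly. It avoids introducing extra runtime dependencies and makes it easier for maintainers to review this branch on its own.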
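The corresponding branch has been validated in a MetaX compute environment. The validation record includes real run logs, command output, and failure-path checks. The local archive directory is E:/Documents/muxi/测试报告/FlashMLA_real_maca_validation_20260608. Submitted branch: mengz/document-maca-env-checks. Target repository: MetaX-MACA/FlashMLA.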