Record environment metadata in benchmarks #14
Code Review
This pull request introduces a new --metadata-json command-line argument to the bench_flash_mla.py benchmark script, allowing users to export environment metadata (such as Python, PyTorch, CUDA, MACA versions, and the current git commit) to a JSON file. Feedback on these changes suggests returning None instead of raw exception strings in _run_command to keep the JSON output clean, and ensuring that any missing parent directories for the metadata file are automatically created to prevent runtime errors.
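As a rough illustration of the kind of record described above, here is a minimal sketch. The function name `collect_metadata_sketch` and the key names are hypothetical and are not taken from the PR. MACA version detection is left out because the review does not show how the script obtains it.

```python
import json
import platform
import subprocess
import sys


def _run_command(command):
    """Return the command's stripped output, or None if it cannot run or fails."""
    try:
        return subprocess.check_output(
            command, stderr=subprocess.STDOUT, text=True
        ).strip()
    except (OSError, subprocess.CalledProcessError):
        return None


def collect_metadata_sketch():
    """Gather a small environment record; key names here are illustrative only."""
    try:
        import torch  # optional: fields stay None when PyTorch is not installed

        torch_version = torch.__version__
        cuda_version = torch.version.cuda  # None on builds without CUDA
    except ImportError:
        torch_version = None
        cuda_version = None
    return {
        "python": platform.python_version(),
        "torch": torch_version,
        "cuda": cuda_version,
        "git_commit": _run_command(["git", "rev-parse", "HEAD"]),
        "argv": sys.argv[1:],
    }


meta = collect_metadata_sketch()
print(json.dumps(meta, indent=2, sort_keys=True))
```

Fields that cannot be determined come out as null in the JSON, so the set of keys stays the same across machines.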
def _run_command(command):
    try:
        return subprocess.check_output(
            command, stderr=subprocess.STDOUT, text=True
        ).strip()
    except (OSError, subprocess.CalledProcessError) as err:
        return str(err)
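In _run_command, when the command fails (for example, git is not installed, or the current directory is not a git repository), returning the raw exception string (such as Command '['git', 'rev-parse', 'HEAD']' returned non-zero exit status 128.) makes the exported JSON metadata cluttered and hard to parse. Consider returning None on exception instead. It is serialized as null in JSON, which is cleaner and more consistent.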
Suggested change:
-def _run_command(command):
-    try:
-        return subprocess.check_output(
-            command, stderr=subprocess.STDOUT, text=True
-        ).strip()
-    except (OSError, subprocess.CalledProcessError) as err:
-        return str(err)
+def _run_command(command):
+    try:
+        return subprocess.check_output(
+            command, stderr=subprocess.STDOUT, text=True
+        ).strip()
+    except (OSError, subprocess.CalledProcessError):
+        return None
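To show the effect of this suggestion, the sketch below runs the None-returning variant against a command that exits non-zero and a command that does not exist. Both are hypothetical stand-ins for the git failure cases mentioned in the comment. The dictionary keys are made up for the demonstration.

```python
import json
import subprocess
import sys


def _run_command(command):
    """Return the command's stripped output, or None if it cannot run or fails."""
    try:
        return subprocess.check_output(
            command, stderr=subprocess.STDOUT, text=True
        ).strip()
    except (OSError, subprocess.CalledProcessError):
        return None


# Exits with status 128, standing in for `git rev-parse HEAD` outside a repository.
failing = [sys.executable, "-c", "import sys; sys.exit(128)"]
# A binary that is assumed not to exist, standing in for a missing git install.
missing = ["hypothetical-missing-binary-for-demo"]
# A command that succeeds and prints a fake commit id.
succeeding = [sys.executable, "-c", "print('abc123')"]

metadata = {
    "git_commit": _run_command(failing),
    "tool_version": _run_command(missing),
    "ok": _run_command(succeeding),
}
serialized = json.dumps(metadata, indent=2, sort_keys=True)
print(serialized)
```

Both failure modes collapse to null in the output, so a consumer can test for a missing value without parsing error text.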
def write_benchmark_metadata(path):
    with open(path, "w", encoding="utf-8") as fout:
        json.dump(collect_benchmark_metadata(), fout, indent=2, sort_keys=True)
        fout.write("\n")
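In write_benchmark_metadata, if the user passes --metadata-json a path that includes directories (for example output/metadata.json) and the parent directory does not exist, open(path, "w") raises FileNotFoundError and aborts the benchmark. Consider calling path.parent.mkdir(parents=True, exist_ok=True) before writing (with path converted to a pathlib.Path) so that missing parent directories are created automatically and the script is more robust.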
Suggested change:
-def write_benchmark_metadata(path):
-    with open(path, "w", encoding="utf-8") as fout:
-        json.dump(collect_benchmark_metadata(), fout, indent=2, sort_keys=True)
-        fout.write("\n")
+def write_benchmark_metadata(path):
+    path = Path(path)
+    path.parent.mkdir(parents=True, exist_ok=True)
+    with open(path, "w", encoding="utf-8") as fout:
+        json.dump(collect_benchmark_metadata(), fout, indent=2, sort_keys=True)
+        fout.write("\n")
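The failure mode and the fix can be reproduced in a temporary directory. This is a minimal sketch: `write_metadata` is a hypothetical stand-in that takes the metadata as an argument rather than calling collect_benchmark_metadata, and the payload is a placeholder.

```python
import json
import tempfile
from pathlib import Path


def write_metadata(path, metadata):
    """Write metadata as JSON, creating missing parent directories first."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as fout:
        json.dump(metadata, fout, indent=2, sort_keys=True)
        fout.write("\n")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "output" / "nested" / "metadata.json"

    # A plain open() fails here because "output/nested" does not exist yet.
    try:
        open(target, "w", encoding="utf-8").close()
        plain_open_failed = False
    except FileNotFoundError:
        plain_open_failed = True

    write_metadata(target, {"python": "placeholder"})
    # A second call is safe: exist_ok=True tolerates directories that already exist.
    write_metadata(target, {"python": "placeholder"})
    loaded = json.loads(target.read_text(encoding="utf-8"))
```

Note that the suggested change relies on Path being imported from pathlib in the benchmark script.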
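This PR makes the benchmark output carry MACA toolchain, device, and run-parameter information, so that performance numbers are reproducible and results from different images or driver versions can be compared.
The change targets parts of the MetaX GPU adaptation work that tend to affect the stability of development, builds, or validation, moving problems that previously required manual troubleshooting into the toolchain, pre-run checks, or benchmark scripts. The implementation stays compatible with existing default behavior and only emits more direct diagnostics when it detects a clearly abnormal configuration, input, or environment. It introduces no additional runtime dependencies, which also makes the branch easier for maintainers to review on its own.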
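The branch has been validated in a MetaX compute environment. The validation record includes real run logs, command output, and failure-path checks, and is archived locally at E:/Documents/muxi/测试报告/FlashMLA_real_maca_validation_20260608. Submitted branch: mengz/benchmark-env-metadata. Target repository: MetaX-MACA/FlashMLA.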