Record environment metadata in benchmark runs #14

Open
ghangz wants to merge 1 commit into MetaX-MACA:main from ghangz:mengz/benchmark-env-metadata

ghangz commented Jun 8, 2026

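This PR makes the benchmark output carry MACA toolchain, device, and run-parameter information, so that performance data is reproducible and results from different images or driver versions can be compared.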

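This change targets the parts of MetaX GPU adaptation that most often affect the stability of development, builds, or validation. It moves problems that previously required manual troubleshooting into the toolchain, pre-run checks, or benchmark scripts. The implementation stays compatible with existing default behavior: it emits more direct diagnostics only when it detects an explicit configuration, input, or environment anomaly, it introduces no additional runtime dependencies, and it keeps the branch easy for maintainers to review on its own.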

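The branch has been validated in a MetaX compute environment. The validation record includes real run logs, command output, and failure-path checks, and is archived locally at: E:/Documents/muxi/测试报告/FlashMLA_real_maca_validation_20260608. Source branch: mengz/benchmark-env-metadata. Target repository: MetaX-MACA/FlashMLA.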

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new --metadata-json command-line argument to the bench_flash_mla.py benchmark script, allowing users to export environment metadata (such as Python, PyTorch, CUDA, MACA versions, and the current git commit) to a JSON file. Feedback on these changes suggests returning None instead of raw exception strings in _run_command to keep the JSON output clean, and ensuring that any missing parent directories for the metadata file are automatically created to prevent runtime errors.
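As a rough illustration of the kind of record described above, the sketch below gathers a few environment fields using the standard library plus an optional PyTorch import. The function name `collect_metadata_sketch` and the key names are hypothetical; the PR's actual `collect_benchmark_metadata` may collect different fields. MACA toolchain and device queries are left out because the commands used for them are not shown in this conversation.

```python
import json
import platform
import subprocess
import sys


def _git_commit():
    """Return the current git commit hash, or None if it cannot be determined."""
    try:
        return subprocess.check_output(
            ["git", "rev-parse", "HEAD"], stderr=subprocess.DEVNULL, text=True
        ).strip()
    except (OSError, subprocess.CalledProcessError):
        return None


def collect_metadata_sketch(argv=None):
    """Hypothetical stand-in for collect_benchmark_metadata (key names are assumptions)."""
    try:
        import torch  # optional: the sketch still works without PyTorch

        torch_version = str(torch.__version__)
    except Exception:
        torch_version = None
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "torch": torch_version,
        "git_commit": _git_commit(),
        "argv": list(sys.argv if argv is None else argv),
    }


if __name__ == "__main__":
    print(json.dumps(collect_metadata_sketch(), indent=2, sort_keys=True))
```

Recording the command-line arguments next to the version fields is one simple way to keep run parameters attached to the numbers they produced.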

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +515 to +521
def _run_command(command):
    try:
        return subprocess.check_output(
            command, stderr=subprocess.STDOUT, text=True
        ).strip()
    except (OSError, subprocess.CalledProcessError) as err:
        return str(err)

medium

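In _run_command, when the command fails (for example, git is not installed, or the current directory is not a git repository), returning the raw exception string (such as Command '['git', 'rev-parse', 'HEAD']' returned non-zero exit status 128.) makes the emitted JSON metadata messy and hard to parse. Consider returning None on exception instead, so the value appears as null in the JSON, which is cleaner and more consistent.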

Suggested change
- def _run_command(command):
-     try:
-         return subprocess.check_output(
-             command, stderr=subprocess.STDOUT, text=True
-         ).strip()
-     except (OSError, subprocess.CalledProcessError) as err:
-         return str(err)
+ def _run_command(command):
+     try:
+         return subprocess.check_output(
+             command, stderr=subprocess.STDOUT, text=True
+         ).strip()
+     except (OSError, subprocess.CalledProcessError):
+         return None
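To make the effect of this suggestion concrete, the following self-contained sketch runs the suggested `_run_command` against a command that succeeds, one that exits non-zero, and one that does not exist, then serializes the results. The test commands are chosen for illustration only and are not part of the PR.

```python
import json
import subprocess
import sys


def _run_command(command):
    """Return the command's stripped output, or None if it fails or is missing."""
    try:
        return subprocess.check_output(
            command, stderr=subprocess.STDOUT, text=True
        ).strip()
    except (OSError, subprocess.CalledProcessError):
        return None


# A command that succeeds, one that exits non-zero, and one that does not exist.
ok = _run_command([sys.executable, "-c", "print('hello')"])
failed = _run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
missing = _run_command(["this-command-should-not-exist-xyz"])

# Failures serialize as JSON null rather than as an exception message.
print(json.dumps({"ok": ok, "failed": failed, "missing": missing}, sort_keys=True))
# prints: {"failed": null, "missing": null, "ok": "hello"}
```

One trade-off is that null no longer says why the lookup failed. If that matters, the reason could be logged separately instead of being stored in the metadata value.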

Comment on lines +553 to +556
def write_benchmark_metadata(path):
    with open(path, "w", encoding="utf-8") as fout:
        json.dump(collect_benchmark_metadata(), fout, indent=2, sort_keys=True)
        fout.write("\n")

medium

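In write_benchmark_metadata, if the user passes --metadata-json a path that contains directory components (for example, output/metadata.json) and the parent directory does not exist, open(path, "w") raises FileNotFoundError and aborts the benchmark. Consider calling path.parent.mkdir(parents=True, exist_ok=True) before writing so that the parent directory is created automatically, which makes the script more robust.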

Suggested change
- def write_benchmark_metadata(path):
-     with open(path, "w", encoding="utf-8") as fout:
-         json.dump(collect_benchmark_metadata(), fout, indent=2, sort_keys=True)
-         fout.write("\n")
+ def write_benchmark_metadata(path):
+     path = Path(path)
+     path.parent.mkdir(parents=True, exist_ok=True)
+     with open(path, "w", encoding="utf-8") as fout:
+         json.dump(collect_benchmark_metadata(), fout, indent=2, sort_keys=True)
+         fout.write("\n")
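The suggestion relies on `Path`, so it needs `from pathlib import Path` in the script; whether bench_flash_mla.py already imports it is not visible here. The sketch below exercises the same pattern in a temporary directory. The helper name `write_metadata` and the sample payload are hypothetical and stand in for `write_benchmark_metadata` and `collect_benchmark_metadata()`.

```python
import json
import tempfile
from pathlib import Path


def write_metadata(path, metadata):
    """Write metadata as JSON, creating missing parent directories first."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as fout:
        json.dump(metadata, fout, indent=2, sort_keys=True)
        fout.write("\n")


with tempfile.TemporaryDirectory() as tmp:
    # Neither "output" nor "nested" exists yet; mkdir(parents=True) creates both.
    target = Path(tmp) / "output" / "nested" / "metadata.json"
    write_metadata(target, {"git_commit": None})
    written = target.read_text(encoding="utf-8")

print(written)
```

With `exist_ok=True`, the call is also safe when the directory already exists, so repeated benchmark runs can write to the same output folder.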
