Add an operator source inventory for the operator library #41

Open
ghangz wants to merge 2 commits into MetaX-MACA:main from ghangz:mengz/mcoplib-op-source-inventory
@ghangz ghangz commented Jun 10, 2026

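This change adds a source inventory for the operators in the operator library. It addresses the problem that the relevant information in the library's build and diagnostic workflows is scattered and costly to compile by hand, and it makes routine troubleshooting, verification, and archiving of results more direct.

The implementation adds the corresponding tool or script logic together with matching tests, and keeps existing usage unchanged as far as possible so that established workflows are not affected.

This branch has been verified in practice in a MetaX (沐曦) compute environment, all relevant checks have passed, and it is now submitted for merging.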

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a script to build a JSON inventory of operator source groups and adds corresponding unit tests. The reviewer identified a critical issue where the "native" group recursively includes files from other sub-groups (like "vllm" or "sglang") due to overlapping directory paths, leading to duplicate counting. The reviewer provided code suggestions to filter out these overlapping files from the "native" group and to update the unit tests to verify this deduplication logic.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +32 to +37
def build_inventory(root: Path) -> dict[str, object]:
    groups: dict[str, object] = {}
    for name, relative_dir in GROUPS.items():
        files = _sources(root, relative_dir)
        groups[name] = {"root": relative_dir, "count": len(files), "files": files}
    return {"root": str(root), "groups": groups}

high

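In the current implementation, "native" in GROUPS maps to the path "op". Because the _sources function uses rglob("*") to search recursively, every source file under subdirectories such as "op/vllm" and "op/sglang" is also counted in the "native" group. This produces redundant data and makes the "native" group's count and file list inaccurate.

When building the inventory, consider excluding from the "native" group any files already covered by other, more specific groups (such as vllm, sglang, lmdeploy, cv).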

def build_inventory(root: Path) -> dict[str, object]:
    raw_files: dict[str, list[str]] = {}
    for name, relative_dir in GROUPS.items():
        raw_files[name] = _sources(root, relative_dir)

    # Exclude files already covered by the other, more specific groups so they are not counted again under "native"
    other_files = set()
    for name in GROUPS:
        if name != "native":
            other_files.update(raw_files[name])

    if "native" in raw_files:
        raw_files["native"] = [f for f in raw_files["native"] if f not in other_files]

    groups: dict[str, object] = {}
    for name, relative_dir in GROUPS.items():
        files = raw_files[name]
        groups[name] = {"root": relative_dir, "count": len(files), "files": files}
    return {"root": str(root), "groups": groups}
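A possible generalization of the suggestion above, as a sketch only: rather than special-casing "native", each group drops any file that lives under another group's directory nested inside its own. The GROUPS mapping, SOURCE_SUFFIXES, and the body of _sources below are hypothetical stand-ins, since the real definitions are not shown in this thread; _nested_dirs is a helper name introduced here for illustration.

```python
from pathlib import Path, PurePosixPath

# Hypothetical stand-ins: the PR's real GROUPS and _sources are not shown here.
GROUPS = {"native": "op", "vllm": "op/vllm", "sglang": "op/sglang"}
SOURCE_SUFFIXES = {".cu", ".cuh", ".cpp", ".cc", ".h"}


def _sources(root: Path, relative_dir: str) -> list[str]:
    """Recursively list source files under root/relative_dir, relative to root."""
    base = root / relative_dir
    if not base.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix()
        for p in base.rglob("*")
        if p.is_file() and p.suffix in SOURCE_SUFFIXES
    )


def _nested_dirs(relative_dir: str) -> list[PurePosixPath]:
    """Directories of other groups that sit strictly below relative_dir."""
    parent = PurePosixPath(relative_dir)
    return [
        PurePosixPath(other)
        for other in GROUPS.values()
        if parent in PurePosixPath(other).parents
    ]


def build_inventory(root: Path) -> dict[str, object]:
    groups: dict[str, object] = {}
    for name, relative_dir in GROUPS.items():
        nested = _nested_dirs(relative_dir)
        # Keep only files that no nested, more specific group directory contains.
        files = [
            f
            for f in _sources(root, relative_dir)
            if not any(d in PurePosixPath(f).parents for d in nested)
        ]
        groups[name] = {"root": relative_dir, "count": len(files), "files": files}
    return {"root": str(root), "groups": groups}
```

Compared with the set-difference approach, this variant decides ownership from the directory layout alone, so it would keep working if another group were later nested inside a directory other than "op". Whether that generality is needed depends on how GROUPS is expected to evolve.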

Comment on lines +13 to +23
    def test_inventory_counts_group_sources(self):
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            (root / "op" / "vllm").mkdir(parents=True)
            (root / "op" / "vllm" / "kernel.cu").write_text("", encoding="utf-8")
            (root / "op" / "vllm" / "README.md").write_text("", encoding="utf-8")

            inventory = build_inventory(root)

        self.assertEqual(inventory["groups"]["vllm"]["count"], 1)
        self.assertEqual(inventory["groups"]["vllm"]["files"], ["op/vllm/kernel.cu"])

medium

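To go with the change to the native group's deduplication logic, consider adding assertions on the native group to the unit test, so that it verifies the group does not wrongly include files from other subgroups (such as vllm) and that it correctly counts the source files belonging to native itself.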

Suggested change
-    def test_inventory_counts_group_sources(self):
-        with tempfile.TemporaryDirectory() as tmp:
-            root = Path(tmp)
-            (root / "op" / "vllm").mkdir(parents=True)
-            (root / "op" / "vllm" / "kernel.cu").write_text("", encoding="utf-8")
-            (root / "op" / "vllm" / "README.md").write_text("", encoding="utf-8")
-            inventory = build_inventory(root)
-        self.assertEqual(inventory["groups"]["vllm"]["count"], 1)
-        self.assertEqual(inventory["groups"]["vllm"]["files"], ["op/vllm/kernel.cu"])
+    def test_inventory_counts_group_sources(self):
+        with tempfile.TemporaryDirectory() as tmp:
+            root = Path(tmp)
+            (root / "op" / "vllm").mkdir(parents=True)
+            (root / "op" / "vllm" / "kernel.cu").write_text("", encoding="utf-8")
+            (root / "op" / "vllm" / "README.md").write_text("", encoding="utf-8")
+            (root / "op" / "native_kernel.cu").write_text("", encoding="utf-8")
+            inventory = build_inventory(root)
+        self.assertEqual(inventory["groups"]["vllm"]["count"], 1)
+        self.assertEqual(inventory["groups"]["vllm"]["files"], ["op/vllm/kernel.cu"])
+        self.assertEqual(inventory["groups"]["native"]["count"], 1)
+        self.assertEqual(inventory["groups"]["native"]["files"], ["op/native_kernel.cu"])
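Beyond per-group assertions, a test could also check the general property that no file is listed by more than one group. The helper below is a hypothetical sketch (duplicated_files is not part of the PR); it relies only on the inventory shape shown in this thread, where each group carries a "files" list.

```python
def duplicated_files(inventory: dict) -> dict[str, list[str]]:
    """Map each file listed by more than one group to the groups that list it."""
    owners: dict[str, list[str]] = {}
    for name, group in inventory["groups"].items():
        for path in group["files"]:
            owners.setdefault(path, []).append(name)
    return {path: names for path, names in owners.items() if len(names) > 1}


# Hand-built inventory that mimics the overlap described in the review comment.
overlapping = {
    "root": "/tmp/example",
    "groups": {
        "native": {
            "root": "op",
            "count": 2,
            "files": ["op/native_kernel.cu", "op/vllm/kernel.cu"],
        },
        "vllm": {"root": "op/vllm", "count": 1, "files": ["op/vllm/kernel.cu"]},
    },
}

print(duplicated_files(overlapping))
# → {'op/vllm/kernel.cu': ['native', 'vllm']}
```

In a unit test this could be used as `self.assertEqual(duplicated_files(inventory), {})`, which would fail on the original implementation and pass once the overlap is removed.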
