WebGPU WPA Reproduction Project

This is a code optimization project based on retrieval-augmented generation (RAG).

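module_inference/: generates POG (Performance Optimization Guidance) from the source code and the observed performance data
detection_module_rule_based/: retrieves the optimization cases from the knowledge base that best match the current bottleneck
detection_module_LLM_based/: generates the final optimized code from the POG and the retrieved cases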

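Note: the full system described in the paper uses multimodal RAG, a domain knowledge base, and online inference models. By default, this project uses interpretable rules, TF-IDF retrieval, and a replaceable LLM optimizer. The optimizer interface is designed to be swappable, so it can later be backed by the DeepSeek, Qwen, or OpenAI API.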
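As a rough illustration of the TF-IDF retrieval step, the sketch below ranks case descriptions against a query using only the standard library. It is not the code in retriever.py, which may well rely on a library instead; the function names (tokenize, tfidf_vectors, cosine, top_k) and the smoothed IDF formula are assumptions made for this example.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on anything that is not an ASCII letter, digit, or underscore."""
    return [t for t in re.split(r"[^a-z0-9_]+", text.lower()) if t]


def tfidf_vectors(docs: List[str]) -> Tuple[List[Dict[str, float]], Dict[str, float]]:
    """Return one sparse TF-IDF vector per document plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumption): terms found in every document keep a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]
    return vectors, idf


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, docs: List[str], k: int) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the k best matches, best first."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the corpus carry no weight and are skipped.
    q_vec = {t: c * idf[t] for t, c in Counter(tokenize(query)).items() if t in idf}
    scored = [(i, cosine(q_vec, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A call such as top_k("frequent createBindGroup per frame", case_texts, 5) would return the indices of the five closest cases. Note that this tokenizer drops non-ASCII text, so a knowledge base written in Chinese would need a different tokenization strategy.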

Directory structure

src/
├── detection_module_LLM_based/
│   └── code_generator.py
├── detection_module_rule_based/
│   └── retriever.py
├── module_inference/
│   └── pog_generator.py
├── common/
│   ├── io_utils.py
│   └── models.py
└── pipeline.py

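1. Offline phase: knowledge base construction

Chapter 5 of the paper emphasizes building a WebGPU domain knowledge base that forms a "problem/symptom -> POG -> optimized code" chain. This project represents that structure with JSON samples in data/knowledge_base/:

symptoms: symptoms and observed characteristics
pog: the corresponding performance optimization guidance
before_code / after_code: code snippets before and after optimization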

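2. Online phase: WPA inference flow

At run time, the pipeline performs the following steps:

Read the input code sample_webgpu_app.js
Read the performance data sample_perf.json
module_inference generates the POG
detection_module_rule_based performs Top-K retrieval
detection_module_LLM_based generates the optimized code
Output: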
- pog.json
- retrieved_cases.json
- generation_summary.json
- optimized.js
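
The flow above can be sketched as a single orchestration function. This is an assumption-laden outline rather than the contents of pipeline.py: the three stages are passed in as plain callables, and the function name run_pipeline, the stage signatures, and the "code" / "summary" keys of the optimizer result are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, List


def run_pipeline(
    source_code: str,
    perf: Dict[str, Any],
    cases: List[Dict[str, Any]],
    out_dir: Path,
    generate_pog: Callable[[str, Dict[str, Any]], List[str]],
    retrieve: Callable[[List[str], List[Dict[str, Any]], int], List[Dict[str, Any]]],
    optimize: Callable[[str, List[str], List[Dict[str, Any]]], Dict[str, Any]],
    top_k: int = 5,
) -> List[str]:
    """Run the three stages in order and write the four output files."""
    out_dir.mkdir(parents=True, exist_ok=True)

    pog = generate_pog(source_code, perf)           # role of module_inference
    retrieved = retrieve(pog, cases, top_k)         # role of detection_module_rule_based
    result = optimize(source_code, pog, retrieved)  # role of detection_module_LLM_based

    def dump(name: str, payload: Any) -> None:
        # ensure_ascii=False keeps non-ASCII text readable in the JSON files.
        text = json.dumps(payload, ensure_ascii=False, indent=2)
        (out_dir / name).write_text(text, encoding="utf-8")

    dump("pog.json", pog)
    dump("retrieved_cases.json", retrieved)
    dump("generation_summary.json", result["summary"])
    (out_dir / "optimized.js").write_text(result["code"], encoding="utf-8")
    return sorted(p.name for p in out_dir.iterdir())


def demo() -> List[str]:
    """Exercise the flow with trivial stand-in stages in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        return run_pipeline(
            source_code="// app",
            perf={},
            cases=[{"case_id": "case_001"}],
            out_dir=Path(tmp) / "output",
            generate_pog=lambda code, perf: ["cache the BindGroup"],
            retrieve=lambda pog, cases, k: cases[:k],
            optimize=lambda code, pog, cases: {"code": code, "summary": {"applied": pog}},
        )
```

Passing the stages as callables keeps the orchestration independent of how each module is implemented, which matches the stated goal of a replaceable optimizer.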

Installation

cd webgpu_wpa_repro
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run

PYTHONPATH=. python -m src.pipeline \
  --code examples/input/sample_webgpu_app.js \
  --perf examples/input/sample_perf.json \
  --kb data/knowledge_base \
  --top-k 5 \
  --out-dir examples/output
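
For orientation, the flags in the command above could be parsed with argparse roughly as follows. Only the flag names come from the command itself; which flags are required, the default of 5 for --top-k, and the help strings are assumptions, and the actual parser in src/pipeline.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the five flags used in the run command."""
    parser = argparse.ArgumentParser(
        prog="src.pipeline",
        description="Generate POG, retrieve similar cases, and emit optimized code.",
    )
    parser.add_argument("--code", required=True, help="path to the input WebGPU source file")
    parser.add_argument("--perf", required=True, help="path to the performance data JSON")
    parser.add_argument("--kb", required=True, help="directory containing knowledge base JSON files")
    # Default value is an assumption; the README example passes 5 explicitly.
    parser.add_argument("--top-k", type=int, default=5, help="number of cases to retrieve")
    parser.add_argument("--out-dir", required=True, help="directory for the generated files")
    return parser
```

argparse converts dashes in flag names to underscores, so the parsed values are available as args.top_k and args.out_dir.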

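Expected output

After the run, examples/output/ contains:

pog.json: the list of POG items
retrieved_cases.json: the knowledge base cases matched by retrieval
generation_summary.json: which optimizations the generator applied
optimized.js: the automatically generated optimized code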

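How to switch to a real LLM

You can replace MockLLMOptimizer in src/detection_module_LLM_based/code_generator.py with a caller for a real model:

Input: original code + POG + Top-K retrieved cases
Output: the complete optimized code
We recommend keeping the current interface: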

optimize(source_code: str, pog_items: List[POGItem], cases: List[RetrievedCase]) -> GenerationResult
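
One way to keep that interface while swapping in a real model is to inject the model call as a plain function, so the optimizer class does not depend on any particular vendor SDK. The sketch below is illustrative only: the fields of POGItem, RetrievedCase, and GenerationResult are guesses (the real definitions presumably live in src/common/models.py), and CallableLLMOptimizer, build_prompt, and the prompt wording are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class POGItem:
    # Hypothetical fields; check the project's own model definitions.
    issue: str
    guidance: str


@dataclass
class RetrievedCase:
    # Hypothetical fields.
    case_id: str
    score: float
    after_code: str


@dataclass
class GenerationResult:
    # Hypothetical fields.
    optimized_code: str
    applied: List[str] = field(default_factory=list)


class CallableLLMOptimizer:
    """Optimizer that delegates generation to an injected text-completion function."""

    def __init__(self, complete: Callable[[str], str]) -> None:
        # `complete` takes a prompt and returns the model's text output.
        self._complete = complete

    def build_prompt(
        self, source_code: str, pog_items: List[POGItem], cases: List[RetrievedCase]
    ) -> str:
        guidance = "\n".join(f"- {p.issue}: {p.guidance}" for p in pog_items)
        examples = "\n\n".join(f"// {c.case_id}\n{c.after_code}" for c in cases)
        return (
            "Rewrite the WebGPU code below following the guidance.\n\n"
            f"Guidance:\n{guidance}\n\n"
            f"Reference cases:\n{examples}\n\n"
            f"Code:\n{source_code}\n"
        )

    def optimize(
        self, source_code: str, pog_items: List[POGItem], cases: List[RetrievedCase]
    ) -> GenerationResult:
        prompt = self.build_prompt(source_code, pog_items, cases)
        optimized = self._complete(prompt)
        return GenerationResult(optimized_code=optimized, applied=[p.issue for p in pog_items])
```

In practice, `complete` would wrap whichever client is chosen (DeepSeek, Qwen, or OpenAI); in tests it can be a lambda that returns a fixed string.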

How to extend the knowledge base

Add more JSON files to data/knowledge_base/. Recommended fields:

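{
  "case_id": "case_xxx",
  "title": "problem title",
  "tags": "keywords",
  "symptoms": ["observed symptom 1", "observed symptom 2"],
  "pog": ["optimization suggestion 1", "optimization suggestion 2"],
  "before_code": "code before optimization",
  "after_code": "code after optimization"
}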
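
Before adding a new case, it can help to check that it has the fields listed above with the expected types. The following is a minimal sketch; treating every recommended field as mandatory is an assumption, and validate_case and REQUIRED_FIELDS are hypothetical names that are not part of the project.

```python
from typing import Any, Dict, List

# Field name -> expected Python type, mirroring the JSON template above.
REQUIRED_FIELDS = {
    "case_id": str,
    "title": str,
    "tags": str,
    "symptoms": list,
    "pog": list,
    "before_code": str,
    "after_code": str,
}


def validate_case(case: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the case looks valid."""
    problems: List[str] = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in case:
            problems.append(f"missing field: {name}")
        elif not isinstance(case[name], expected):
            actual = type(case[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    # The two list fields should contain only strings.
    for name in ("symptoms", "pog"):
        value = case.get(name)
        if isinstance(value, list) and not all(isinstance(v, str) for v in value):
            problems.append(f"{name}: every entry must be a string")
    return problems
```

A case loaded with json.load can be passed directly to validate_case, and the file can be rejected or flagged if the returned list is not empty.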

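Typical optimization rules covered by the current implementation

Frequent createBindGroup calls -> cache the BindGroup
Synchronous createComputePipeline -> createComputePipelineAsync
Frequent queue.writeBuffer calls -> batch the writes
Frequent mapAsync calls -> reduce readbacks; switch to double buffering or deferred reads
High CPU / low GPU utilization -> merge commands and reuse resources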
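
To make the rule list concrete, here is a small sketch of how such rules might be evaluated. The input shape (a per-frame call count per API plus CPU and GPU utilization as fractions), the thresholds, the rule identifiers, and the function name detect are all assumptions for illustration; the format of sample_perf.json and the project's actual rule logic are not shown in this README.

```python
import re
from typing import Dict, List, Tuple

# API name -> (rule id, guidance) for calls that are costly when issued every frame.
HOT_CALL_RULES: Dict[str, Tuple[str, str]] = {
    "createBindGroup": ("cache_bind_group", "Cache the BindGroup instead of recreating it."),
    "queue.writeBuffer": ("batch_write_buffer", "Batch buffer writes into fewer calls."),
    "mapAsync": ("reduce_readback", "Reduce readbacks; use double buffering or deferred reads."),
}

# Matches the synchronous call but not createComputePipelineAsync(...).
SYNC_PIPELINE = re.compile(r"\bcreateComputePipeline\s*\(")


def detect(
    source_code: str,
    calls_per_frame: Dict[str, float],
    cpu_util: float = 0.0,
    gpu_util: float = 1.0,
    hot_threshold: float = 1.0,  # assumed: one or more calls per frame counts as frequent
) -> List[Tuple[str, str]]:
    """Return (rule id, guidance) pairs for every rule that fires."""
    findings: List[Tuple[str, str]] = []
    for api, rule in HOT_CALL_RULES.items():
        if calls_per_frame.get(api, 0.0) >= hot_threshold:
            findings.append(rule)
    if SYNC_PIPELINE.search(source_code):
        findings.append(("async_pipeline", "Use createComputePipelineAsync."))
    # Assumed thresholds for a CPU-bound frame: CPU above 80%, GPU below 40%.
    if cpu_util > 0.8 and gpu_util < 0.4:
        findings.append(("merge_commands", "Merge commands and reuse resources."))
    return findings
```

Each finding maps naturally to one POG entry, and the guidance text can double as the retrieval query for the knowledge base.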
