Draft
51 commits
e312159
feat: first stage of vmi
mouliangyu Jun 17, 2026
ab6bc04
feat: support num_groups layout
mouliangyu Jun 18, 2026
9d63f30
feat: new layout-lowering design
mouliangyu Jun 21, 2026
50bffab
Add VMI layout assignment lowering coverage
mouliangyu Jun 22, 2026
d225422
Support S32 partial grouped mask lowering
mouliangyu Jun 22, 2026
bd18dc4
Support dynamic S32 grouped mask lowering
mouliangyu Jun 22, 2026
fcf1096
Clarify VMI layout case coverage gaps
mouliangyu Jun 22, 2026
6f04810
Record VMI layout coverage audit
mouliangyu Jun 22, 2026
4b3d5be
Add dynamic S32 group mask runtime coverage
mouliangyu Jun 22, 2026
604fd50
Detail VMI layout assignment request rules
mouliangyu Jun 22, 2026
e353ae0
Complete VMI layout request builder coverage
mouliangyu Jun 22, 2026
cf9a04d
Inline private VMI physical helpers before VPTO emission
mouliangyu Jun 22, 2026
e96ba6c
Validate required VMI selected plans
mouliangyu Jun 22, 2026
c1e74fb
Document VMI layout closure matrix
mouliangyu Jun 22, 2026
067f699
Add VMI dense reduce multi-consumer case
mouliangyu Jun 22, 2026
e550b80
Remove VMI selected plan attrs
mouliangyu Jun 22, 2026
bb88c2c
Implement VMI layout optimization pipeline
mouliangyu Jun 22, 2026
7686028
Support multi-chunk VMI group reduce slots
mouliangyu Jun 22, 2026
58787c2
Implement typed VMI group reduce lowering
mouliangyu Jun 22, 2026
221f02e
Implement VMI layout support lowering
mouliangyu Jun 23, 2026
85a98cb
Support partial packed VMI group slots
mouliangyu Jun 23, 2026
c9604ad
Support arith select in VPTO LLVM lowering
mouliangyu Jun 23, 2026
fd5fc11
Add VMI introduction design doc
mouliangyu Jun 23, 2026
46942f0
Fold deinterleaved VMI loads through vldsx2
mouliangyu Jun 23, 2026
910a2a9
Document VMI layout assignment mechanism
mouliangyu Jun 23, 2026
31abc14
Illustrate VMI layout equivalence classes
mouliangyu Jun 23, 2026
f5c27d4
Add VMI histogram lowering support
mouliangyu Jun 24, 2026
0f32642
Remove VMI load full read attribute
mouliangyu Jun 24, 2026
fbdda7b
Define VMI scatter as unique-index op
mouliangyu Jun 24, 2026
f9013c2
Add VMI group max quant kernel case
mouliangyu Jun 24, 2026
e3e3cc9
feat: add cce-aligned vmi kernel cases
mouliangyu Jun 24, 2026
caa2548
fix: adapt vgather2 u16 offset carrier
mouliangyu Jun 28, 2026
536ba5a
Rename VMI sparse layout to lane stride
mouliangyu Jun 28, 2026
e2658b4
Adjust VMI reduction result shapes
mouliangyu Jun 28, 2026
983417a
Add VMI simdvf per-block FP8 cast kernel case
mouliangyu Jun 28, 2026
df6c703
Support two-way VMI interleaved memory ops
mouliangyu Jun 29, 2026
5390335
Add VMI group broadcast E2B lowering
mouliangyu Jun 29, 2026
d484147
Rename VMI layout fold pass
mouliangyu Jun 29, 2026
2670797
Implement VMI relation-aware rematerialization
mouliangyu Jun 29, 2026
0b0826a
Optimize equivalent VPTO vcvt normalization
mouliangyu Jun 29, 2026
4c2c8fd
Run public LICM in VMI pipeline
mouliangyu Jun 29, 2026
5aafe73
Support VMI u8 to u16 integer extension
mouliangyu Jun 29, 2026
6b8399f
Optimize VMI trunci layout rematerialization
mouliangyu Jun 29, 2026
cfbef84
Generalize VMI dense lane-stride layouts
mouliangyu Jun 30, 2026
222f338
Support arity-driven VMI cast layouts
mouliangyu Jun 30, 2026
fdaf0c5
Clarify PTO-Gym validation skill scope
mouliangyu Jun 30, 2026
9b7c1db
Fix arity-driven VMI cast layout selection
mouliangyu Jun 30, 2026
46dc35e
test(vmi): use group-slot result shapes in runtime cases
mouliangyu Jun 30, 2026
7bf0108
Optimize VMI group broadcast load layout
mouliangyu Jul 1, 2026
7ddb090
Support compact VMI f32 to fp8 truncf layouts
mouliangyu Jul 2, 2026
1b67f26
docs: describe vmi layout propagation model
mouliangyu Jul 1, 2026
12 changes: 6 additions & 6 deletions .codex/skills/pto-gym-vpto-validation/SKILL.md
@@ -1,27 +1,27 @@
---
name: pto-gym-vpto-validation
-description: Run PTO-Gym validation from this PTOAS repo. Use when the user asks to run PTO-Gym SIM or board validation from the current source tree. Always force PTOAS onto the VPTO LLVM path instead of relying on the repo default backend.
+description: Run bundled PTO-Gym exercise/validation cases. Use when the user explicitly asks for PTO-Gym, 3rdparty/PTO-Gym, or the PTO-Gym validation scripts. Always force PTOAS onto the VPTO path instead of relying on the repo default backend.
---

# PTO-Gym VPTO Validation

Use this skill when the task is specifically about:
- running `3rdparty/PTO-Gym/examples/pto/scripts/run_host_vpto_validation.sh`
- running `3rdparty/PTO-Gym/examples/pto/scripts/run_host_vpto_validation_parallel.sh`
-- validating PTO-Gym cases from this PTOAS source tree
+- validating bundled PTO-Gym exercise cases

## Required Rule

When PTO-Gym is run from this repo, do not rely on the default PTOAS backend.

Always pass PTOAS flags that force the VPTO LLVM path.
-The current `ptoas` CLI spellings in this repo are `--pto-backend=vpto` and
-`--vpto-emit-hivm-llvm`; do not shorten `--pto-backend` to `--backend`.
+The current `ptoas` CLI spelling in this repo is `--pto-backend=vpto`; do not
+shorten `--pto-backend` to `--backend`.

Use:

```bash
-PTOAS_FLAGS='--pto-backend=vpto --vpto-emit-hivm-llvm --pto-arch a5'
+PTOAS_FLAGS='--pto-backend=vpto --pto-arch a5'
```

If the caller already provides `PTOAS_FLAGS`, make sure these options are still
@@ -44,7 +44,7 @@ Typical simulator environment:
source /home/mouliangyu/.local/ascend/beta.2/cann-9.0.0-beta.2/set_env.sh
export ASCEND_HOME_PATH=/home/mouliangyu/.local/ascend/beta.2/cann-9.0.0-beta.2
export PTOAS_BIN=$PWD/build/tools/ptoas/ptoas
-export PTOAS_FLAGS='--pto-backend=vpto --vpto-emit-hivm-llvm --pto-arch a5'
+export PTOAS_FLAGS='--pto-backend=vpto --pto-arch a5'
```
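
The rule above asks that the required options survive even when the caller already supplies `PTOAS_FLAGS`. A minimal sketch of one way a wrapper could do that is shown here; it is not part of this change. The function name `ensure_ptoas_flags` is hypothetical, and the sketch assumes the flags are a plain whitespace-separated string. It accepts both `--pto-arch a5` and `--pto-arch=a5`, and it does not try to detect a conflicting `--pto-backend=` value.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not shipped with PTOAS or PTO-Gym.
# Prints the given flag string with the required options appended if missing.
ensure_ptoas_flags() {
  local flags="${1:-}"

  # Append the backend selector unless it is already present as a whole word.
  case " ${flags} " in
    *" --pto-backend=vpto "*) ;;
    *) flags="${flags:+${flags} }--pto-backend=vpto" ;;
  esac

  # Accept either spelling of the arch option; append the default otherwise.
  case " ${flags} " in
    *" --pto-arch a5 "* | *" --pto-arch=a5 "*) ;;
    *) flags="${flags:+${flags} }--pto-arch a5" ;;
  esac

  printf '%s\n' "${flags}"
}

# Normalize whatever the caller exported (possibly nothing) before running the scripts.
PTOAS_FLAGS="$(ensure_ptoas_flags "${PTOAS_FLAGS:-}")"
export PTOAS_FLAGS
```

Appending rather than overwriting keeps any unrelated options the caller passed, which appears to be the intent of the rule.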

## Canonical Commands
5 changes: 5 additions & 0 deletions README.md
@@ -206,6 +206,11 @@ ptoas test/lit/pto/empty_func.pto --pto-arch=a5 -o outputfile.cpp
# Specify the build level (level3 disables PlanMemory/InsertSync)
ptoas test/lit/pto/empty_func.pto --pto-level=level3 -o outputfile.cpp

+# Enable the experimental VMI -> VPTO semantic pipeline
+# This mode requires --pto-backend=vpto, or input IR that carries pto.backend = "vpto"
+# Public function signatures must not directly expose !pto.vmi.* types
+ptoas test/lit/vmi/vmi_ptoas_cli_pipeline.pto --pto-arch=a5 --pto-backend=vpto --enable-vmi --emit-vpto -o -
+
# Show the current ptoas release version
ptoas --version
