Add PTODSL pipe surface APIs#459

Draft
jimmychou0 wants to merge 2 commits into mouliangyu:feature-vpto-backend from jimmychou0:zjm/ptodsl-pipe-surface

jimmychou0 commented May 29, 2026

Collaborator

Summary

This PR adds the reusable PTODSL pipe surface/frontend API work, based on Zhendong404 commit
a382a51064e5c3c3a160377ee778dca88c5f91bf, plus the pipe transaction support needed by the FA PTODSL frontend path.

Main changes:

  • Add pto.pipe high-level namespace.
  • Add unified directional pipe constructors:
    • pto.pipe.c2v(...)
    • pto.pipe.v2c(...)
    • pto.pipe.bidirectional(...)
  • Use parameters to distinguish local tile-entry and A2/A3 global-entry L2G2L usage:
    • consumer_buf
    • gm_slot_buffer
    • gm_slot_tensor
    • slot_size
  • Add buffer helpers:
    • pto.reserve_buffer(...)
    • pto.import_reserved_buffer(...)
    • pto.gm_ptr(...)
  • Enable frontend pipe transactions:
    • pipe.push(...)
    • pipe.pop(...)
    • pipe.free(...)
  • Allow frontend pipe ops inside PTODSL @pto.cube / @pto.simd section scopes.
  • Tighten frontend pipe verifier checks for cube/vector kernel kind and wrong-side sections.
  • Update ptodsl/docs/user_guide/07-data-movement-ops.md with the new pipe surface usage.
  • Add unit tests, docs fragment fixtures, sample compile coverage, and invalid verifier regression tests.
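
The wrong-side section check listed above can be pictured with a small plain-Python model. This is only a sketch: it assumes that for a C2V pipe the Cube side produces and the Vector side consumes, with V2C as the mirror image, and the names `expected_side` and `check_side` and the section labels `"cube"` / `"simd"` are made up for illustration. The real verifier is part of the compiler and may check more than this.

```python
# Hypothetical model of a wrong-side check; not the PTO verifier itself.
# Assumption: C2V means Cube produces and Vector consumes; V2C is mirrored.
PRODUCER_OPS = {"push"}
CONSUMER_OPS = {"pop", "free"}


def expected_side(direction: str, op: str) -> str:
    """Return the section kind ("cube" or "simd") in which `op` is legal."""
    if direction not in ("c2v", "v2c"):
        raise ValueError(f"unknown direction: {direction}")
    if op in PRODUCER_OPS:
        return "cube" if direction == "c2v" else "simd"
    if op in CONSUMER_OPS:
        return "simd" if direction == "c2v" else "cube"
    raise ValueError(f"unknown pipe op: {op}")


def check_side(direction: str, op: str, section: str) -> None:
    """Raise ValueError when a pipe op appears in the wrong section kind."""
    want = expected_side(direction, op)
    if section != want:
        raise ValueError(
            f"{direction} pipe op '{op}' must appear in a {want} section, "
            f"found in a {section} section"
        )


check_side("c2v", "push", "cube")  # accepted: producer op on the Cube side
check_side("c2v", "pop", "simd")   # accepted: consumer op on the Vector side
```

Calling `check_side("c2v", "pop", "cube")` raises `ValueError` in this model, which is the kind of diagnostic the invalid-verifier regression tests are meant to pin down.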

Notes

The public API no longer exposes separate *_global / *_local constructor names. The pipe direction is part of the
constructor name, while local/global-entry behavior is selected by the provided operands.
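
As a rough illustration of "behavior selected by the provided operands", the following plain-Python sketch classifies a constructor call by which keyword arguments are present. The helper `select_entry_mode` is hypothetical, and the exact operand combinations it requires are assumptions made for the example; the operand rules actually enforced by `pto.pipe.c2v(...)` / `pto.pipe.v2c(...)` are the ones defined in this PR and discussed in the review thread.

```python
from typing import Optional


def select_entry_mode(
    *,
    consumer_buf: Optional[object] = None,
    gm_slot_buffer: Optional[object] = None,
    gm_slot_tensor: Optional[object] = None,
) -> str:
    """Classify a pipe constructor call as "local" or "global" entry.

    Hypothetical rule set, for illustration only:
      - no GM operand at all -> local tile-entry, needs consumer_buf
      - any GM operand       -> global-entry, needs both GM operands
    """
    wants_global = gm_slot_buffer is not None or gm_slot_tensor is not None
    if not wants_global:
        if consumer_buf is None:
            raise ValueError("local tile-entry pipe needs consumer_buf")
        return "local"

    missing = [
        name
        for name, value in (
            ("gm_slot_buffer", gm_slot_buffer),
            ("gm_slot_tensor", gm_slot_tensor),
        )
        if value is None
    ]
    if missing:
        raise ValueError(f"global-entry pipe is missing: {', '.join(missing)}")
    return "global"


# Placeholder strings stand in for real buffers and tensor views.
mode_local = select_entry_mode(consumer_buf="vec_fifo")
mode_global = select_entry_mode(
    consumer_buf="vec_fifo", gm_slot_buffer="gm_ptr", gm_slot_tensor="gm_view"
)
```

The point of the design is that one constructor name per direction stays stable while the operand set decides the entry kind, so adding a new entry kind does not require a new public name.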

gm_slot_buffer is the actual GM FIFO storage pointer. gm_slot_tensor is kept as the entry descriptor for
type/shape/slot-size inference and verification.

This PR is intentionally separate from the FA PTODSL PR and only covers the reusable PTODSL pipe surface/frontend API
work.

Validation

Run on dev-481211:

  • ninja -C build passed.
  • python3 ptodsl/tests/test_vector_cube_ops.py -v passed.
  • python3 ptodsl/tests/test_pipe_surface_sample_compile.py passed.
  • python3 ptodsl/tests/test_docs_as_test.py passed.
  • A3 globaltensor frontend inputs passed with ptoas --pto-arch=a3.
  • Wrong-side section verifier regression was manually checked and reports the expected error.

jimmychou0 marked this pull request as draft May 29, 2026 08:18
jimmychou0 force-pushed the zjm/ptodsl-pipe-surface branch 2 times, most recently from 6581133 to 9f41d97, May 30, 2026 02:08
Comment thread ptodsl/docs/user_guide/07-data-movement-ops.md Outdated
Comment thread lib/PTO/IR/PTO.cpp Outdated

  static bool isInsideSectionCube(Operation *op) {
-   return op->getParentOfType<pto::SectionCubeOp>() != nullptr;
+   for (Operation *parent = op; parent; parent = parent->getParentOp())

Collaborator

Why does this need to change?

Collaborator Author

This is mainly to support the module-level pto.kernel_kind on PTODSL nested modules. The traversal is probably unnecessary; I will revise it.

  FrontendPipeHandleMap handlesById;
  SmallVector<Operation *> frontendInitOps;
- llvm::DenseMap<int32_t, Operation *> initOpById;
+ llvm::DenseMap<int64_t, Operation *> initOpByKey;

Collaborator

What is the reason for changing ID to Key?

Collaborator Author

The key here is (pipe id, kernel side), not just the pipe id. The same logical pipe id appears on both the cube-side init and the vector-side init, so when lowering looks up a handle it has to match on the side where the current op lives.
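
A minimal Python model of why the map key carries the kernel side as well as the pipe id. The side encoding (`CUBE = 0`, `VECTOR = 1`) and the bit layout in `pack_key` are assumptions chosen for the example; the thread only establishes that the key combines pipe id and kernel side, not how the C++ code packs it.

```python
from typing import Dict

CUBE, VECTOR = 0, 1  # hypothetical side encoding


def pack_key(pipe_id: int, side: int) -> int:
    """Pack (pipe id, kernel side) into one integer; the layout is an assumption."""
    return (side << 32) | (pipe_id & 0xFFFFFFFF)


# Keyed by pipe id alone: the vector-side init overwrites the cube-side init.
init_by_id: Dict[int, str] = {}
init_by_id[7] = "init on cube side"
init_by_id[7] = "init on vector side"

# Keyed by (pipe id, kernel side): both init ops stay addressable.
init_by_key: Dict[int, str] = {}
init_by_key[pack_key(7, CUBE)] = "init on cube side"
init_by_key[pack_key(7, VECTOR)] = "init on vector side"


def lookup(pipe_id: int, side: int) -> str:
    """Find the init op that matches the side of the op being lowered."""
    return init_by_key[pack_key(pipe_id, side)]
```

With the id-only map, a push on the Cube side for pipe 7 would resolve to the vector-side init; with the combined key, `lookup(7, CUBE)` and `lookup(7, VECTOR)` return different entries.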

Comment thread test/lit/pto/frontend_pipe_module_kind_verify_invalid.pto
jimmychou0 force-pushed the zjm/ptodsl-pipe-surface branch from 9f41d97 to 90a92e9, May 30, 2026 04:14
`push`) on the Cube side and C2V-consumer methods (`init_simd`, `pop`, `free`)
on the Vector side.

#### `pto.pipe.v2c_global(gm_slot_tensor, *, id, slot_size=None, nosplit=None)`

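Collaborator

  1. A mix kernel may use several pipes. How are push/pop associated with the pipe that carries the data, and how are the Cube-side and Vector-side pipes associated with each other? At the IR level, the association is made through the pipe's id and the id carried by push/pop.
  2. Does the DSL need to distinguish pipe locations such as global/local? In my opinion, it is enough to define the semantics of the interface parameters and let the target parameter at the kernel entry decide which parameters are used. The current design may have compatibility problems.
  3. For A3/A5 interface compatibility at the IR level, the A3 architecture also needs to pass the consumer address into the pipe, not only the GM address.
  4. Does the GM address need to be defined as gm_slot_tensor? Only a GM pointer is actually needed; the other information in gm_slot_tensor (a TensorView) should not be used during lowering, so it is somewhat redundant.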

Collaborator Author

  1. Multi-pipe association:
    Pipes are now associated by stable pipe id + direction + kernel side. The key used in lowering has been changed to (pipe id, kernel side), and the docs now state that multiple pipes need distinct stable ids.

  2. global/local API:
    The public c2v_global/c2v_local/v2c_global/v2c_local constructors have been removed and unified into pto.pipe.c2v(...) / pto.pipe.v2c(...). Local vs. global-entry is distinguished by parameter semantics.

  3. A3 consumer address:
    Fixed. Global-entry lowering now requires both gm_slot_buffer and the consumer local buffer, and passes them to InitializeL2G2LPipeOp.

  4. gm_slot_tensor:
    gm_slot_buffer is the actual GM address; gm_slot_tensor is kept for now as the entry descriptor, used for type/shape/slot_size inference and verifier checks.

jimmychou0 force-pushed the zjm/ptodsl-pipe-surface branch from 7eb3958 to 752d5ba, June 1, 2026 09:42
    BLOCK: pto.constexpr = 128,
):
    gm_view = pto.make_tensor_view(gm_slot_buffer, shape=[16, 16], strides=[16, 1])
    c2v_buf = pto.reserve_buffer("c2v_fifo", size=8192, location="vec")

Collaborator

Please confirm: should this be import_reserved_buffer?

Collaborator Author

This is not quite right as written. A global-entry pipe is a global-only GM FIFO, and initialization only needs gm_slot_tensor. The reserve_buffer and import_reserved_buffer calls should not be needed here; only gm_slot_tensor is.

@pto.simd
def vector_kernel():
    c2v.init_simd()
    entry = c2v.pop(split=0, result_type=c2v.entry_type)

Collaborator

I don't follow this part:

  1. What is result_type?
  2. Isn't the thing that pop returns a tile? Why is it then used to build a view?

Collaborator Author

For a global-entry pipe, pop() defaults to entry_type and returns the TensorView descriptor of the current GM FIFO slot; the user then derives a sub-view from that descriptor and issues an explicit tile.load. I will change the example to entry = c2v.pop(split=0) so that it uses the pipe's entry_type by default, and add an explanation of this.
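
The "descriptor, then sub-view, then explicit load" flow described in this reply can be modeled in plain Python. `SlotView`, `subview`, and `load` are hypothetical stand-ins written for this sketch and are not PTODSL APIs; the 16x16 shape with strides (16, 1) is borrowed from the doc example quoted earlier in the thread.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SlotView:
    """Hypothetical stand-in for a 2-D view descriptor over flat memory."""
    offset: int
    shape: Tuple[int, int]
    strides: Tuple[int, int]

    def subview(self, row0: int, col0: int, rows: int, cols: int) -> "SlotView":
        """Derive a smaller view; no data is copied, only the descriptor changes."""
        if (row0 < 0 or col0 < 0
                or row0 + rows > self.shape[0]
                or col0 + cols > self.shape[1]):
            raise IndexError("sub-view exceeds the slot bounds")
        new_offset = self.offset + row0 * self.strides[0] + col0 * self.strides[1]
        return SlotView(new_offset, (rows, cols), self.strides)


def load(memory: List[int], view: SlotView) -> List[List[int]]:
    """Copy the viewed elements out of `memory` (the explicit load step)."""
    return [
        [memory[view.offset + r * view.strides[0] + c * view.strides[1]]
         for c in range(view.shape[1])]
        for r in range(view.shape[0])
    ]


gm = list(range(16 * 16))               # flat stand-in for one GM FIFO slot
entry = SlotView(0, (16, 16), (16, 1))  # models what a global-entry pop() returns
corner = entry.subview(2, 4, 2, 3)      # descriptor math only, no data movement
tile = load(gm, corner)                 # data moves only here
```

In this model `tile` is `[[36, 37, 38], [52, 53, 54]]`. The separation mirrors the reply: popping hands back a description of where the slot lives, and the data is only brought in by the later explicit load.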
