[IR] Verify explicit local tile address alignment#875

Open
TaoTao-real wants to merge 1 commit into
hw-native-sys:mainfrom
TaoTao-real:codex/fix-tile-addr-alignment-verify

@TaoTao-real

Contributor

Summary

  • Add IR verifier checks for constant explicit local tile addresses.
  • Reject negative or misaligned constant pto.alloc_tile addr and pto.pointer_cast(addrs=...) operands.
  • Add regressions for the 70660 unaligned UB address pattern and a 71232 aligned control case.

Motivation

  • The issue-1789 reduction shows that TASSIGN(tile, 70660) can hang on A3 even without explicit sync/TPipe complexity.
  • PTOAS currently lowers constant explicit tile addresses through runtime TASSIGN(tile, addr), so PTO-ISA's TASSIGN<Addr> static checks are not triggered.
  • Catching constant illegal addresses in PTOAS IR gives an early diagnostic instead of a board-side timeout.

Design

  • VEC/MAT/BIAS/SCALING local spaces require 32-byte alignment.
  • LEFT/RIGHT/ACC local spaces require 512-byte alignment.
  • GM/default spaces are ignored by this local tile address verifier.
  • Dynamic addresses are left unchanged because the verifier cannot prove their alignment.
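
The design rules above can be sketched as a small standalone check. This is a minimal illustration, not the PR's code: the `AddressSpace` enum, `AddrStatus`, `localAlignmentBytes`, and `checkLocalAddress` are hypothetical names that stand in for the MLIR attribute and verifier types, and a missing (`std::nullopt`) address models a dynamic operand.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the PTO memory-space attribute (assumption).
enum class AddressSpace { GM, VEC, MAT, BIAS, SCALING, LEFT, RIGHT, ACC };

// Required byte alignment per local space, as listed in the Design section.
// GM (non-local) yields nullopt, meaning "not checked by this verifier".
constexpr std::optional<std::uint64_t> localAlignmentBytes(AddressSpace space) {
  switch (space) {
  case AddressSpace::VEC:
  case AddressSpace::MAT:
  case AddressSpace::BIAS:
  case AddressSpace::SCALING:
    return 32;
  case AddressSpace::LEFT:
  case AddressSpace::RIGHT:
  case AddressSpace::ACC:
    return 512;
  case AddressSpace::GM:
    return std::nullopt;
  }
  return std::nullopt;
}

enum class AddrStatus { Ok, Negative, Misaligned };

// `addr` is nullopt when the operand is dynamic (not a compile-time constant).
constexpr AddrStatus checkLocalAddress(AddressSpace space,
                                       std::optional<std::int64_t> addr) {
  const std::optional<std::uint64_t> alignment = localAlignmentBytes(space);
  if (!alignment)
    return AddrStatus::Ok; // GM/default spaces are ignored
  if (!addr)
    return AddrStatus::Ok; // dynamic addresses cannot be proven either way
  if (*addr < 0)
    return AddrStatus::Negative;
  if (static_cast<std::uint64_t>(*addr) % *alignment != 0)
    return AddrStatus::Misaligned;
  return AddrStatus::Ok;
}
```

With these rules, the 70660 address from the regression is rejected for a VEC tile (70660 = 32 * 2208 + 4), while 71232 = 32 * 2226 passes the 32-byte check. The same 71232 would not pass the 512-byte check used for LEFT/RIGHT/ACC.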

Testing

  • Local compatible worktree: ninja -C PTOAS/build pto-opt
  • Local compatible worktree: ptoas --pto-level=level3 test/basic/alloc_tile_addr_alignment_invalid.mlir exits 1 and FileCheck passes.
  • Local compatible worktree: ptoas --pto-level=level3 test/basic/pointer_cast_addr_alignment_invalid.mlir exits 1 and FileCheck passes.
  • Local compatible worktree: ptoas --pto-level=level3 test/basic/alloc_tile_addr_alignment_valid.mlir exits 0.
  • Clean main worktree: ninja -C build-pr PTOIR passes.

Note

  • A full ptoas build from current hw-native-sys/main on this Mac is blocked by an unrelated local LLVM/VPTO API mismatch (LLVMHiFloat8Type / CallingConv::SimtEntry missing). The changed IR target itself builds in the clean main worktree.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces local address alignment verification for AllocTileOp and PointerCastOp in the PTO dialect. It adds helper functions to retrieve alignment requirements based on the memory space and verify that constant addresses are non-negative and properly aligned. Corresponding test cases are also added. The review feedback points out a logical issue in verifyConstantLocalAddress where negative address checks are performed before verifying if the memory space requires alignment (e.g., for global memory), and suggests reordering the checks to avoid incorrect rejections.

Comment thread lib/PTO/IR/PTO.cpp
Comment on lines +2852 to +2883
static LogicalResult verifyConstantLocalAddress(Operation *op, Value addr,
                                                Attribute memorySpace,
                                                int addrIndex = -1) {
  std::optional<int64_t> constantAddr = getConstantIntValue(addr);
  if (!constantAddr)
    return success();

  auto emitAddrError = [&]() {
    InFlightDiagnostic diag = op->emitOpError();
    if (addrIndex >= 0)
      diag << "addr[" << addrIndex << "]";
    else
      diag << "addr";
    return diag;
  };

  if (*constantAddr < 0)
    return emitAddrError() << " must be non-negative, got " << *constantAddr;

  std::optional<uint64_t> alignment =
      getLocalAddressAlignmentBytes(memorySpace);
  if (!alignment || *alignment == 0)
    return success();

  uint64_t unsignedAddr = static_cast<uint64_t>(*constantAddr);
  if ((unsignedAddr % *alignment) != 0)
    return emitAddrError()
           << " must be aligned to " << *alignment
           << " bytes for local tile memory space, got " << unsignedAddr;

  return success();
}

Contributor

medium

The current implementation of verifyConstantLocalAddress checks if constantAddr < 0 before checking the memory space alignment. This means that even for GM (Global Memory) or default spaces—which are supposed to be ignored by this local tile address verifier—a negative constant address (or a high address that sign-extends to a negative value) will be incorrectly rejected.

To fix this and ensure GM/default spaces are completely ignored (as well as improving efficiency by avoiding constant value resolution for non-local spaces), we should retrieve and check the alignment first. If the space is not a local space (i.e., no alignment is returned), we can return success() immediately.

static LogicalResult verifyConstantLocalAddress(Operation *op, Value addr,
                                                Attribute memorySpace,
                                                int addrIndex = -1) {
  std::optional<uint64_t> alignment =
      getLocalAddressAlignmentBytes(memorySpace);
  if (!alignment || *alignment == 0)
    return success();

  std::optional<int64_t> constantAddr = getConstantIntValue(addr);
  if (!constantAddr)
    return success();

  auto emitAddrError = [&]() {
    InFlightDiagnostic diag = op->emitOpError();
    if (addrIndex >= 0)
      diag << "addr[" << addrIndex << "]";
    else
      diag << "addr";
    return diag;
  };

  if (*constantAddr < 0)
    return emitAddrError() << " must be non-negative, got " << *constantAddr;

  uint64_t unsignedAddr = static_cast<uint64_t>(*constantAddr);
  if ((unsignedAddr % *alignment) != 0)
    return emitAddrError()
           << " must be aligned to " << *alignment
           << " bytes for local tile memory space, got " << unsignedAddr;

  return success();
}

Contributor Author

Handled in the latest push. verifyConstantLocalAddress now checks the memory space first and returns early for non-local spaces before resolving the constant value, so GM/default spaces are fully ignored by this local tile verifier. I also added pointer_cast_addr_alignment_gm_ignored.mlir to cover the negative GM address case mentioned here. ninja -C build-pr PTOIR passes on the clean PR worktree.
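
To make the effect of the reordering concrete, here is a minimal standalone sketch that compares the two orderings. The function names `acceptsSignFirst` and `acceptsSpaceFirst` are hypothetical and only model the accept/reject decision; `alignment` is `std::nullopt` for non-local spaces such as GM.

```cpp
#include <cstdint>
#include <optional>

// Ordering in the first revision: the sign check runs before the
// memory-space check, so it also applies to GM/default spaces.
constexpr bool acceptsSignFirst(std::optional<std::uint64_t> alignment,
                                std::int64_t addr) {
  if (addr < 0)
    return false;
  if (!alignment || *alignment == 0)
    return true;
  return static_cast<std::uint64_t>(addr) % *alignment == 0;
}

// Ordering after the review fix: non-local spaces return early, so the
// sign and alignment checks only apply to local tile spaces.
constexpr bool acceptsSpaceFirst(std::optional<std::uint64_t> alignment,
                                 std::int64_t addr) {
  if (!alignment || *alignment == 0)
    return true;
  if (addr < 0)
    return false;
  return static_cast<std::uint64_t>(addr) % *alignment == 0;
}
```

The two versions agree for every local space and differ only when no alignment is defined: a negative GM address is rejected by the first ordering and accepted by the second, which is the case the new pointer_cast_addr_alignment_gm_ignored.mlir test is described as covering.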

@TaoTao-real

Contributor Author

Follow-up suggestion for PTO-ISA

This PR adds the PTOAS-side IR verifier guard, but PTO-ISA should still add a second line of defense.

Current behavior observed during the issue-1789 analysis:

  • PTO-ISA already checks static-address form TASSIGN<Addr>(tile) through tassign_static_check.
  • PTOAS-generated code currently uses runtime-address form TASSIGN(tile, addr) for explicit tile addresses.
  • The runtime path eventually reaches TASSIGN_IMPL(tile, addr) and only assigns the tile data pointer, so the existing static alignment/range checks are bypassed.

Suggested PTO-ISA / codegen follow-ups:

  1. In PTOAS EmitC, when the tile address operand is a compile-time constant, prefer emitting TASSIGN<Addr>(tile) so the existing PTO-ISA static_assert checks catch invalid constant addresses during C++ compilation.
  2. In PTO-ISA runtime TASSIGN(T&, AddrType) / TASSIGN_IMPL, add a debug / CPU_SIM / camodel-only guard for tile and conv-tile integral addresses:
    • non-negative address
    • address alignment according to tile memory space
    • optionally addr + tile_storage_bytes <= buffer_capacity
  3. Keep the PTO-ISA check aligned with the same buffer traits used by tassign_static_check, so runtime diagnostics and compile-time diagnostics report the same contract.

This would avoid silent board-side timeout/hang if a future frontend or hand-written kernel still reaches the runtime TASSIGN(tile, addr) path with an illegal local tile address.
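
A rough sketch of how follow-ups 1 to 3 could share one contract is shown below. Everything in it is an assumption made for illustration: `VecBufferTraits`, its capacity value, `isLegalTileAddress`, `assignStatic`, `assignRuntime`, and `kGuardEnabled` are hypothetical names, not PTO-ISA APIs, and the real buffer traits would come from whatever tassign_static_check already uses.

```cpp
#include <cstdint>

// Hypothetical buffer traits (assumption). The capacity is a placeholder
// value for illustration, not a documented hardware figure.
struct VecBufferTraits {
  static constexpr std::uint64_t kAlignBytes = 32;
  static constexpr std::uint64_t kCapacityBytes = 192 * 1024;
};

// One predicate used by both paths, so compile-time and runtime
// diagnostics enforce the same contract (follow-up 3).
template <typename Traits>
constexpr bool isLegalTileAddress(std::int64_t addr, std::uint64_t tileBytes) {
  if (addr < 0)
    return false; // non-negative address
  const std::uint64_t u = static_cast<std::uint64_t>(addr);
  if (u % Traits::kAlignBytes != 0)
    return false; // alignment for the tile's memory space
  // addr + tileBytes <= capacity, written to avoid unsigned overflow.
  return tileBytes <= Traits::kCapacityBytes &&
         u <= Traits::kCapacityBytes - tileBytes;
}

// Static form (follow-up 1): a constant address passed as a template
// argument is rejected during C++ compilation.
template <typename Traits, std::int64_t Addr, std::uint64_t TileBytes>
constexpr std::int64_t assignStatic() {
  static_assert(isLegalTileAddress<Traits>(Addr, TileBytes),
                "illegal constant tile address");
  return Addr;
}

// Stand-in for a debug / CPU_SIM / camodel-only switch (follow-up 2).
constexpr bool kGuardEnabled = true;

// Runtime form: returns false instead of trapping so the sketch stays testable.
template <typename Traits>
bool assignRuntime(std::int64_t addr, std::uint64_t tileBytes,
                   std::int64_t &tileBase) {
  if (kGuardEnabled && !isLegalTileAddress<Traits>(addr, tileBytes))
    return false; // a real guard would log or trap here
  tileBase = addr;
  return true;
}
```

In this sketch, `assignStatic<VecBufferTraits, 70660, 1024>()` would fail to compile, while `assignRuntime<VecBufferTraits>(70660, 1024, base)` would return false and leave `base` untouched when the guard is enabled.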

@TaoTao-real TaoTao-real force-pushed the codex/fix-tile-addr-alignment-verify branch from 4946cd3 to a5ac7da on June 27, 2026 09:07
@TaoTao-real

Contributor Author

/run A3

@reedhecre

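Received /run a3; the A3 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and most recent results directly.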

@reedhecre

A3 board test failed

  • Trigger mode: manual
  • Source commit: 571a8aeb0ed4
  • Result summary: OK 217 / FAIL 4 / SKIP 1
  • Log: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260629_094608_manual_pr875.log
  • Manual command: /run a3
  • Triggered by: TaoTao-real
  • Triggering comment: [IR] Verify explicit local tile address alignment #875 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • qwen3_decode_incore_1 (run, exit=1)
  • post_rmsnorm (run, exit=1)
  • out_proj_residual (run, exit=1)
  • down_proj_residual (run, exit=1)

@reedhecre

A3 board test failure details: PR #875

qwen3_decode_incore_1

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507014 (/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260629_094608_manual_pr875/npu_validation/Qwen3DecodeA3/qwen3_decode_incore_1/main.cpp:124)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 3512882] 2026-06-29-10:12:08.342.723 (EZ9999):  The error from device(chipId:0, dieId:1), serial number is 142, there is an exception of aicore error, core id is 6, error code = 0, dump info: pc start: 0x124200000000, current: 0x124200000130, vec error info: 0, mte error info: 0x8d03000087, ifu error info: 0x212c1a0800180, ccu error info: 0x59000038, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd00028c, para base: 0x12c100000000.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:645]
        TraceBack (most recent call last):
       The extend info: errcode:(0, 0, 0) errorStr: timeout or trap error. fixp_error0 info: 0x3000087, fixp_error1 info: 0x8d, fsmId:0, tslot:0, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:658]
       Kernel task happen error, retCode=0x25, [aicore timeout].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1729]
       AICORE Kernel task happen error, retCode=0x25.[FUNC:GetError][FILE:stream.cc][LINE:1475]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1475]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=46, report_stream_id=46, task_id=0, flip_num=0, fault kernel_name=qwen3_decode_incore_1, fault kernel info ext=qwen3_decode_incore_1, program id=0, hash=10023933964466908619.[FUNC:GetError][FILE:stream.cc][LINE:1475]
       rtStreamSynchronize execution failed, reason=aicore timeout[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507014[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-06-29 10:12:09] ERROR: testcase failed (exit 1): qwen3_decode_incore_1

post_rmsnorm

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507034 (/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260629_094608_manual_pr875/npu_validation/Qwen3DecodeA3/post_rmsnorm/main.cpp:122)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 3842864] 2026-06-29-10:30:33.795.756 (EZ9999):  The error from device(chipId:0, dieId:1), serial number is 146, there is an exception of aivec error, core id is 14, error code = 0, dump info: pc start: 0x124200000000, current: 0x124200000420, vec error info: 0x6110dc3198, mte error info: 0xf6060000d0, ifu error info: 0x2ffffff080d40, ccu error info: 0x5000000800030, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd00028c, para base: 0x12c100000000.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:645]
        TraceBack (most recent call last):
       The extend info: errcode:(0, 0, 0) errorStr: timeout or trap error. fixp_error0 info: 0x60000d0, fixp_error1 info: 0xf6, fsmId:0, tslot:0, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:658]
       Kernel task happen error, retCode=0x30, [vector core timeout].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1729]
       AIV Kernel happen error, retCode=0x30.[FUNC:GetError][FILE:stream.cc][LINE:1475]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1475]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=46, report_stream_id=46, task_id=0, flip_num=0, fault kernel_name=post_rmsnorm, fault kernel info ext=post_rmsnorm, program id=0, hash=1178605716447799068.[FUNC:GetError][FILE:stream.cc][LINE:1475]
       rtStreamSynchronize execution failed, reason=vector core timeout[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507034[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-06-29 10:30:35] ERROR: testcase failed (exit 1): post_rmsnorm

out_proj_residual

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507015 (/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260629_094608_manual_pr875/npu_validation/Qwen3DecodeA3/out_proj_residual/main.cpp:139)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 4181215] 2026-06-29-10:30:45.065.270 (EZ9999):  The error from device(chipId:0, dieId:1), serial number is 147, there is an exception of fftsplus aicore error, core id is 7, error code = 0, dump info: pc start: 0x12420000038c, current: 0x1242000006a8, vec error info: 0, mte error info: 0x8f030000e0, ifu error info: 0x1000000000000, ccu error info: 0, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd00028c, para base: 0x12c100000080.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:645]
        TraceBack (most recent call last):
       The extend info: errcode:(0, 0x8000, 0) errorStr: When the D-cache reads and writes data to the UB, the response value returned by the bus is a non-zero value. fixp_error0 info: 0x30000e0, fixp_error1 info: 0x8f, fsmId:0, tslot:1, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:658]
       Kernel task happen error, retCode=0x26, [aicore exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1729]
       AICORE Kernel task happen error, retCode=0x26.[FUNC:GetError][FILE:stream.cc][LINE:1475]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1475]
       [AIC_INFO] after execute:mixCtx print end[FUNC:GetError][FILE:stream.cc][LINE:1475]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=46, report_stream_id=46, task_id=0, flip_num=0, fault kernel_name=out_proj_residual, fault kernel info ext=out_proj_residual, program id=0, hash=8619810169506060267.[FUNC:GetError][FILE:stream.cc][LINE:1475]
       rtStreamSynchronize execution failed, reason=aicore exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507015[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-06-29 10:30:46] ERROR: testcase failed (exit 1): out_proj_residual

down_proj_residual

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507015 (/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260629_094608_manual_pr875/npu_validation/Qwen3DecodeA3/down_proj_residual/main.cpp:139)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 4185906] 2026-06-29-10:30:57.162.281 (EZ9999):  The error from device(chipId:0, dieId:1), serial number is 148, there is an exception of fftsplus aicore error, core id is 9, error code = 0, dump info: pc start: 0x12420000038c, current: 0x124200000558, vec error info: 0, mte error info: 0x8d03000087, ifu error info: 0x1000000000000, ccu error info: 0, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd00028c, para base: 0x12c100000080.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:645]
        TraceBack (most recent call last):
       The extend info: errcode:(0, 0x8000, 0) errorStr: When the D-cache reads and writes data to the UB, the response value returned by the bus is a non-zero value. fixp_error0 info: 0x3000087, fixp_error1 info: 0x8d, fsmId:0, tslot:1, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:658]
       Kernel task happen error, retCode=0x26, [aicore exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1729]
       AICORE Kernel task happen error, retCode=0x26.[FUNC:GetError][FILE:stream.cc][LINE:1475]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1475]
       [AIC_INFO] after execute:mixCtx print end[FUNC:GetError][FILE:stream.cc][LINE:1475]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=46, report_stream_id=46, task_id=0, flip_num=0, fault kernel_name=down_proj_residual, fault kernel info ext=down_proj_residual, program id=0, hash=3640648363331706499.[FUNC:GetError][FILE:stream.cc][LINE:1475]
       rtStreamSynchronize execution failed, reason=aicore exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507015[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-06-29 10:30:58] ERROR: testcase failed (exit 1): down_proj_residual

@reedhecre

reedhecre commented Jun 29, 2026

Codex Review

This comment is updated automatically by the review bot.

  • PR: [IR] Verify explicit local tile address alignment #875
  • Author: TaoTao-real
  • Base/Head: main / codex/fix-tile-addr-alignment-verify
  • Head SHA: a5ac7daec668
  • Trigger: new open PR detected
  • Generated At: 2026-06-29T19:16:40Z
  • Status: completed

Summary

PR #875 still leaves the pto.tassign rebind path unchecked, so misaligned explicit local tile addresses can reach runtime despite the new verifier.

Findings

  1. P2 Misaligned explicit local addresses still slip through `pto.tassign` lib/PTO/IR/PTO.cpp:2984

verifyConstantLocalAddress() is only wired into AllocTileOp::verify() and PointerCastOp::verify(). TAssignOp::verify() still only checks type equality, even though pto.tassign is the level3/manual API for rebinding an existing local tile to a new explicit address (test/lit/pto/tassign_level3_loop_rebind.pto), and EmitC lowering forwards that address straight into TASSIGN. That means a misaligned constant rebind for vec/mat/bias/scaling or left/right/acc still compiles after this PR and can reach runtime unchecked, so the new “explicit local tile address alignment” contract is still bypassable on a supported user path.

@zhangstevenunity zhangstevenunity left a comment

Collaborator

Reviewed by building this PR head (a5ac7da) against the patched LLVM and running the suite. The verifier itself is sound and regression-free: 714/714 existing lit tests pass, the check fires correctly for alloc_tile and pointer_cast (vec=32B, gm ignored, negative rejected), and the helper checks memory space before the constant value, so Gemini's negative-GM concern is already resolved. Direction is good and the fixes below are small.

Two changes needed before merge:

  1. (correctness) The contract is bypassable on its primary path: pto.tassign (the level3/manual rebind that lowers to TASSIGN(tile, addr), the exact form in issue-1789) is NOT covered. See inline in PTO.cpp. This is the Codex bot's P2 and is the single most important gap, since the motivating bug is literally a TASSIGN address.

  2. (testing) The 4 new regressions live in test/basic/*.mlir, which the CI lit suite (ninja check-pto, rooted at test/lit/, .pto-only) does not discover, so they never run in CI. See inline on the test file.

Also (P3): the 32/512 alignment values are unsourced (pto-isa TASSIGN has no such static check to mirror) and inconsistent with the tile fractal (ACC fractal=1024, MAT 512-aligned in practice). There is also a minor duplication of mlir::getConstantIntValue.

Note on the wiring: the extraClassDeclaration { verify(); } form (no let hasVerifier = 1;) is non-idiomatic vs sibling ops, but it IS invoked here (the base Op<>::verifyInvariants hook always calls cast<Concrete>(op).verify()), so it is correct, not dead code.

Context: the A3 board run on this PR is currently RED (OK 217 / FAIL 4 / SKIP 1: qwen3_decode_incore_1, post_rmsnorm, out_proj_residual, and down_proj_residual, all failing at runtime in aclrtSynchronizeStream with codes 507014, 507034, and 507015). The verifier did not cause these (they compiled and failed at device sync), but the run shows that the issue-1789 runtime-hang family is not demonstrably fixed by a compile-time-only guard that also misses the tassign path.

Comment thread lib/PTO/IR/PTO.cpp
  return intAttr.getValue().getSExtValue();
}

static LogicalResult verifyConstantLocalAddress(Operation *op, Value addr,

Collaborator

P1 (blocker): the new alignment contract is bypassable on the primary user path. verifyConstantLocalAddress is wired only into AllocTileOp::verify() and PointerCastOp::verify(), but not into TAssignOp::verify() (PTO.cpp:2984), which still only checks result type == tile type.

pto.tassign is the manual/level3 op that rebinds a declared tile to a new explicit address, and its EmitC lowering forwards that address straight into TASSIGN(tile, addr) (PTOToEmitC.cpp:6580) - i.e. the exact TASSIGN(tile, 70660) form named in this PR's motivation and in issue-1789. So pto.tassign %t, %c70660 : !pto.tile_buf<loc=vec, ...> still compiles after this PR and reaches the board unchecked.

The tile operand is a TileBufType carrying a memory space, so the fix is one call in TAssignOp::verify():
verifyConstantLocalAddress(getOperation(), getAddr(), cast<TileBufType>(getTile().getType()).getMemorySpace()).

(This matches the Codex bot's P2.)

Comment thread lib/PTO/IR/PTO.cpp
  case AddressSpace::LEFT:
  case AddressSpace::RIGHT:
  case AddressSpace::ACC:
    return 512;

Collaborator

P3: these 32 / 512 byte values are unsourced and inconsistent with the tile type's own fractal field.

  • pto-isa's a2a3/a5 TASSIGN_IMPL (include/pto/npu/a2a3/TAssign.hpp) has NO address-alignment static_assert, so the PR description's rationale ("PTO-ISA's TASSIGN static checks are not triggered") does not match the headers - there is no such check here to mirror.
  • The values disagree with practice: ACC tiles carry fractal=1024 yet this requires only 512, and MAT (L1) addresses are 512-aligned in every existing test/sample yet this requires only 32 - so an unaligned MAT/ACC base that would hang still passes this verifier (under-checking).

Please cite the HW source for these numbers (and note they are applied arch-agnostically to a2/a3/a5), or derive the requirement from the arch + fractal rather than hardcoding.

@@ -0,0 +1,11 @@
// RUN: not ptoas --pto-level=level3 %s 2>&1 | FileCheck %s

Collaborator

P2 (blocker for the stated deliverable): these regression tests never run in CI. CI runs ninja check-pto (ci.yml:255), whose lit suite is rooted at test/lit/ with config.suffixes = ['.pto'] (test/lit/lit.cfg.py). test/basic/ is a brand-new sibling directory and these are .mlir files, so check-pto does not discover them - I built this PR and ran the suite: 714 tests discovered, none from test/basic. There is no test/CMakeLists.txt and no workflow references the top-level test/lit.cfg.py, so the only way these pass is by manually pointing llvm-lit at them.

Please move them under test/lit/ (e.g. test/lit/pto/) and rename to .pto (cf. test/lit/vpto/cube/textract_verify_invalid_right_layout.pto) so the regression actually guards against recurrence.

Minor related note: ptoas exits 0 on this verify failure (it prints the error + Error: Failed to parse MLIR but rc=0), so the not ptoas ... | FileCheck line passes via FileCheck on the piped output, not via the exit code - the PR description's "exits 1" claim does not match observed behavior.

Comment thread lib/PTO/IR/PTO.cpp
  return std::nullopt;
}

static std::optional<int64_t> getConstantIntValue(Value value) {

Collaborator

Minor: this duplicates mlir::getConstantIntValue (mlir/Dialect/Utils/StaticValueUtils.h), which resolves folded / constant-like values via m_ConstantInt, not just a direct arith::ConstantOp. Reusing it would be more robust and let you drop this helper.
