Skip to content

fix(sync): keep TLoad WAW barriers for same buffer#843

Open
TaoTao-real wants to merge 1 commit into
hw-native-sys:mainfrom
TaoTao-real:codex/fix-tload-waw-subview
Open

fix(sync): keep TLoad WAW barriers for same buffer#843
TaoTao-real wants to merge 1 commit into
hw-native-sys:mainfrom
TaoTao-real:codex/fix-tload-waw-subview

Conversation

@TaoTao-real

Copy link
Copy Markdown
Contributor

Summary

  • Narrow the TLoad->TLoad WAW exemption to only dependency pairs formed by distinct direct subview SSA values.
  • Preserve the [Feature] Expose Subtile FIFO Loads/Stores Without Extra MTE Barriers #667 distinct-subview no-MTE2-barrier optimization under the PTOAS subview non-overlap contract.
  • Add regressions for same tile and same subview TLoad WAW hazards, and mark the if/else tile-result sample as pto.entry.

Motivation

  • The previous blanket MTE2 TLoad->TLoad WAW exemption skipped alias analysis for same-buffer writes, so consecutive TLOADs into the same tile could be emitted without pipe_barrier(PIPE_MTE2).
  • That is unsafe because same-pipe ordering still needs an explicit barrier when two MTE2 operations write the same local buffer.

Design

  • Always run WAW alias analysis first.
  • Drop the WAW dependency only when every WAW dependency pair is between two different direct pto.subview or memref.subview result values.
  • Keep same tile, same subview, root-vs-subview, and unknown provenance conservative.

Testing

  • Local: git diff --check
  • Local: ninja -C build-codex lib/PTO/Transforms/CMakeFiles/obj.PTOTransforms.dir/InsertSync/InsertSyncAnalysis.cpp.o
  • Note: full local ptoas build is currently blocked by local custom LLVM/VPTO environment mismatch (missing custom LLVMFloat*/SimtEntry symbols with llvm-project/build-shared, and missing mlir-tblgen in llvm-project-vpto/build-shared).

Risk / Rollback

  • Risk is limited to adding MTE2 pipe barriers for true same-buffer TLoad WAW hazards that were previously skipped.
  • Rollback by reverting this PR.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the WAW dependency exemption logic for TLoad operations on MTE2 pipelines, limiting the exemption to distinct direct subviews. The review feedback correctly identifies a potential correctness issue where nested subviews or subviews of different parent buffers could be incorrectly exempted, and provides a code suggestion to ensure they share the same parent buffer.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +200 to +206
static bool isDistinctDirectSubviewPair(const BaseMemInfo *lhs,
const BaseMemInfo *rhs) {
if (!lhs || !rhs) return false;
if (lhs->baseBuffer == rhs->baseBuffer) return false;
return isDirectSubviewBuffer(lhs->baseBuffer) &&
isDirectSubviewBuffer(rhs->baseBuffer);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of isDistinctDirectSubviewPair only checks if both buffers are defined by a subview operation and are distinct SSA values. However, this is insufficient and can lead to correctness issues in the following cases:

  1. Nested Subviews: If one subview is nested inside another (e.g., %view2 = pto.subview %view1), both are defined by SubViewOp and are distinct SSA values, so they would be incorrectly exempted from the WAW barrier even though they overlap.
  2. Different Parent Buffers: If they are subviews of different parent buffers that happen to alias, they are not sibling subviews under the same parent, so the "PTOAS subview non-overlap contract" does not apply.

To ensure correctness, we should verify that both subviews are sibling subviews of the exact same parent buffer by checking that their source operands (the first operand of the subview operation) are identical.

static bool isDistinctDirectSubviewPair(const BaseMemInfo *lhs,
                                        const BaseMemInfo *rhs) {
  if (!lhs || !rhs || !lhs->baseBuffer || !rhs->baseBuffer) return false;
  if (lhs->baseBuffer == rhs->baseBuffer) return false;
  Operation *lhsOp = lhs->baseBuffer.getDefiningOp();
  Operation *rhsOp = rhs->baseBuffer.getDefiningOp();
  if (!lhsOp || !rhsOp) return false;
  if (!isa<pto::SubViewOp, memref::SubViewOp>(lhsOp) ||
      !isa<pto::SubViewOp, memref::SubViewOp>(rhsOp)) {
    return false;
  }
  return lhsOp->getOperand(0) == rhsOp->getOperand(0);
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, this is a real correctness hole. I tightened the exemption from "distinct direct subview values" to "distinct sibling subviews with the same direct source operand" so nested subviews, root-vs-view, different-parent aliases, and unknown provenance remain conservative and keep the MTE2 barrier. I also added a nested-subview regression in tload_waw_same_buffer_requires_mte2_barrier.pto. Updated in 15d2151.

@TaoTao-real TaoTao-real force-pushed the codex/fix-tload-waw-subview branch from fd89f60 to 15d2151 Compare June 22, 2026 02:58
@reedhecre

reedhecre commented Jun 22, 2026

Copy link
Copy Markdown

Codex Review

该评论由 review 机器人自动更新。

  • PR: fix(sync): keep TLoad WAW barriers for same buffer #843 fix(sync): keep TLoad WAW barriers for same buffer
  • Author: TaoTao-real
  • Base/Head: main / codex/fix-tload-waw-subview
  • Head SHA: 15d21512c6c9
  • Trigger: 检测到新的 open PR
  • Generated At: 2026-06-22T05:06:35Z
  • Status: completed

Summary

PR #843 still leaves a real TLoad WAW hole: overlapping or otherwise unproven sibling subviews can still skip the required MTE2 barrier, so the fix is incomplete.

Findings

  1. P2 Sibling-subview exemption still suppresses real TLoad WAW barriers lib/PTO/Transforms/InsertSync/InsertSyncAnalysis.cpp:203

The new isDistinctSiblingSubviewPair() / isTLoadToTLoadWAWExempt() logic treats any two different subviews of the same direct parent as safe, but that is not an IR guarantee. pto.subview on slayout=none_box allows overlapping siblings, e.g. [0,0] size [16,128] and [0,64] size [16,128], and PTOIRTranslator also deliberately falls back to conservative aliasing when subview offsets/layout are not statically analyzable. In both situations wawDepVec represents a real or possible overlap, yet this predicate ignores the computed ranges and suppresses the PIPE_MTE2 barrier anyway. The result is that overlapping or dynamic sibling-subview TLoads can still be emitted without the WAW barrier this PR is trying to restore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants