Add MGATHER.CAS atomic compare-and-swap gather instruction #133
Code Review
This pull request adds documentation for the new MGATHER.CAS (Atomic Compare-and-Swap Gather from Memory to Tile) instruction in both English and Chinese, updating the TMA block headers and adding detailed specification files. The reviewer feedback suggests clarifying that the Row parameter is hardware-derived rather than a mandatory user-specified input in the parameter tables, and replacing the undefined type T with DataType in the execution model pseudo-code for both language versions.
Introduce a new tileblock memory instruction that performs per-element atomic compare-and-swap from sparse memory (base + offset Tile) into a destination Tile, returning the pre-CAS memory value. Registers TileOp slot 8 in tma_block/header.md and ships EN/ZH instruction docs.

Naming follows the MGATHER.MASK / MSCATTER.MASK dot-suffix convention to keep load-shape (GATHER) and per-element op (CAS) visually separated and leave room for future atomic RMW siblings.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
8e3067b to 56bd137
@gemini-code-assist Thanks for the review! Here's the response to your feedback:
1. Row parameter Optional column: Fixed ✅
2. Undefined type T in C pseudo-code: Not changing per team convention
Note on CI lint failure: The lint failure in
Thanks for the updates and the clarification. The documentation changes look good and align well with the existing conventions. I understand the rationale regarding the pseudo-code style and the pre-existing CI lint issue. Everything seems in order.
Summary
- Add MGATHER.CAS, a new tileblock memory instruction that performs per-element atomic compare-and-swap from sparse memory (base + offset Tile) into a destination Tile, returning the pre-CAS memory value on every lane so callers can detect failure and retry.
- Register TileOp slot 8 in tma_block/header.md and ship EN/ZH instruction docs covering assembly syntax, parameter table, BSTART.TMA / B.DIM / B.IOT / B.IOR encoding expansion, and C execution model.
- Naming follows the MGATHER.MASK / MSCATTER.MASK dot-suffix convention, keeping the load shape (GATHER) and per-element op (CAS) visually separated and leaving room for future atomic RMW siblings (XCHG / ADD / MIN / MAX).

Test plan
- […] PadValue fill outside the valid region - match the intended atomic semantics.
- Links from tma_block/header.md (EN + ZH) to the new MGATHER.CAS.md resolve in the rendered docs site.
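The per-lane behavior described in the Summary (compare-and-swap at base + offset, with the pre-CAS value returned on every lane) can be illustrated with a small software model in C11. This is only a sketch under stated assumptions, not the execution model shipped in the instruction docs: the function names, the 32-bit element type, element-granular offsets, and the per-lane `expected` / `desired` arrays are all hypothetical, and PadValue fill and tile shapes are not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one CAS per lane at base[offset[i]].
 * dst[i] always receives the value that was in memory before the CAS,
 * whether or not the swap happened. Operand layout is an assumption. */
static void mgather_cas_model(_Atomic uint32_t *base,
                              const size_t *offset,     /* element offsets, one per lane */
                              const uint32_t *expected, /* compare values */
                              const uint32_t *desired,  /* swap values */
                              uint32_t *dst,
                              size_t lanes)
{
    assert(base && offset && expected && desired && dst);
    for (size_t i = 0; i < lanes; ++i) {
        uint32_t old = expected[i];
        /* Success: memory becomes desired[i], old still equals expected[i].
         * Failure: memory is unchanged, old is overwritten with its value.
         * In both cases old holds the pre-CAS memory value. */
        atomic_compare_exchange_strong(&base[offset[i]], &old, desired[i]);
        dst[i] = old;
    }
}

/* A lane succeeded exactly when the returned value equals what the
 * caller expected, so failed lanes can be found by comparison. */
static size_t count_failed_lanes(const uint32_t *dst,
                                 const uint32_t *expected,
                                 size_t lanes)
{
    size_t failed = 0;
    for (size_t i = 0; i < lanes; ++i) {
        if (dst[i] != expected[i]) {
            ++failed;
        }
    }
    return failed;
}
```

A caller in this model would compare `dst` against `expected` after each call and reissue only the lanes that differ, typically using the returned value as the new compare value.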