[codex] Encode PTO v0.57 tile blocks in the Linx compiler #29

Merged
zhoubot merged 4 commits into main from linx-isa-0.57 on Jul 1, 2026
@zhoubot (Collaborator) commented on Jul 1, 2026

Summary

Implements the Linx compiler side of the PTO v0.57 block contract:

  • emits split B.ITP source and B.OTA destination tile descriptors
  • updates v0.57 TMA/CUBE/FIXP/TEPL opcode ids and merged-form handling
  • adds/updates PTO intrinsics and clang driver plumbing for the v0.57 TileOP API
  • adds FileCheck coverage for FIXP/TINSERT/TTRANS and tile block body lowering

Validation

  • ninja -C compiler/llvm/build-linxisa-clang clang llc -j8
  • llc -mtriple=linx64 -O2 < llvm/test/CodeGen/LinxISA/v057-fixp-tinsert-ttrans.ll | FileCheck llvm/test/CodeGen/LinxISA/v057-fixp-tinsert-ttrans.ll
  • SuperNPUBench full compile from the superproject: 49 LinxISA ELFs, 56 PTO ISA ELFs, 0 failure markers

Notes

v0.57 intentionally does not encode sync, communication, pipe lifecycle, or PTO IR-only operations. TGET_SCALE_ADDR, GET_VAL, and SET_VAL are not introduced as hardware instructions.

The compiler now exposes the v0.57 split tile descriptor model, updated PTO intrinsics, BSTART class handling, and lowering/printing paths needed by SuperNPUBench and the runtime probes.
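
For readers new to the split descriptor model, here is a minimal sketch of the idea, not the backend's actual emitter. The `TileOperand` record, the `splitDescriptors` helper, and the single-operand `B.ITP T<n>` / `B.OTA T<n>` text are assumptions made for illustration; the PR text does not show the real descriptor fields. The only property taken from this PR is that sources go to B.ITP, destinations go to B.OTA, and no combined B.IOT line is produced.

```cpp
#include <string>
#include <vector>

// Hypothetical operand record; the real backend's descriptor fields are not
// shown in this PR.
struct TileOperand {
  unsigned tileId; // encoded TileReg id
  bool isDest;     // true for a destination tile, false for a source tile
};

// Returns one descriptor line per operand, in operand order: B.ITP for
// sources and B.OTA for destinations. A combined B.IOT form is never emitted.
std::vector<std::string> splitDescriptors(const std::vector<TileOperand> &ops) {
  std::vector<std::string> out;
  for (const TileOperand &op : ops) {
    std::string line = op.isDest ? "B.OTA" : "B.ITP";
    line += " T" + std::to_string(op.tileId);
    out.push_back(line);
  }
  return out;
}
```

A two-source, one-destination tile op would therefore yield two B.ITP lines and one B.OTA line under this sketch.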

Constraint: v0.57 uses B.ITP for sources and B.OTA for destinations instead of combined B.IOT descriptors

Rejected: Keep TSTORE_FP and similar forms as independent opcode ids | equivalent forms are selected by form bits under one encoding row

Confidence: high

Scope-risk: broad

Directive: Do not reintroduce encoding for sync, communication, pipe lifecycle, or PTO IR-only operations in the hardware block map

Tested: ninja -C compiler/llvm/build-linxisa-clang clang llc -j8

Tested: llc/FileCheck compiler/llvm/llvm/test/CodeGen/LinxISA/v057-fixp-tinsert-ttrans.ll

Tested: SuperNPUBench compile_all.sh all generated 49 LinxISA and 56 PTO ISA ELFs
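
The "Rejected" trailer above says equivalent forms share one encoding row and are distinguished by form bits. A rough sketch of that scheme follows. The field positions, widths, the `StoreForm` values, and the helper names are all hypothetical; the actual v0.57 bit layout is not given in this PR.

```cpp
#include <cstdint>

// Hypothetical layout: bits [7:0] hold the opcode row id and bits [9:8] hold
// the form selector. The real v0.57 field positions are not stated in this PR.
constexpr std::uint32_t kOpcodeMask = 0xFF;
constexpr unsigned kFormShift = 8;
constexpr std::uint32_t kFormMask = 0x3;

// Illustrative form values for a store row (plain vs. an FP variant).
enum class StoreForm : std::uint32_t { Plain = 0, FP = 1 };

// Packs a row id and a form selector into one word.
constexpr std::uint32_t encodeRow(std::uint32_t opcodeRow, StoreForm form) {
  return (opcodeRow & kOpcodeMask) |
         ((static_cast<std::uint32_t>(form) & kFormMask) << kFormShift);
}

// Extracts the row id; identical for every form of the same row.
constexpr std::uint32_t opcodeOf(std::uint32_t word) {
  return word & kOpcodeMask;
}

// Extracts the form selector.
constexpr StoreForm formOf(std::uint32_t word) {
  return static_cast<StoreForm>((word >> kFormShift) & kFormMask);
}
```

The point of the design is that a decoder matches on the row id once and then branches on the form, rather than carrying a separate opcode id per variant.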

@gemini-code-assist (Bot) left a comment

Code Review

This pull request updates the LinxISA target in LLVM and Clang to support the LinxISA v0.57 specification, introducing new builtins, intrinsics, and the -mlxbc target flag, alongside updating the assembly parser, disassembler, and instruction printer to support new descriptor formats like B.ITP and B.OTA. Additionally, support for half-precision float types (f16/bf16) has been added. Feedback on these changes highlights a mismatch in LinxISABlockify.cpp where the generated v.sw.local instruction conflicts with the test expectations of v.sw.brg.local. There are also several style violations across LinxISABlockify.cpp and LinxISAISelDAGToDAG.cpp where tab characters were used instead of spaces, violating LLVM coding standards.


Comment on lines +1654 to +1658
" v.lw.local [ta, lc0<<2, lc1<<8], ->vt.w\n"
" v.lw.local [tb, lc0<<2, lc1<<8], ->vu.w\n"
" v.add vt#1.sw, vu#1.sw, ->vt.w\n"
" v.sw.local vt#2, [to, lc0<<2, lc1<<8]\n"
" C.BSTOP\n";


high

There is a mismatch between the generated assembly instruction v.sw.local and the test assertion in vpar-tile-binop-body.ll which expects v.sw.brg.local. Additionally, these lines are indented with tabs, which violates the LLVM Coding Standards (indentation must use spaces). Let's update the instruction to v.sw.brg.local and fix the indentation.

          "  v.lw.local [ta, lc0<<2, lc1<<8], ->vt.w\n"
          "  v.lw.local [tb, lc0<<2, lc1<<8], ->vu.w\n"
          "  v.add vt#1.sw, vu#1.sw, ->vt.w\n"
          "  v.sw.brg.local vt#2, [to, lc0<<2, lc1<<8]\n"
          "  C.BSTOP\n";

Comment on lines +1688 to +1692
" v.lw.local [ta, lc0<<2, lc1<<8], ->vt.w\n"
" v.lw.local [tb, lc0<<2, lc1<<8], ->vu.w\n"
" v.sub vt#1.sw, vu#1.sw, ->vt.w\n"
" v.sw.local vt#2, [to, lc0<<2, lc1<<8]\n"
" C.BSTOP\n";


high

There is a mismatch between the generated assembly instruction v.sw.local and the test assertion in vpar-tile-binop-body.ll which expects v.sw.brg.local. Additionally, these lines are indented with tabs, which violates the LLVM Coding Standards (indentation must use spaces). Let's update the instruction to v.sw.brg.local and fix the indentation.

          "  v.lw.local [ta, lc0<<2, lc1<<8], ->vt.w\n"
          "  v.lw.local [tb, lc0<<2, lc1<<8], ->vu.w\n"
          "  v.sub vt#1.sw, vu#1.sw, ->vt.w\n"
          "  v.sw.brg.local vt#2, [to, lc0<<2, lc1<<8]\n"
          "  C.BSTOP\n";

Comment on lines +2222 to +2225
case MVT::f16:
case MVT::bf16:
Opc = LinxISA::LHU_PCR;
break;


medium

These lines are indented with tabs, which violates the LLVM Coding Standards. Let's use spaces for indentation instead.

      case MVT::f16:
      case MVT::bf16:
        Opc = LinxISA::LHU_PCR;
        break;

Comment on lines +2388 to +2391
case MVT::f16:
case MVT::bf16:
Opc = LinxISA::SH_PCR;
break;


medium

These lines are indented with tabs, which violates the LLVM Coding Standards. Let's use spaces for indentation instead.

      case MVT::f16:
      case MVT::bf16:
        Opc = LinxISA::SH_PCR;
        break;
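
For context on the two selector hunks above, a self-contained sketch of the same pattern follows. `ValueType` and `Opcode` are stand-ins for LLVM's `MVT` and the `LinxISA` opcode enum, and the helper names are hypothetical. Only the f16/bf16 rows come from the quoted diff; how other types are handled is not shown in this review, so the sketch simply reports "no match" for them.

```cpp
// Stand-ins for LLVM's MVT value types and the LinxISA opcode enum. Only the
// f16/bf16 mappings below are taken from the diff quoted in the review.
enum class ValueType { f16, bf16, Other };
enum class Opcode { LHU_PCR, SH_PCR };

// Picks the PCR load opcode for a value type. Returns false when this sketch
// has no mapping for the type.
bool selectPCRLoad(ValueType vt, Opcode &opc) {
  switch (vt) {
  case ValueType::f16:
  case ValueType::bf16:
    opc = Opcode::LHU_PCR; // both half-precision types share one load opcode
    return true;
  default:
    return false;
  }
}

// Picks the PCR store opcode for a value type, mirroring the load selector.
bool selectPCRStore(ValueType vt, Opcode &opc) {
  switch (vt) {
  case ValueType::f16:
  case ValueType::bf16:
    opc = Opcode::SH_PCR; // both half-precision types share one store opcode
    return true;
  default:
    return false;
  }
}
```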

RuoyuZhou added 3 commits July 1, 2026 13:46
The v0.57 PR review found mismatched bridge-local vector stores and style drift in the PCR load/store selector cases. This keeps the emitted assembly in the Linx block shape expected by downstream FileCheck and descriptor tests.

Constraint: Review feedback requires vtile add/sub bodies to use bridge-local stores.

Confidence: high

Scope-risk: narrow

Tested: ninja -C build-linxisa-clang clang llc llvm-mc llvm-objdump -j8

Tested: llc/FileCheck llvm/test/CodeGen/LinxISA/vpar-tile-binop-body.ll

Tested: llc/FileCheck llvm/test/CodeGen/LinxISA/v057-fixp-tinsert-ttrans.ll
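
The add and sub bodies flagged in review differ only in the vector op mnemonic. As a sketch (not the code in LinxISABlockify.cpp), the shared text could be produced by one helper so the bridge-local store cannot drift between the two. The function name `makeTileBinopBody` is hypothetical; the assembly lines are the ones quoted in the review suggestions.

```cpp
#include <string>

// Builds the vtile binop body text quoted in the review, parameterized by the
// vector op mnemonic ("v.add" or "v.sub"). The store uses the bridge-local
// form v.sw.brg.local that vpar-tile-binop-body.ll expects.
std::string makeTileBinopBody(const std::string &op) {
  return "  v.lw.local [ta, lc0<<2, lc1<<8], ->vt.w\n"
         "  v.lw.local [tb, lc0<<2, lc1<<8], ->vu.w\n"
         "  " + op + " vt#1.sw, vu#1.sw, ->vt.w\n"
         "  v.sw.brg.local vt#2, [to, lc0<<2, lc1<<8]\n"
         "  C.BSTOP\n";
}
```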

The v0.57 block contract uses absolute 8-bit TileReg ids with TZERO reserved as source-only id 0. Existing CodeGen coverage only checked that B.ITP and B.OTA appeared, so this adds exact-id checks for compiler-generated descriptors and MC coverage for the full explicit tile#255 namespace plus TZERO destination rejection.

Constraint: Keep this follow-up test-only after reviewing absolute TileReg allocation support.

Confidence: high

Scope-risk: narrow

Tested: git diff --check

Tested: llc v057-fixp-tinsert-ttrans.ll | FileCheck

Tested: llc vpar-tile-binop-body.ll | FileCheck

Tested: llvm-mc/llvm-objdump/FileCheck v057-tile-descriptor-roundtrip.s

Tested: not llvm-mc/FileCheck v057-tile-descriptor-errors.s

Not-tested: llvm-lit direct invocation; source-tree lit config is missing enable_profcheck in this checkout
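
The id rules exercised by these tests can be summarized in a few lines. This is a sketch of the contract as described in the commit message (8-bit absolute ids, id 0 is TZERO and source-only); the constant and function names are hypothetical and are not taken from the assembler's actual validation code.

```cpp
// Source-only zero tile per the v0.57 contract described above.
constexpr unsigned kTZeroId = 0;
// Largest id in the 8-bit absolute TileReg namespace.
constexpr unsigned kMaxTileId = 255;

// Any 8-bit id may appear as a B.ITP source, including TZERO.
constexpr bool isValidSourceTile(unsigned id) { return id <= kMaxTileId; }

// A B.OTA destination must fit in 8 bits and must not be TZERO.
constexpr bool isValidDestTile(unsigned id) {
  return id != kTZeroId && id <= kMaxTileId;
}
```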

The v0.57 descriptor surface names the 8-bit TileReg namespace directly as T0 through T255. The assembler now accepts that spelling for B.ITP/B.OTA, and the disassembler/codegen printer emits it as canonical output. T0 remains the source-only zero tile, so generated allocations continue to map compiler TILE0 to T1.

Constraint: v0.57 reserves encoded id 0 as the TZERO source-padding tile.

Constraint: LLVM currently allocates 32 compiler-managed physical tile registers and maps them into T1..T32.

Rejected: Treat T0 as a writable destination | conflicts with the existing v0.57 zero-tile contract and B.OTA validation.

Rejected: Expand LLVM register allocation to 256 physical tile registers in this patch | larger semantic change outside the direct reference syntax correction.

Confidence: high

Scope-risk: narrow

Directive: Do not make T0 destination-capable without updating the ISA contract, B.OTA validation, and downstream QEMU/GFSIM/model behavior together.

Tested: ninja -C build-linxisa-clang clang llc llvm-mc llvm-objdump FileCheck -j8

Tested: llc v057-fixp-tinsert-ttrans.ll piped to FileCheck

Tested: llvm-mc/llvm-objdump v057-tile-descriptor-roundtrip.s piped to FileCheck

Tested: not llvm-mc v057-tile-descriptor-errors.s piped to FileCheck
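
The allocation mapping described in this commit (32 compiler-managed tiles placed at T1..T32, with T0 kept as the zero tile) amounts to an offset of one. A minimal sketch, with hypothetical helper names that do not come from the patch:

```cpp
#include <string>

// Number of compiler-managed physical tile registers, per the commit message.
constexpr unsigned kNumCompilerTiles = 32;

// Maps compiler TILE<n> (0-based) to its encoded id T<n+1>, so that encoded
// id 0 (T0, the source-only zero tile) is never handed out by allocation.
// Returns false for indices outside the compiler-managed range.
bool compilerTileToEncodedId(unsigned compilerIndex, unsigned &encodedId) {
  if (compilerIndex >= kNumCompilerTiles)
    return false;
  encodedId = compilerIndex + 1;
  return true;
}

// Canonical T<n> spelling for an encoded id.
std::string tileName(unsigned encodedId) {
  return "T" + std::to_string(encodedId);
}
```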

@zhoubot merged commit 7035a42 into main on Jul 1, 2026
6 checks passed