sort/topk: THISTOGRAM radix-select top-K (gfrun+gfsim PASS)#13

Open
liujiang833 wants to merge 2 commits into main from topk-thistogram-pr

@liujiang833

Collaborator

Summary

Reimplements sort/topk as a pure tile-op radix-select using the new THISTOGRAM tile instruction, replacing the old SIMT per-bucket histogram scans. Runs end-to-end on both LinxCoreModel simulators.

Stacked on #12 (control pure-tile rewrite) — it reuses linx_print.h (linxi_put) and the tracked include/ tile-op API from that PR. Base this PR on control-puretile-hashtable; retarget to main once #12 merges.

Approach (radix-select the K-th-largest value)

  • Phase 1 — cumulative high-byte (Byte1) histogram over all 131072 inputs via THISTOGRAM (no filter) → high byte of the K-th largest.
  • Phase 3 — cumulative low-byte (Byte0) histogram via THISTOGRAM, filtered to high == kth_bin (the idx tile supplies the high-byte prefix filter) → low byte.
  • threshold = (kth_bin<<8 | low8) is, by definition, the value of the K-th largest element.
  • O(1) verification: g_expected is sorted descending, so g_expected[kTopK-1] is the K-th largest — compare it to threshold. (The previous O(kTopK²) host sort made the kernel infeasible on the cycle-accurate model.)
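
The two-pass selection above can be sketched on the host as follows. This is a minimal reference under stated assumptions, not the code in topk.hpp: plain loops stand in for THISTOGRAM, the histograms are accumulated on the host rather than produced cumulatively by the tile op, and the names `kth_largest_u16` and `matches_expected` are hypothetical. It assumes 16-bit inputs, as implied by the Byte1/Byte0 split.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference: returns the K-th largest 16-bit value.
// k is 1-based and must satisfy 1 <= k <= data.size().
std::uint16_t kth_largest_u16(const std::vector<std::uint16_t>& data, std::size_t k) {
    assert(k >= 1 && k <= data.size());

    // Pass 1: histogram of the high byte (Byte1) over all inputs, no filter.
    std::array<std::size_t, 256> high_hist{};
    for (std::uint16_t v : data) ++high_hist[v >> 8];

    // Walk bins from the largest value down until k elements are covered.
    std::size_t above = 0;  // elements whose high byte is strictly above kth_bin
    int kth_bin = 255;
    for (; kth_bin > 0; --kth_bin) {
        if (above + high_hist[kth_bin] >= k) break;
        above += high_hist[kth_bin];
    }

    // Pass 2: histogram of the low byte (Byte0), filtered to high == kth_bin.
    std::array<std::size_t, 256> low_hist{};
    for (std::uint16_t v : data) {
        if ((v >> 8) == kth_bin) ++low_hist[v & 0xff];
    }

    // Within the selected high bin, find the (k - above)-th largest low byte.
    const std::size_t remaining = k - above;
    std::size_t seen = 0;
    int low8 = 255;
    for (; low8 > 0; --low8) {
        seen += low_hist[low8];
        if (seen >= remaining) break;
    }

    return static_cast<std::uint16_t>((kth_bin << 8) | low8);
}

// O(1) check against a reference array that is already sorted descending:
// the K-th largest element sits at index k - 1.
bool matches_expected(std::uint16_t threshold,
                      const std::vector<std::uint16_t>& expected_desc,
                      std::size_t k) {
    return expected_desc[k - 1] == threshold;
}
```

With a descending reference array, `matches_expected(kth_largest_u16(data, k), expected, k)` is a single comparison, which is the reason the verification no longer needs a host-side sort.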

topk.hpp uses a self-contained THISTOGRAM_FIXED inline-asm wrapper with correct operand numbering, so it does not depend on the toolchain-bundled THISTOGRAM template (which has an off-by-one operand bug).

Validation

  • gfrun (functional): exit 0, ~47s.
  • gfsim (cycle-accurate): exit 0 with -s core.singleTierMode=true, ~89s; threshold = 0xfc10 == g_expected[2047].

Requires (model side)

THISTOGRAM is a new tile op — needs LinxCoreModel support (THISTOGRAM tile op + the ADDTPC full-PC and srcRType encoding fixes). Tracked in the companion LinxCoreModel issue.

🤖 Generated with Claude Code

@liujiang833

Collaborator Author

Model-side changes tracked in LinxISA/LinxCoreModel#21 (THISTOGRAM tile op + ADDTPC full-PC fix + srcRType encoding fix).

@liujiang833

Collaborator Author

Correction: model-side changes are consolidated in LinxISA/LinxCoreModel#20 (THISTOGRAM modeling + the ADDTPC full-PC and srcRType fixes). #21 was closed as a duplicate.

Reimplement topk with the THISTOGRAM tile op instead of SIMT per-bucket scans:
- topk.hpp: cumulative byte histograms over all 131072 inputs via THISTOGRAM
  (Byte1 high pass, then Byte0 low pass filtered to high==kth_bin via the idx
  tile). Self-contained THISTOGRAM_FIXED inline-asm wrapper with correct operand
  numbering (the toolchain-bundled template has an off-by-one operand bug).
- topk.cpp: radix-select the K-th-largest value threshold from the two
  histograms; verify O(1) against g_expected[kTopK-1] (the K-th largest by
  construction); output via linxi_put, result in exit code. (The previous
  O(kTopK^2) host sort made the kernel infeasible on the cycle-accurate model.)

Validated: gfrun exit 0 (~47s); gfsim exit 0 with -s core.singleTierMode=true
(~89s); threshold 0xfc10 == g_expected[2047]. Requires LinxCoreModel THISTOGRAM
support (THISTOGRAM tile op + ADDTPC/srcRType fixes) -- see LinxCoreModel issue.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…apper)

The tileop-api is owned by the compiler. Remove topk's self-contained
THISTOGRAM_FIXED inline-asm copy and call the standard THISTOGRAM wrapper
from tileop-api directly.

This makes topk depend on the upstream fix for tileop-api's THISTOGRAM
off-by-one operand numbering (template_asm.hpp), tracked in LinxISA/llvm-project
-> topk is BLOCKED on upstream until that fix ships.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@liujiang833

Collaborator Author

Blocked on upstream: topk now calls the compiler tileop-api's standard THISTOGRAM wrapper (self-contained THISTOGRAM_FIXED copy removed — the compiler owns tileop-api). It depends on the upstream fix for the THISTOGRAM off-by-one operand numbering, tracked in LinxISA/llvm-project#27 (item 1). Marked draft until that fix ships upstream.

@liujiang833 liujiang833 marked this pull request as ready for review June 27, 2026 08:55
@liujiang833

Collaborator Author

Unblocked: the THISTOGRAM operand fix is merged upstream (LinxISA/llvm-project#27, Linx-TileOP-API@75eaf77). Verified end-to-end on the rebuilt toolchain — topk compiles with the standard tileop-api THISTOGRAM wrapper (no self-contained copy), gfrun exit 0 (R2=0, threshold==expected) and gfsim exit 0 (-s core.singleTierMode=true). Marking ready for review.

Base automatically changed from control-puretile-hashtable to main June 29, 2026 11:44