Pull requests: AMD-AGI/Primus-Turbo
[Attention][Parallelism] Ulysses context-parallel Varlen attention support
#377
opened Jun 10, 2026 by
paulpak58
Contributor
5 of 12 tasks
opt: mxfp4 and mxfp8 dequantize kernel
#376
opened Jun 10, 2026 by
RuibinCheung
Collaborator
5 of 12 tasks
[Op][Normalization] Add zero_centered_gamma to RMSNorm
#375
opened Jun 10, 2026 by
paulpak58
Contributor
5 of 12 tasks
feat: support build on gfx1250
#374
opened Jun 9, 2026 by
RuibinCheung
Collaborator
6 of 12 tasks
feat: add mxfp8 grouped quantize api
#373
opened Jun 9, 2026 by
RuibinCheung
Collaborator
4 of 12 tasks
opt(gemm): add AITER MXFP4 preshuffle fast path
#366
opened Jun 6, 2026 by
jasainio
Contributor
7 of 12 tasks
feat(quantize): enable stochastic rounding on MXFP4 gradients
#365
opened Jun 6, 2026 by
jasainio
Contributor
6 of 12 tasks
feat(quantize): add fused FP8 quantization kernels with amax+scale and cast+transpose
#364
opened Jun 6, 2026 by
jasainio
Contributor
6 of 12 tasks
[feat] flydsl based fp8 per tensor gemm
#356
opened Jun 3, 2026 by
kyle-256
Collaborator
7 of 12 tasks
feat: triton_grouped_gemm: add work-stealing variant with ws_mode API
#353
opened Jun 2, 2026 by
wenchenvincent
4 of 12 tasks
[feat] Add mxfp8 triton grouped gemm support
#349
opened May 30, 2026 by
kyle-256
Collaborator
7 of 12 tasks
feat: ck_grouped_gemm: add work-stealing variant with ws_mode API
#348
opened May 27, 2026 by
wenchenvincent
4 of 12 tasks
feat: update gemm tensorwise default backend on gfx950
#347
opened May 27, 2026 by
RuibinCheung
Collaborator
5 of 12 tasks
chore: remove ck tensorwise pytest skip
#334
opened May 9, 2026 by
RuibinCheung
Collaborator
4 of 12 tasks
[WIP] [Feature] Add Turbo MXFP8 Grouped GEMM (gfx950) for MoE
#330
opened May 7, 2026 by
kyle-256
Collaborator
6 of 12 tasks
feat: add more activation func
#329
opened May 7, 2026 by
RuibinCheung
Collaborator
8 of 9 tasks
opt(gemm): add hipBLASLt algorithm cache and thread-local workspace
#321
opened Apr 30, 2026 by
jasainio
Contributor
6 of 12 tasks
Refactor: moe dispatch combine autotune
#312
opened Apr 24, 2026 by
zhenhuang12
Collaborator
7 of 12 tasks
feat: enable hybrid FP8 dtypes on Triton grouped GEMM backends
#288
opened Apr 15, 2026 by
sarthak-amd
Draft
perf: optimize hipBLASLt grouped GEMM with algo tuning, enable grouped_gemm autotune hipblaslt support
#284
opened Apr 14, 2026 by
kyle-256
Collaborator
feat(benchmark): per-model/GPU batch sizes and vocab projection for GEMM bench
#265
opened Mar 31, 2026 by
Z-Y00
refactor: reorganize moe ops and kernels
#243
opened Mar 5, 2026 by
zhenhuang12
Collaborator