[CuTe,Fwd,Sm100] refactor mla sm100 forward and add page table by jayhshah · Pull Request #2558 · Dao-AILab/flash-attention

jayhshah · 2026-05-13T00:42:39Z

Supersedes #2468.

We refactor the kernel file to remove duplication around splitting V and add page table support.
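As a rough illustration of what page table support means for the KV cache, the NumPy sketch below gathers one sequence's rows from a pool of fixed-size pages. The function name `gather_paged_kv`, the `(num_pages, page_size, d)` layout, and the one-dimensional per-sequence page table are assumptions made for this example; they are not taken from the kernel in this PR.

```python
import numpy as np

def gather_paged_kv(kv_pages, page_table, seqlen):
    """Reassemble a contiguous (seqlen, d) KV tensor from a paged cache.

    kv_pages:   (num_pages, page_size, d) pool of physical pages (assumed layout).
    page_table: 1-D int array; entry i is the physical page holding logical page i.
    seqlen:     number of valid rows for this sequence.
    """
    page_size = kv_pages.shape[1]
    num_pages_needed = -(-seqlen // page_size)  # ceiling division
    # Fancy indexing pulls the physical pages in logical order.
    pages = kv_pages[page_table[:num_pages_needed]]
    # Flatten the pages into rows and drop the padding in the last page.
    return pages.reshape(-1, kv_pages.shape[2])[:seqlen]
```

A kernel would typically perform this lookup per tile while loading, rather than materialising the gathered tensor; the sketch only shows the indexing.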

We also support omitting Q and K (which semantically correspond to q_pe and k_pe). In that case the kernel computes attention according to the formula

O = softmax(scale * (Qv @ V.T)) @ V

as in the DeepSeek v4 core attention.
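For checking outputs against the formula above, a minimal NumPy reference might look like the following. It assumes a single head with `qv` of shape `(seqlen_q, d)` and `v` of shape `(seqlen_k, d)`, and applies no masking; the function name and shapes are hypothetical and chosen only for this sketch.

```python
import numpy as np

def reference_attention_no_pe(qv, v, scale):
    """Reference for O = softmax(scale * (Qv @ V.T)) @ V (single head, no mask).

    qv: (seqlen_q, d) array; v: (seqlen_k, d) array (assumed shapes).
    """
    scores = scale * (qv @ v.T)  # (seqlen_q, seqlen_k)
    # Subtract the row max before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    probs = np.exp(scores)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    return probs @ v  # (seqlen_q, d)
```

Note that V serves as both keys and values here, so the output always has the same feature dimension as V.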

jayhshah · 2026-05-13T00:50:35Z

            ),
            mCuBlockIdxOffsets=(
-                blocksparse_tensors.cu_block_idx_offsets if blocksparse_tensors is not None else None
+                blocksparse_tensors.cu_block_idx_offsets


This change is just to fix the linter error.

jayhshah · 2026-05-13T02:21:53Z

Note that the 64k test in the benchmark script throws an IMA (illegal memory access) on cutlass 4.5.0 with this refactor, but this is due to the issue identified here: NVIDIA/cutlass#3208

jayhshah · 2026-05-13T02:43:33Z

The benchmark shows a ~15% dropoff for the paged case using cp.async loads.

benchmark_mla_paged.txt

refactor mla sm100 forward

5885f40

jayhshah requested a review from Johnsonms May 13, 2026 00:46

jayhshah force-pushed the jshah/mla-paged-kv-refactor branch from e4e954f to 0705919 Compare May 13, 2026 01:29

add benchmark; address deprecation warnings; tweak ptx gemm dispatch

9eaec0a

jayhshah force-pushed the jshah/mla-paged-kv-refactor branch from 0705919 to 9eaec0a Compare May 13, 2026 02:25

jayhshah wants to merge 2 commits into main from jshah/mla-paged-kv-refactor


1 participant
