Refactor attention functions and add autotuning support in Triton by LoserCheems · Pull Request #288 · HKUSTDial/flash-sparse-attention

LoserCheems · 2026-05-12T06:53:07Z

No description provided.

- Introduced `is_autotune` parameter across various functions in `flash_sparse_attn` to enable kernel autotuning.
- Updated backward, forward, and decode functions to conditionally use autotuned kernels or default kernels based on the `is_autotune` flag.
- Adjusted launch configuration parameters for autotuning, setting default tile sizes and warp configurations when autotuning is enabled.
- Enhanced function signatures and documentation to reflect the new `is_autotune` and `skip_checks` parameters for improved performance tuning.
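The flag-based dispatch described in the bullets above can be pictured with a minimal, pure-Python sketch. It is not the code from this PR: `LaunchConfig`, `DEFAULT_CONFIG`, `CANDIDATES`, `pick_config`, and the `cost` callback are hypothetical names, the tile sizes and warp counts are made-up values, and the candidate search is a stand-in for what an autotuner such as Triton's does (benchmark candidates once per problem key, then reuse the winner).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class LaunchConfig:
    """Hypothetical launch parameters: tile sizes and warp count."""
    block_m: int
    block_n: int
    num_warps: int


# Illustrative values only; they are assumptions, not taken from the PR.
DEFAULT_CONFIG = LaunchConfig(block_m=64, block_n=64, num_warps=4)
CANDIDATES = (
    LaunchConfig(64, 64, 4),
    LaunchConfig(128, 64, 4),
    LaunchConfig(128, 128, 8),
)

# Best configuration found so far, keyed by problem shape.
_cache: Dict[Tuple[int, int], LaunchConfig] = {}


def pick_config(
    seqlen: int,
    head_dim: int,
    is_autotune: bool = False,
    cost: Optional[Callable[[LaunchConfig], float]] = None,
    candidates: Sequence[LaunchConfig] = CANDIDATES,
) -> LaunchConfig:
    """Return a fixed default, or the cheapest candidate when autotuning."""
    if not is_autotune:
        # Default path: no search, always the same configuration.
        return DEFAULT_CONFIG
    key = (seqlen, head_dim)
    if key not in _cache:
        if cost is None:
            raise ValueError("autotuning needs a cost function on first use")
        # Search once per key; later calls reuse the cached winner.
        _cache[key] = min(candidates, key=cost)
    return _cache[key]
```

In this sketch the `cost` callback would be a timing measurement of a real kernel launch; a toy function is enough to see the default path, the search path, and the cache behave differently.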

LoserCheems added 5 commits May 11, 2026 18:56

Refactor attention functions to use original query, key, and value tensors for backward saving

6e99bc8

Add autotuning support for dense and sparse kernels in Triton operations

241bc03

Add autotuning configurations for forward and backward kernels in Triton operations

2576b4e

Update benchmark tests to use random input source and enable autotuning options

08b4cbe

LoserCheems merged commit 52a6f4e into main May 12, 2026
1 check passed

LoserCheems merged 5 commits into main from quant
