
feat(tilelang): Add End2End tilelang profile backend and example #241

Open
zihaomu wants to merge 7 commits into AMD-AGI:main from zihaomu:pr3-clean


@zihaomu zihaomu commented May 23, 2026

Summary

This PR is a follow-up to #215 and completes the TileLang profiling path in GEAK.

#215 adds basic TileLang backend support; this PR adds the TileLang-specific profiling workflow that was still missing. With this change, GEAK can run TileLang workloads through an end-to-end correctness/profile/benchmark flow, collect profiling signals in ROCm environments, and use those signals to guide optimization.

This PR also adds a controlled TileLang GEMM optimization example. In our experiment, GEAK sped up the TileLang GEMM example by about 1.5x.

What changed

  • Add end-to-end TileLang profiling support for GEAK.
  • Add TileLang ROCm profiling policy and fallback handling.
  • Add legacy rocprof profiling support for TileLang workloads.
  • Harden TileLang profile degradation behavior so profiling failures do not break the whole optimization flow.
  • Fix structured command profile targets for TileLang profiling.
  • Add a controlled examples/tilelang_gemm task, including:
    • target TileLang GEMM kernel
    • official reference kernel
    • correctness/profile/benchmark harness
    • GEAK-compatible task/config files
    • documentation for running the example manually and with GEAK
  • Keep profile guidance tests independent of TileLang backend availability.
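
The fallback and degradation items above can be illustrated with a minimal sketch. It is not the code in this PR: `profile_with_fallback`, the attempt names, and the returned dictionary are hypothetical, and each attempt stands in for whatever actually invokes a profiler. The point is only the control flow: try the profiling paths in order and hand back `None` plus the collected failure reasons instead of raising, so the optimization flow can continue without profile signals.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical shape: (label, zero-argument callable returning profile data).
ProfileAttempt = Tuple[str, Callable[[], dict]]


def profile_with_fallback(
    attempts: Sequence[ProfileAttempt],
) -> Tuple[Optional[dict], List[str]]:
    """Try each profiling attempt in order and never raise to the caller."""
    failures: List[str] = []
    for name, attempt in attempts:
        try:
            return attempt(), failures
        except Exception as exc:  # deliberately broad: any failure only degrades
            failures.append(f"{name}: {exc}")
    # Every attempt failed: the caller proceeds without profile signals.
    return None, failures


def broken_primary() -> dict:
    # Stand-in for a profiling path that is unavailable in this environment.
    raise RuntimeError("profiler not installed")


def legacy_path() -> dict:
    # Stand-in for a fallback path; the metric name and value are made up.
    return {"kernel_time_us": 12.5}


profile, failures = profile_with_fallback(
    [("primary", broken_primary), ("legacy", legacy_path)]
)
```

Here `profile` holds the data from the fallback attempt, and `failures` records why the first attempt was skipped, which a caller could log.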

Dependency

This PR should be reviewed and merged after #215.

#215 introduces the TileLang backend integration, which the profiling support and the end-to-end TileLang GEMM optimization example in this PR build on.

Validation

The new TileLang GEMM example supports the following harness commands:

  cd examples/tilelang_gemm
  python test_tilelang_gemm_harness.py --correctness
  python test_tilelang_gemm_harness.py --full-benchmark --iterations 2 --warmup 1

  Expected output:

  CORRECTNESS: PASS
  GEAK_RESULT_LATENCY_MS=...
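
A script that consumes this output could read the two lines roughly as follows. This is a sketch, not part of the harness: `parse_harness_output` is a hypothetical helper, the sample latency value is made up, and it assumes the latency is printed as a plain decimal number.

```python
import re
from typing import Optional, Tuple

# Assumption: the latency is a plain decimal, optionally with an exponent.
_NUMBER = r"[0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"


def parse_harness_output(text: str) -> Tuple[bool, Optional[float]]:
    """Return (correctness_passed, latency_ms or None) from harness stdout."""
    passed = re.search(r"^\s*CORRECTNESS:\s*PASS\s*$", text, re.MULTILINE) is not None
    match = re.search(
        rf"^\s*GEAK_RESULT_LATENCY_MS=({_NUMBER})\s*$", text, re.MULTILINE
    )
    latency_ms = float(match.group(1)) if match else None
    return passed, latency_ms


# Made-up sample; real values depend on the hardware and the kernel.
sample = "CORRECTNESS: PASS\nGEAK_RESULT_LATENCY_MS=0.42\n"
passed, latency_ms = parse_harness_output(sample)
```

A missing or non-numeric latency line yields `None` rather than an exception, so a caller can tell "correct but not benchmarked" apart from a failure.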

