Skip to content

Introduce cache modifiers#158

Merged
mawad-amd merged 37 commits intomainfrom
muhaawad/cache-modifiers
Mar 22, 2026
Merged

Introduce cache modifiers#158
mawad-amd merged 37 commits intomainfrom
muhaawad/cache-modifiers

Conversation

@mawad-amd
Copy link
Collaborator

@mawad-amd mawad-amd commented Sep 11, 2025

Motivation

This PR introduces support for cache modifiers and volatile flags in Iris' distributed memory operations (load, store, get, put).
The goal is to provide fine-grained control over caching behavior at different cache levels. This allows developers to have fine grained control over caching behavoir.

Technical Details

Changes include:

  • Extended load with cache_modifier and volatile options.
  • Extended store with cache_modifier.
  • Extended get and put with both load_cache_modifier and store_cache_modifier.
  • Docstrings updated to document supported values and behavior.
  • Fixed error propagation in tests/run_tests_distributed.py to correctly forward worker exit codes.
  • Added new unit tests covering all cache modifier combinations and volatile option.

Test Plan

I added comprehensive tests under tests/unittests:

  • test_load_cache_modifiers.py – verifies iris.load with all cache modifiers and volatile settings.
  • test_store_cache_modifiers.py – verifies iris.store with all store cache modifiers across ranks.
  • test_get_cache_modifiers.py – verifies iris.get with load/store cache modifier combinations.
  • test_put_cache_modifiers.py – verifies iris.put with load/store cache modifier combinations.

Each test checks correctness by comparing against expected tensors across distributed ranks.

Test Result

See CI

Submission Checklist

mawad-amd and others added 3 commits September 10, 2025 17:42
Signed-off-by: Muhammad Awad <MuhammadAbdelghaffar.Awad@amd.com>
@mawad-amd mawad-amd requested a review from neoblizz as a code owner September 11, 2025 08:08
Copilot AI review requested due to automatic review settings September 11, 2025 08:08
@mawad-amd mawad-amd requested a review from BKP as a code owner September 11, 2025 08:08
@github-actions github-actions bot added in-progress We are working on it iris Iris project issue labels Sep 11, 2025
@mawad-amd mawad-amd changed the title Muhaawad/cache modifiers Introduce cache modifiers Sep 11, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds cache modifier support to the Iris framework's load and store operations. The changes enable fine-grained control over GPU cache behavior during memory operations, allowing developers to optimize performance by specifying how data should be cached across the CU (L1), L2, and LLC levels.

Key changes:

  • Enhanced load() and store() functions with cache_modifier and volatile parameters
  • Added comprehensive test coverage for different cache behaviors (default, writeback, cache global, cache streaming, write-through, and volatile)
  • Updated function documentation to explain cache modifier options and their effects on GPU cache hierarchy

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
iris/iris.py Enhanced load() and store() functions with cache_modifier and volatile parameters, plus detailed documentation
tests/unittests/test_load_cache_modifiers.py New comprehensive test suite covering all cache modifier behaviors and volatile flag

@mawad-amd mawad-amd marked this pull request as draft September 11, 2025 08:19
@mawad-amd mawad-amd marked this pull request as ready for review September 14, 2025 05:54
@mawad-amd mawad-amd requested a review from Copilot September 14, 2025 05:55
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

@neoblizz
Copy link
Member

neoblizz commented Feb 4, 2026

@copilot fix the following tests;

 =========================== short test summary info ============================
  FAILED tests/unittests/test_get_other_triton.py::test_get_other_api[8-dtype0]
  FAILED tests/unittests/test_get_other_triton.py::test_get_other_api[8-dtype1]
  FAILED tests/unittests/test_get_other_triton.py::test_get_other_api[8-dtype2]
  FAILED tests/unittests/test_get_other_triton.py::test_get_other_api[16-dtype0]
  FAILED tests/unittests/test_get_other_triton.py::test_get_other_api[16-dtype1]
  FAILED tests/unittests/test_get_other_triton.py::test_get_other_api[16-dtype2]
  FAILED tests/unittests/test_get_other_triton.py::test_get_other_api[32-dtype0]
  FAILED tests/unittests/test_get_other_triton.py::test_get_other_api[32-dtype1]
  FAILED tests/unittests/test_get_other_triton.py::test_get_other_api[32-dtype2]

Copy link
Contributor

Copilot AI commented Feb 4, 2026

@neoblizz I've opened a new pull request, #357, to work on those changes. Once the pull request is ready, I'll request review from you.

@mawad-amd
Copy link
Collaborator Author

mawad-amd commented Feb 4, 2026

I don't remember if I fixed this or not, but cache modifiers do not work on remote accesses. Tests should not test that. Doc strings need to be updated. I don't think we can assert for that.

@neoblizz
Copy link
Member

neoblizz commented Feb 4, 2026

I don't remember if I fixed this or not, but cache modifiers do not work on remote accesses. Tests should not test that. Doc strings need to be updated. I don't think we can assert for that.

It still is an issue because if you do curr_rank -> curr_rank using iris.load, we should be able to use cache-modifiers there?

@mawad-amd
Copy link
Collaborator Author

I don't remember if I fixed this or not, but cache modifiers do not work on remote accesses. Tests should not test that. Doc strings need to be updated. I don't think we can assert for that.

It still is an issue because if you do curr_rank -> curr_rank using iris.load, we should be able to use cache-modifiers there?

Yes, correct. Local ops should work fine. Remote ops are not allowed to have cache modifiers. Something like this won't work and I was initially testing for it:
iris.load(to_rank=local_rank, from_rank=remote_rank, modifer="anything")

@mawad-amd
Copy link
Collaborator Author

@copilot Please do the following on this PR:

  1. Sync with main: Rebase or merge this branch with the latest main branch to resolve the merge conflicts.

  2. Update tests: Cache modifiers cannot be attached to remote stores. All APIs must respect this constraint. Update all tests so that no test case attempts to perform a remote cache-modified store. Remove or rework any test that does this.

  3. Ensure API consistency: If any API currently allows attaching cache modifiers to remote stores, update it to prevent that — cache modifiers are local-only.

Copy link
Contributor

Copilot AI commented Mar 13, 2026

@mawad-amd I've opened a new pull request, #451, to work on those changes. Once the pull request is ready, I'll request review from you.

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>
@mawad-amd
Copy link
Collaborator Author

@copilot resolve the conflict with main.

Copy link
Contributor

Copilot AI commented Mar 13, 2026

@mawad-amd I've opened a new pull request, #452, to work on those changes. Once the pull request is ready, I'll request review from you.

Co-authored-by: Muhammad Awad <112003944+mawad-amd@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Ryan Swann <109695074+ryanswann-amd@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Muhammad Osama <osama94@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: JoseSantosAMD <87447437+JoseSantosAMD@users.noreply.github.com>
@mawad-amd
Copy link
Collaborator Author

@copilot This branch still has merge conflicts with main. Please merge the latest main branch into this branch and resolve all conflicts.

Copy link
Contributor

Copilot AI commented Mar 13, 2026

@mawad-amd I've opened a new pull request, #453, to work on those changes. Once the pull request is ready, I'll request review from you.

Resolve all 11 conflicts in iris/iris.py by keeping the cache-modifiers
branch version which adds cache_modifier, other, and volatile parameters
to load/store/copy/get/put APIs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mawad-amd
Copy link
Collaborator Author

@copilot This PR needs to be brought to a mergeable state.

  1. Merge main into this branch and resolve conflicts. Keep the cache-modifiers branch changes.

  2. Investigate: do cache modifiers work on remote (cross-GPU) memory operations? Write a small multi-rank test that applies cache modifiers (e.g. .cv, .cs on loads, .wt, .wb on stores) to remote operations and checks for correctness. Compare against the same ops without modifiers. Document your findings — we need to know if modifiers on remote ops silently produce wrong results, hang, or work fine.

  3. Inspect the generated assembly. For each cache modifier (.cv, .cs, .wt, .wb), compile a small kernel and dump the generated ISA (use TRITON_CACHE_DIR or triton.compile). Verify that the assembly instructions actually reflect the requested modifier — e.g. global_load with the correct glc/slc/dlc bits, global_store with glc/slc etc. Report back what you find. We need to confirm Triton is actually emitting the right instructions.

  4. Remove runtime guards around cache_modifier. The store() function currently has an if-statement that skips cache_modifier for remote stores. Remove it — always pass cache_modifier through to tl.store() and tl.load() unconditionally in all functions (load, store, copy, get, put). No branching on rank equality for cache modifiers.

  5. Document cache modifier behavior in docstrings based on your findings from steps 2 and 3. For each function, clearly state which side is local vs remote and where modifiers are safe:

    • load(pointer, to_rank, from_rank): load is local when to_rank == from_rank, remote otherwise.
    • store(pointer, value, from_rank, to_rank): store is local when from_rank == to_rank, remote otherwise.
    • copy(src, dst, from_rank, to_rank): load is from from_rank, store is to to_rank. Either side can be local or remote.
    • get(from_ptr, to_ptr, from_rank, to_rank): load is remote, store is always local.
    • put(from_ptr, to_ptr, from_rank, to_rank): load is always local, store is remote.

    State whether cache modifiers on remote operations work, and that it is the caller's responsibility to use them appropriately.

  6. Fix all tests. Update any test that uses cache modifiers on remote operations if your investigation shows they don't work. Make sure every test passes.

You are running on a self-hosted runner with AMD GPUs. Run the full unittest suite (tests/) and verify everything passes before marking this done.

Copy link
Contributor

Copilot AI commented Mar 21, 2026

@mawad-amd I've opened a new pull request, #469, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 3 commits March 21, 2026 00:36
Co-authored-by: Michael Melesse <micmelesse@gmail.com>
Co-authored-by: Ahmet Artu Yildirim <ahmet.yildirim@amd.com>
Co-authored-by: Ahmet Yildirim <ayildiri@amd.com>
Co-authored-by: jonathou-amd <Jonathan.Ou@amd.com>
Co-authored-by: jonathou-amd <jonathou@amd.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
# Conflicts:
#	iris/experimental/iris_gluon.py
#	iris/iris.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mawad-amd mawad-amd merged commit 144f1b9 into main Mar 22, 2026
138 of 140 checks passed
@mawad-amd mawad-amd deleted the muhaawad/cache-modifiers branch March 22, 2026 02:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

in-progress We are working on it iris Iris project issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants