Add inline asm support for amdgpu by Flakebi · Pull Request #149793 · rust-lang/rust

Flakebi · 2025-12-08T23:32:37Z

Add support for inline assembly for the amdgpu backend (the amdgcn-amd-amdhsa target).
Add register classes for `vgpr` (vector general purpose register) and `sgpr` (scalar general purpose register).
The LLVM backend supports two more classes. The first is `reg`, which is either a VGPR or an SGPR, with the choice left to the compiler. As instructions often rely on a register being specifically a VGPR or an SGPR for the assembly to be valid, `reg` doesn’t seem that useful (I struggled to write correct tests for it), so I didn’t end up adding it.
The fourth register class is AGPRs, which only exist on some hardware versions (not the consumer ones). They can only be written and read in restricted ways, which makes it hard to write a Rust variable into them. They could be used inside assembly blocks, but I didn’t add them as a Rust register class.

One change affects general inline assembly code: `InlineAsmReg::name()` now returns a `Cow` instead of a `&'static str`. amdgpu has many registers (256 VGPRs plus combinations of 2 or 4 VGPRs) and I didn’t want to list hundreds of static strings, so the amdgpu reg stores the register number(s) and the register name is generated at runtime as a non-static `String`.
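As a rough illustration of why a `Cow` return type fits here, the sketch below contrasts a target with a handful of statically named registers against a register that only stores its index range. All type names (`SmallTargetReg`, `GpuReg`, `GpuRegKind`, `AnyReg`) are hypothetical stand-ins and not the types used in rustc; only the naming scheme (`s0`, `v0`, `v[0:1]`) is taken from this discussion.

```rust
use std::borrow::Cow;

/// Hypothetical target with few registers: every name is a static string.
#[derive(Clone, Copy)]
enum SmallTargetReg {
    R0,
    R1,
}

/// Hypothetical register kind: scalar ("s") or vector ("v").
#[derive(Clone, Copy)]
enum GpuRegKind {
    Scalar,
    Vector,
}

/// Hypothetical amdgpu-like register: an inclusive range of register indices.
#[derive(Clone, Copy)]
struct GpuReg {
    kind: GpuRegKind,
    first: u16,
    last: u16,
}

/// Hypothetical stand-in for a register type shared by several targets.
enum AnyReg {
    Small(SmallTargetReg),
    Gpu(GpuReg),
}

impl AnyReg {
    /// Static names are borrowed without allocating; range names are built on demand.
    fn name(&self) -> Cow<'static, str> {
        match self {
            AnyReg::Small(SmallTargetReg::R0) => Cow::Borrowed("r0"),
            AnyReg::Small(SmallTargetReg::R1) => Cow::Borrowed("r1"),
            AnyReg::Gpu(r) => {
                let prefix = match r.kind {
                    GpuRegKind::Scalar => 's',
                    GpuRegKind::Vector => 'v',
                };
                if r.first == r.last {
                    // Single register, e.g. "v0".
                    Cow::Owned(format!("{}{}", prefix, r.first))
                } else {
                    // Register range, e.g. "v[0:1]".
                    Cow::Owned(format!("{}[{}:{}]", prefix, r.first, r.last))
                }
            }
        }
    }
}
```

With this shape, existing targets keep returning borrowed static strings at no extra cost, and only the range-based registers pay for an allocation.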

Tracking issue: #135024

rustbot · 2025-12-08T23:32:42Z

r? @eholk

rustbot has assigned @eholk.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2025-12-09T08:23:14Z

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

eholk · 2025-12-09T23:58:51Z

This seems okay to me, but I'd rather someone more familiar with this part of the compiler give the final signoff.

@bors r?

rustbot · 2025-12-09T23:58:53Z

Error: Parsing assign command in comment failed: ...'' | error: specify user to assign to at >| ''...

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #triagebot on Zulip.

eholk · 2025-12-09T23:59:07Z

@bors r? compiler

fee1-dead · 2025-12-10T13:51:19Z

@rustbot reroll

Flakebi · 2025-12-14T15:07:11Z

Removed the return type from the tests to fix a conflict with #149991, which starts disallowing return values in gpu-kernel functions.

chenyukang · 2025-12-19T00:37:11Z

The change seems OK, but I'd like people with more background to take a look.
@rustbot reroll

jdonszelmann · 2026-01-06T01:59:31Z

That's not me (sorry it took me a while because of holidays). But IIRC that could be Amanieu?
r? @Amanieu

rust-bors · 2026-01-09T15:50:05Z

☔ The latest upstream changes (presumably #150866) made this pull request unmergeable. Please resolve the merge conflicts.

Flakebi · 2026-01-09T17:23:28Z

@taiki-e, I saw your mention in the asm tracking issue. amdgpu has named registers (>384 of them), so a clobber_abi would be possible. However, there is currently no stable ABI, so we would need to follow what the LLVM backend does. I think there are no tools to make sure that Rust's and LLVM's understanding of the ABI stay in sync, so that would be asking for subtle breakage. Therefore I think it doesn’t make sense to support clobber_abi with amdgpu at the moment.

Flakebi · 2026-01-14T10:23:13Z

Rebased to fix conflict, no other changes

Edit: kdiff3 didn’t like 🦀 when fixing conflicts 😢, fixed now

wesleywiser · 2026-02-19T15:29:50Z

Hey @Amanieu, are you able to take a look at this PR? Your domain expertise would be really valuable in the review process 🙂

Amanieu · 2026-02-19T22:19:50Z

compiler/rustc_target/src/asm/amdgpu.rs

+    }
+
+    // There are too many conflicts to list
+    pub fn overlapping_regs(self, mut _cb: impl FnMut(AmdgpuInlineAsmReg)) {}


This still needs to be implemented for correctness, otherwise it might be possible to cause an LLVM assert or crash from Rust code.

Flakebi · 2026-02-20

Ah, I misinterpreted overlapping_regs before: most targets explicitly list all conflicts between all registers, and I estimated it would take some work and memory to store those for amdgpu (164 KiB if I did the math correctly).
Now I see it’s actually just the conflicts for a single register, which should be 30 registers at most.

Edit: Oh, wait, probably more, as there are SIMD registers (see my reply on your next comment).
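To make the per-register view of conflicts concrete, here is a minimal sketch that enumerates the registers overlapping a single VGPR operand by computing them instead of storing a table. It is not the implementation from this PR: `VgprRange`, `NUM_VGPRS` and `MAX_LEN` are hypothetical, and it assumes that every range length from 1 to 32 consecutive registers is a valid operand, which may not hold for the real backend.

```rust
/// Hypothetical model of a VGPR operand: `len` consecutive 32-bit
/// registers starting at index `first` (v0 is {0, 1}, v[0:1] is {0, 2}).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct VgprRange {
    first: u16,
    len: u16,
}

/// Number of VGPRs, as stated in the PR description.
const NUM_VGPRS: u16 = 256;
/// Assumption: ranges of up to 32 registers exist (a [32 x i32] vector).
const MAX_LEN: u16 = 32;

impl VgprRange {
    /// Two ranges overlap if they share at least one 32-bit register.
    fn overlaps(self, other: VgprRange) -> bool {
        self.first < other.first + other.len && other.first < self.first + self.len
    }

    /// Calls `cb` for every other range that shares a register with `self`.
    /// Assumes `self` lies within the register file.
    fn overlapping_regs(self, mut cb: impl FnMut(VgprRange)) {
        for len in 1..=MAX_LEN {
            // A candidate of length `len` overlaps `self` exactly when its
            // first index lies in [self.first - (len - 1), self.first + self.len - 1],
            // clamped so that the candidate stays inside the register file.
            let lo = self.first.saturating_sub(len - 1);
            let hi = (self.first + self.len - 1).min(NUM_VGPRS - len);
            for first in lo..=hi {
                let other = VgprRange { first, len };
                if other != self {
                    cb(other);
                }
            }
        }
    }
}
```

Calling `overlapping_regs` on v[0:1] reports v1 among the conflicts, while calling it on v0 does not, which is the kind of distinction a conflict check has to capture.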

Amanieu · 2026-02-19T22:44:35Z

compiler/rustc_target/src/asm/amdgpu.rs

+        sgpr,
+        vgpr,


If my understanding is correct, a single GPR can only hold 32 bits and you need pairs to get 64-bit values.

So really, the register classes here should be more like the nvptx ones, with separate classes for single registers, pairs and half registers since they all support different kinds of types.

I would expect something like:

vgpr{16,32,64}

sgpr{16,32,64}

Maybe with 128-bit variants if i128 support is really needed.

Flakebi · 2026-02-20

Hm, they can be used as SIMD/vector registers as well (I added support in #149994), up to [32 x i32].
So the types would be something like vgpr{16,32,64,96,128,160,192,224,256,288,320,352,384,416,448,480,512,544,576,608,640,672,704,736,768,800,832,864,896,928,960,992,1024} (a few might not be valid, but the LLVM backend is moving in a direction to support more and more of these).
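The arithmetic behind that list can be sketched as follows, assuming (as stated in the review comment) that one GPR holds 32 bits. The helper names `regs_needed` and `candidate_widths` are hypothetical and only serve to illustrate the width-to-register-count mapping; they say nothing about which widths the backend actually accepts.

```rust
/// Width of a single general purpose register in bits.
const GPR_BITS: u32 = 32;

/// Number of consecutive 32-bit registers needed to hold a value of
/// `bits` bits. A 16-bit value still occupies (half of) one register.
fn regs_needed(bits: u32) -> u32 {
    (bits + GPR_BITS - 1) / GPR_BITS
}

/// Candidate widths from the list above: 16 bits, then every multiple
/// of 32 bits up to 32 registers (1024 bits, i.e. [32 x i32]).
fn candidate_widths() -> Vec<u32> {
    std::iter::once(16)
        .chain((1..=32).map(|n| n * GPR_BITS))
        .collect()
}
```

For example, an i64 needs `regs_needed(64)` registers, that is a pair, and a [32 x i32] vector needs 32 of them.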

x86 xmm registers seem to support multiple sizes for a single register class, so maybe amdgpu registers can be modeled like them?

It seems like LLVM and Rust are somewhat forgiving with non-matching types?
E.g. I can assign an i32 to a 16-bit register in x86 (it compiles fine, though obviously the upper 16 bits of y are garbage afterwards):

    let x: i32 = 5;
    let y: i32;
    unsafe {
        asm!("mov cx, ax", lateout("cx") y, in("ax") x);
    }

Similarly, assigning an i64 to a 32-bit amdgpu sgpr seems to compile fine as well:

check_reg!(s0_i642 i64 "s0" "s_mov_b32");

Amanieu · 2026-02-21

It might compile, but the generated asm is definitely incorrect, and this may cause LLVM assert failures in the future even if it doesn't today.

xmm supports multiple sizes because, fundamentally, xmm0, ymm0 and zmm0 are just different names for the same register. This is not the case for amdgpu, since v0 and v[0:1] are very distinct: for example, the latter overlaps v1 but the former doesn't. So these really need to be separate register classes.

rustbot · 2026-02-20T09:08:04Z

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

Flakebi · 2026-02-20T09:17:09Z

Thanks for the review!
I implemented overlapping_regs. It should return about 24 KiB for very large registers, which still seems reasonable.
I realize that the vector type PR overtook this one, so vector types in inline asm are in a half-baked state. Most of the implementation should be generic enough to support them, though the possible types are not exposed and there are no tests yet.
Let me know if you prefer adding vector type support in a separate PR after this one.

rustbot assigned eholk Dec 8, 2025

Flakebi mentioned this pull request Dec 8, 2025

Tracking Issue for amdgcn target #135024

Flakebi force-pushed the inline-asm branch from def10ff to b46e3b6 Compare December 9, 2025 08:23

eholk approved these changes Dec 9, 2025

rustbot assigned fee1-dead and unassigned eholk Dec 9, 2025

rustbot assigned chenyukang and unassigned fee1-dead Dec 10, 2025

ZuseZ4 reviewed Dec 10, 2025

tests/assembly-llvm/asm/amdgpu-types.rs (outdated, resolved)

Flakebi force-pushed the inline-asm branch 2 times, most recently from bdb726b to 9db5dca Compare December 14, 2025 15:07

rustbot assigned jdonszelmann and unassigned chenyukang Dec 19, 2025

rustbot assigned Amanieu and unassigned jdonszelmann Jan 6, 2026

taiki-e mentioned this pull request Jan 9, 2026

Tracking Issue for asm_experimental_arch #93335

Flakebi force-pushed the inline-asm branch from 9db5dca to 37beb7f Compare January 14, 2026 10:22

Flakebi force-pushed the inline-asm branch from 37beb7f to d6e7c7d Compare January 14, 2026 20:56

Amanieu reviewed Feb 19, 2026

Flakebi force-pushed the inline-asm branch from d6e7c7d to 2bcb72e Compare February 20, 2026 09:08
