-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Export kernel descriptor for amdgpu kernels #135909
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Conversation
Could not assign reviewer from: |
rustbot has assigned @petrochenkov. Use |
I wonder if we should require I'm not sold on that tho'. Well, in any case, can you add a comment about that? (or implement it, either one) |
@rustbot label +S-waiting-on-author -S-waiting-on-review |
Requiring Independent of that, this PR needs a test once the target is merged. |
cool. |
blocked on #134740 |
Congratulations! It landed! Your reward is more toil. Alas, the futility of humankind. :^) |
Hehe 😄 I added the test that I mentioned in the PR description (and forgot about before). Diff |
This comment has been minimized.
This comment has been minimized.
This PR modifies cc @jieyouxu |
Added a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine modulo a thing
And jieyouxu being happy ofc |
Thanks for the quick review, I fixed both comments (diff) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny test nit, test LGTM otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One test nit re. a bit more context then r=me on the test. You can also flip this PR to mark as ready-for-view via @rustbot ready
.
The host runtime (HIP or HSA) expects a kernel descriptor object for each kernel in the ELF file. The amdgpu LLVM backend generates the object. It is created as a symbol with the name of the kernel plus a `.kd` suffix. Add it to the exported symbols in the linker script, so that it can be found.
@workingjubilee to double-check, are you happy w/ the current iteration of changes? |
Yes. @bors r=jieyouxu,workingjubilee |
Rollup of 5 pull requests Successful merges: - rust-lang#135797 (Import initial generated 1.85 relnotes) - rust-lang#135909 (Export kernel descriptor for amdgpu kernels) - rust-lang#136545 (nvptx64: update default alignment to match LLVM 21) - rust-lang#137092 (abi_unsupported_vector_types: say which type is the problem) - rust-lang#137097 (Ignore Self in bounds check for associated types with Self:Sized) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#135909 - Flakebi:amdgpu-kd, r=jieyouxu,workingjubilee Export kernel descriptor for amdgpu kernels The host runtime (HIP or HSA) expects a kernel descriptor object for each kernel in the ELF file. The amdgpu LLVM backend generates the object. It is created as a symbol with the name of the kernel plus a `.kd` suffix. Add it to the exported symbols in the linker script, so that it can be found. For reference, the symbol is created here in LLVM: https://github.com/llvm/llvm-project/blob/d5457e4c1619e5dbeefd49841e284cbc24d35cb4/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp#L966 I wrote [a test](Flakebi@6a9115b) for this as well, I’ll add that once the target is merged and working. With this, all PRs to get working code for amdgpu are open (this + the target + the two patches adding addrspacecasts for alloca and global variables). Tracking issue: rust-lang#135024 r? `@workingjubilee`
The host runtime (HIP or HSA) expects a kernel descriptor object for each kernel in the ELF file. The amdgpu LLVM backend generates the object. It is created as a symbol with the name of the kernel plus a
.kd
suffix.Add it to the exported symbols in the linker script, so that it can be found.
For reference, the symbol is created here in LLVM: https://github.com/llvm/llvm-project/blob/d5457e4c1619e5dbeefd49841e284cbc24d35cb4/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp#L966
I wrote a test for this as well, I’ll add that once the target is merged and working.
With this, all PRs to get working code for amdgpu are open (this + the target + the two patches adding addrspacecasts for alloca and global variables).
Tracking issue: #135024
r? @workingjubilee