Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

NVPTX: Add f16 SIMD intrinsics #1626

Merged
merged 1 commit into from
Aug 19, 2024

Conversation

kjetilkjeka
Copy link
Contributor

This implements parts of the f16x2 intrinsics as discussed in rust-lang/rust#125440 (comment)

It's unfortunately not possible to test that these intrinsics produce the correct instructions with the current test infrastructure. This was discussed on zulip

I have created assembly tests in my own branch of rust-lang repo that verifies the correct instructions are used. I have added #[assert_instr] attributes nonetheless. I'm not sure if it's desirable to follow up on rust-lang with a PR including the tests or not?

@rustbot
Copy link
Collaborator

rustbot commented Aug 7, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Amanieu (or someone else) some time within the next two weeks.

@kjetilkjeka kjetilkjeka force-pushed the nvptx_f16_simd_intrinsics branch 3 times, most recently from 5d937f5 to dd7e165 Compare August 12, 2024 13:24
@kjetilkjeka kjetilkjeka force-pushed the nvptx_f16_simd_intrinsics branch from dd7e165 to 21ff034 Compare August 12, 2024 16:43
@kjetilkjeka
Copy link
Contributor Author

kjetilkjeka commented Aug 12, 2024

Thanks for reviewing!

I have fixed the two comments from @Amanieu

I have circumvented the naming issue around min/minimum brought up by @CryZe by instead implementing the intrinsics that makes sense to have named as min/max. I can add the minimum/min_nan and maximum/max_nan when the correct naming is confirmed (either here or as follow up).

I have also re-run the tests with my branch containing FileCheck based tests that the correct intrinsics are produced.

I see that there might be yet another slight naming issue. I have named the functions as f16x2_<intrinsic> while the intrinsic (in PTX) is named <intrinsic>.f16x2. What is the desired naming convention here?

@Amanieu
Copy link
Member

Amanieu commented Aug 12, 2024

CUDA seems to have an API for f16 intrinsics, should we be mirroring that API instead?

@kjetilkjeka
Copy link
Contributor Author

CUDA seems to have an API for f16 intrinsics, should we be mirroring that API instead?

I see that the names are consistent with the ptx ISA for functions. I will change all the function names to be consistent with this naming (both CUDA and PTX ISA). I will refrain from adopting the leading double underscore __ as this is a C detail related to not having namespaces. I'm very happy with this solution.

The part I'm a bit more unsure about is naming the type half2. The CUDA C type is, at least in one aspect, closer to something found in core::simd as a big part of the problem it solves is being able to use the same vector instructions both on x86_64 and nvptx64 targets. These, core::arch::nvptx types are only usable on ptx and if it is acceptable to keep f16x2 I think that is the better name (being consistent with the ptx ISA and as a bonus also familiar for Rust users)

@kjetilkjeka kjetilkjeka force-pushed the nvptx_f16_simd_intrinsics branch from 21ff034 to 79cb9fa Compare August 13, 2024 14:04
@kjetilkjeka
Copy link
Contributor Author

I updated the PR a little while ago. I also re-run the rustc assembly tests I have in my branch to verify the correct instructions are still being produced. I also ported a much used function in our (my workplace) internal code and verified that tests were passing. (And there was also a notable performance boost on the relevant benchmarks).

And with that, I think this one should be ready for merging.

@Amanieu Amanieu enabled auto-merge (rebase) August 19, 2024 18:53
@Amanieu Amanieu force-pushed the nvptx_f16_simd_intrinsics branch from 79cb9fa to 4d6fed8 Compare August 19, 2024 18:54
@Amanieu Amanieu merged commit d9466ed into rust-lang:master Aug 19, 2024
31 checks passed
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants