x.trailing_zeros() > n is not optimized as well as x & ((1 << n) - 1) == 0 on x86
#43024
Comments
It looks like trailing_zeros currently calls an LLVM intrinsic (cttz) which has a patch posted against it: https://reviews.llvm.org/D9284. Perhaps someone familiar with LLVM (cc @arielb1) could push that through review and then change libcore to drop the conditional it has today: https://github.com/rust-lang/rust/blob/master/src/libcore/num/mod.rs#L1375-L1390.
@Mark-Simulacrum A variant of that patch was implemented via llvm-mirror/llvm@1886c8e in 2016, so we should definitely have it in all LLVM versions we support and can drop that workaround. However, I don't think this is really related to the issue seen here -- LLVM just doesn't recognize this particular pattern (probably because it would be rather odd in C). Godbolt for reference: https://godbolt.org/z/ovqsCg

I'm a bit stumped about the u8 cttz codegen though. If I disable all optimizations, I get:
which directly calls
Yeah, looks like the macro just isn't doing what it's intended to do: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=33539371298f2c808a22ba3cf40db567
Remove u8 cttz hack

This issue has since been fixed in LLVM: llvm-mirror/llvm@1886c8e

Furthermore, this code doesn't actually work, because the 8 literal does not match the $BITS provided from the macro invocation, so effectively this was just dead code.

Ref rust-lang#43024.

What LLVM does is still not ideal for CPUs that only have bsf but not tzcnt; will create a patch for that later.

r? @nagisa
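For readers unfamiliar with this kind of workaround, here is a minimal sketch of a widening trick for counting trailing zeros of a u8. It is an illustration under assumptions, not the removed libcore code (which is not quoted in this thread), and the helper name `trailing_zeros_u8_widened` is hypothetical. The idea is to widen to u16 and set bit 8, so the value passed to the count is never zero and the result for an input of 0 is still 8.

```rust
/// Hypothetical helper (not the libcore implementation): count trailing
/// zeros of a u8 by widening to u16 and setting bit 8. Because bit 8 is
/// always set, the widened value is never zero, and an input of 0 yields 8,
/// matching `0u8.trailing_zeros()`.
pub fn trailing_zeros_u8_widened(x: u8) -> u32 {
    (x as u16 | 0x100).trailing_zeros()
}
```

On a toolchain where LLVM already generates good code for the 8-bit case, calling `x.trailing_zeros()` directly is the simpler choice.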
Partially implemented in https://reviews.llvm.org/D55745. This will handle only
This was fully fixed by https://reviews.llvm.org/D56355, which unfortunately did not make it into the last LLVM update.
There was another LLVM update since then, and this issue is now fully fixed.
The bitshift and bitand version optimizes to

while the trailing_zeros version optimizes to

even though I find the trailing_zeros version much more straightforward.

Should this be reported upstream in LLVM, or is this something a MIR pass should do?
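To make the comparison concrete, here is a small sketch of the two source forms for u32. The function names are hypothetical and the original snippets are not reproduced in this thread. Note one assumption: a mask of the low `n` bits being zero corresponds to `trailing_zeros() >= n`, so the sketch uses `>=`; the `> n` form from the title would match a mask of `n + 1` bits. The shift requires `n < 32`.

```rust
/// Mask form: true when the low `n` bits of `x` are all zero.
/// Assumes n < 32, otherwise the shift overflows.
pub fn low_bits_clear_mask(x: u32, n: u32) -> bool {
    (x & ((1u32 << n) - 1)) == 0
}

/// trailing_zeros form of the same predicate. For x == 0,
/// trailing_zeros() returns 32, so the result is true for any n < 32.
pub fn low_bits_clear_tz(x: u32, n: u32) -> bool {
    x.trailing_zeros() >= n
}
```

Both functions return the same result for every `x` and every `n` in `0..32`, so a compiler is free to lower the second to the same mask-and-compare sequence as the first.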