SimplifyDemandedBitsForTargetNode - Missing AArch64ISD::BIC & AArch64ISD::BICi handling #53881
Comments
@llvm/issue-subscribers-backend-aarch64 |
@llvm/issue-subscribers-good-first-issue |
Hi @RKSimon, can you expand a little bit what the idea of this work is? Do we expect that it triggers a rewrite? |
@sjoerdmeijer I'll try to find the work I was doing on #53622 and see if I can repro |
@davemgreen @sjoerdmeijer This is the kind of thing I had in mind: https://rust.godbolt.org/z/M6WbxTaYv
define <8 x i16> @haddu_known(<8 x i8> %a0, <8 x i8> %a1) {
%x0 = zext <8 x i8> %a0 to <8 x i16>
%x1 = zext <8 x i8> %a1 to <8 x i16>
%hadd = call <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16> %x0, <8 x i16> %x1)
%res = and <8 x i16> %hadd, <i16 511, i16 511, i16 511, i16 511,i16 511, i16 511, i16 511, i16 511>
ret <8 x i16> %res
}
declare <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16>, <8 x i16>)
->
The AND should be removable, as the uhadd (ISD::AVGFLOORU) node should never have the topmost bits set, but the AND gets converted to an AArch64ISD::BICi node very early, so we need:
|
May I take this one? |
Sure, go for it |
#53622 needs addressing as well if you're interested :) |
Sure |
@RKSimon could you please provide an example similar to https://rust.godbolt.org/z/M6WbxTaYv but with BIC instead of BICi? I don't see how this optimization could be applied to the BIC version without an immediate, sorry. #76644 PTAL |
It wasn't necessarily for HADD etc. But you should be able to at least use KnownBits getMinValue/getMaxValue bounds to work out the known upper zero bits etc. |
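As a rough illustration of the bounds idea above, the following standalone C++ sketch derives known-zero high bits from an unsigned upper bound on a 16-bit lane. It is only a model and does not use LLVM's `KnownBits` class; `avgFloorUMax` and `knownZeroHighBits` are hypothetical helper names chosen for this example.

```cpp
#include <cstdint>

// Hypothetical helper: upper bound of floor((a + b) / 2) given unsigned
// upper bounds on a and b. The sum is formed in 32 bits so it cannot wrap.
inline uint16_t avgFloorUMax(uint16_t maxA, uint16_t maxB) {
  return static_cast<uint16_t>((static_cast<uint32_t>(maxA) + maxB) >> 1);
}

// Hypothetical helper: mask of high bits that must be zero in every
// 16-bit value that is <= maxValue. Any bit at or below the highest set
// bit of maxValue may be set, so only the bits above it are known zero.
inline uint16_t knownZeroHighBits(uint16_t maxValue) {
  uint16_t mask = 0xFFFF;
  while (mask & maxValue)
    mask = static_cast<uint16_t>(mask << 1);
  return mask;
}
```

For the `haddu_known` case, both operands are zero-extended from i8, so each is at most 255; `avgFloorUMax(255, 255)` is 255 and `knownZeroHighBits(255)` is `0xFF00`, meaning the top eight bits of every lane are known to be zero.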
…ndling (#76644)
Fold BICi if all destination bits are already known to be zeroes
```llvm
define <8 x i16> @haddu_known(<8 x i8> %a0, <8 x i8> %a1) {
  %x0 = zext <8 x i8> %a0 to <8 x i16>
  %x1 = zext <8 x i8> %a1 to <8 x i16>
  %hadd = call <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16> %x0, <8 x i16> %x1)
  %res = and <8 x i16> %hadd, <i16 511, i16 511, i16 511, i16 511,i16 511, i16 511, i16 511, i16 511>
  ret <8 x i16> %res
}
declare <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16>, <8 x i16>)
```
```
haddu_known: // @haddu_known
  ushll v0.8h, v0.8b, #0
  ushll v1.8h, v1.8b, #0
  uhadd v0.8h, v0.8h, v1.8h
  bic v0.8h, #254, lsl #8 <-- this one will be removed as we know high bits are zero extended
  ret
```
Fixes #53881
Fixes #53622
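The fold described in this commit message can be modelled outside LLVM as a simple mask check. The sketch below is a standalone approximation for 16-bit lanes, not the code from the patch; `bicClearedBits` and `bicIsRedundant` are hypothetical names, and the known-zero mask is assumed to come from some earlier analysis.

```cpp
#include <cstdint>

// Hypothetical helper: bits that a 16-bit-lane BIC (immediate) clears.
// The instruction encodes an 8-bit immediate and a left shift of 0 or 8.
inline uint16_t bicClearedBits(uint8_t imm8, unsigned shift) {
  return static_cast<uint16_t>(static_cast<uint32_t>(imm8) << shift);
}

// Hypothetical helper: the BIC is a no-op when every bit it would clear
// is already known to be zero in its input.
inline bool bicIsRedundant(uint16_t knownZero, uint8_t imm8, unsigned shift) {
  const uint16_t cleared = bicClearedBits(imm8, shift);
  const uint16_t maybeSet = static_cast<uint16_t>(~knownZero);
  return (cleared & maybeSet) == 0;
}
```

In the example above, `bic v0.8h, #254, lsl #8` clears `0xFE00`, which is the complement of the original AND mask 511 (`0x01FF`). With the top eight bits (`0xFF00`) known zero after the zero-extended `uhadd`, `bicIsRedundant(0xFF00, 254, 8)` holds, so the instruction can be dropped.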
These get lowered quite early, meaning that there are missed opportunities to further simplify the DAG based on the masked bits.
Noticed while looking at Issue #53622