aarch64: there is no way to use the FPU/SIMD on a softfloat target without creating an ABI mess #110632

Open
RalfJung opened this issue Oct 1, 2024 · 25 comments

Comments

@RalfJung
Contributor

RalfJung commented Oct 1, 2024

On aarch64, if one is building for a softfloat target (in Rust that's e.g. the target aarch64-unknown-none-softfloat, which in particular sets -neon,-fp-armv8 target features by default), there seems to be no way to build some code that does make use of the FPU while remaining ABI-compatible with the rest of the binary: to enable use of the FPU, we have to set +fp-armv8, but this will inevitably also change which registers are being used to pass float arguments around.

On other targets, one can have the target do something like set +soft-float, so even if someone now enables e.g. SSE features on x86 or fpregs on ARM, code will be built with the softfloat ABI and thus be compatible with this target. But aarch64 doesn't seem to have something like +soft-float, meaning it is impossible to disentangle the float ABI from whether FPU instructions can be used.

Is there a specific reason this is currently not possible on aarch64, or is it just that nobody implemented this yet? Or am I missing some other way to generate aarch64 code that uses the FPU but uses a softfloat ABI?

@pinskia

pinskia commented Oct 1, 2024

Turning off simd and/or fp disallows all usage of float types (and vector types).

@RalfJung
Contributor Author

RalfJung commented Oct 1, 2024

What does "disallow" mean? LLVM still happily compiles such code. Rust also generally supports f32/f64 on all targets, and LLVM is perfectly capable of building such code even on aarch64 with the FPU turned off.

(This is an LLVM issue, not a clang issue.)

@pinskia

pinskia commented Oct 1, 2024

> (This is an LLVM issue, not a clang issue.)

Oh right and GCC does not disconnect the target options part from the backend ...
Anyways there is no soft-float ABI defined by Arm. And why would there be?

@pinskia

pinskia commented Oct 1, 2024

softfloat is not a target that makes sense; why LLVM has it, I have no idea.

@pinskia

pinskia commented Oct 1, 2024

Also clang rejects that target:
clang++: error: version 'softfloat' in target triple 'aarch64-unknown-none-softfloat' is invalid

@pinskia

pinskia commented Oct 1, 2024

aarch64-unknown-none-softfloat triple does not exist outside of rust. Why was it added in the first place?

@RalfJung
Contributor Author

RalfJung commented Oct 1, 2024

> Also clang rejects that target:

You could try emulating this on clang by setting the -neon,-fp-armv8 target features. I don't know how to set those flags in clang.

> Anyways there is no soft-float ABI defined by Arm. And why would there be?

LLVM defines one, I can't tell you why. 🤷

@pinskia

pinskia commented Oct 1, 2024

With -march=armv8+nofp, float types are rejected with clang:

<source>:1:7: error: 'f' requires 'float' type support, but ABI 'aapcs' does not support it

@RalfJung
Contributor Author

RalfJung commented Oct 1, 2024

> aarch64-unknown-none-softfloat triple does not exist outside of rust. Why was it added in the first place?

Seems like that happened in rust-lang/rust#64589. Cc @andre-richter @Amanieu
Also Cc some other Rust folks that may know more about the how and why here, @jacobbramley @Darksonn @thejpster

@RalfJung
Contributor Author

RalfJung commented Oct 1, 2024

> With -march=armv8+nofp, float types are rejected with clang:

Okay, maybe this is not possible to reproduce with clang then.

However, LLVM seems to offer support for a softfloat ABI on aarch64 targets -- without providing sufficient control over whether and when that ABI is used. That's fundamentally what this issue is about.

@llvmbot
Member

llvmbot commented Oct 1, 2024

@llvm/issue-subscribers-backend-aarch64


@ostannard
Collaborator

When we added the AArch64 soft-float ABI in clang (#74460, #84146), we deliberately didn't allow the combination of the soft-float ABI with floating-point hardware, to avoid there being multiple, incompatible ABIs for any one target.

However, the ABI used by clang is controlled by the -mabi= option, so you might be able to check how that's implemented and enable that combination in your compiler (is this for rust?). I'd be opposed to adding that option to clang without a very good use-case though, to avoid creating more compatibility issues.

@thejpster

I don't understand the use case. AIUI Linux needs to avoid persisting FPU registers. This then implies a soft float ABI - but that's not the goal, merely an outcome.

Using the FPU means you need to persist the registers so ... why would you still want the soft-float ABI?

This is distinct from Arm 32-bit where the FPU is optional and two ABIs are defined and so compatibility with soft-float libraries is required.

@efriedma-quic
Collaborator

clang currently has two "no-fp" configurations: -mgeneral-regs-only, and -mabi=aapcs-soft. Internally, they're actually the same thing, but -mgeneral-regs-only forbids any call that would pass a floating-point value in general registers. -mgeneral-regs-only is intended for places like the Linux kernel, which don't use any floating-point. -mabi=aapcs-soft is intended for microcontrollers that don't have an FPU.

It's hard for me to imagine a situation where you'd actually want a "soft-float" ABI and access to FP registers, given that all general-purpose AArch64 processors have an FPU. But if there is some use-case, we could look at adding it.

@RalfJung
Contributor Author

RalfJung commented Oct 1, 2024

@efriedma-quic that's a fair question. The ability to enable neon and fp-armv8 on our aarch64-softfloat target was added to Rust a few years ago, and I also don't quite know what the motivation for that was (I wasn't involved in the discussion back then)... but it's a feature we have now, it's hard to take back, so we'll likely have to support it.

We did, however, in the meantime come up with a plan that solves our problem and does not require LLVM changes. It boils down to avoiding the implicit LLVM-defined ABI for float types on the aarch64-softfloat target, and instead always passing them as i32/i64, with type conversions on the caller/callee side to convert from/to float types. This is architecturally a bit harder for us than if we could use the LLVM ABI, but it should work.
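A minimal Rust sketch of the idea described above, written at the source level purely for illustration. The function names `scale_bits` and `scale` are hypothetical and not taken from rustc; the plan in this comment would presumably apply the same conversion inside the compiler's ABI handling rather than in user code. The float values cross the call boundary as `u32` bit patterns and are converted with `f32::to_bits` / `f32::from_bits` on each side.

```rust
// Sketch only: floats cross the call boundary as integer bit patterns,
// so the function signature never mentions f32 and argument passing
// does not depend on whether FP registers are available.

/// Callee side (hypothetical): receives raw bits, converts back to f32
/// inside the body, and returns the result as raw bits again.
#[inline(never)]
pub extern "C" fn scale_bits(x_bits: u32, factor_bits: u32) -> u32 {
    let x = f32::from_bits(x_bits);
    let factor = f32::from_bits(factor_bits);
    (x * factor).to_bits()
}

/// Caller side (hypothetical): converts to bits before the call and
/// back to f32 afterwards.
pub fn scale(x: f32, factor: f32) -> f32 {
    f32::from_bits(scale_bits(x.to_bits(), factor.to_bits()))
}
```

Calling `scale(1.5, 2.0)` goes through the integer-only boundary and yields the same value as a direct `1.5 * 2.0`.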

@efriedma-quic
Collaborator

> It boils down to avoiding the implicit LLVM-defined ABI for float types on the aarch64-softfloat target, and instead always passing them as i32/i64, with type conversions on the caller/callee side to convert from/to float types.

This is going to break down in some edge cases... if you do have users for this, you probably want a real LLVM target feature.

@RalfJung
Contributor Author

RalfJung commented Oct 1, 2024

Which kind of edge cases are you thinking about? We're using this kind of ABI override for a bunch of stuff already, so I don't foresee any problems here -- in particular given that we don't have to be ABI-compatible with anything else since there is no standard softfloat ABI.

@efriedma-quic
Collaborator

varargs (ABI rules require storing parts of floating-point registers when you va_start), intrinsics/libcalls where the backend generates a call.
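To make the libcall case concrete, here is a small Rust sketch. It assumes (this is not stated in the thread) that `%` on `f64` lowers to LLVM's `frem`, which backends typically turn into a call to `fmod`; the function name `wrap_angle` is hypothetical. The point is that no call appears in the source, so a frontend-level ABI override on declared functions has nothing to attach to.

```rust
/// Hypothetical example: wraps an angle using the float remainder operator.
pub fn wrap_angle(degrees: f64) -> f64 {
    // Assumption: on typical LLVM targets this becomes `frem`, which the
    // backend may lower to a call to `fmod`. That backend-generated call
    // follows LLVM's own notion of the target ABI, not an override that
    // the frontend applied to functions it can see.
    degrees % 360.0
}
```

Any operation the backend expands into a runtime call would raise the same question about which ABI that call uses.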

@RalfJung
Contributor Author

RalfJung commented Oct 3, 2024

> It's hard for me to imagine a situation where you'd actually want a "soft-float" ABI and access to FP registers, given that all general-purpose AArch64 processors have an FPU. But if there is some use-case, we could look at adding it.

It seems like one motivation is for a kernel that is generally "nofloat" to temporarily enable use of the FPU in a kernel thread (after saving all the registers that this clobbers), e.g. to execute some cryptographic code more efficiently.

@RalfJung
Contributor Author

RalfJung commented Dec 16, 2024

> intrinsics/libcalls where the backend generates a call.

Indeed we are running into this now (see rust-lang/rust#134375).

And as mentioned above, there is a concrete motivation: the Linux kernel generally uses softfloat ABIs to avoid having to persist FPU registers. However, certain modules (e.g. cryptography, compression) want to make use of SIMD operations, so when those modules are executed, the kernel sets a flag ensuring that for the current kernel thread, FPU registers are persisted, and then jumps to code built with this target feature. We can "be careful" to avoid ABI issues here, but I hope at this point it is generally understood that "being careful" doesn't scale. The proper solution is to consistently use a softfloat ABI for all code, but LLVM currently makes this impossible.

On every other target (I looked at x86-32/64, arm-32, riscv-32/64), LLVM allows independently controlling the float ABI and the availability of hardware float support. It seems like a mere oversight that aarch64 doesn't allow this, and it'd be great to see this fixed.
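As a rough illustration of the kernel pattern described above, and of what "being careful" looks like in practice, here is a host-runnable Rust sketch. Everything in it is hypothetical: `FPU_ENABLED` stands in for the kernel's per-thread flag, `FpuGuard` for whatever begin/end mechanism the kernel uses, and `checksum_with_fpu` for a SIMD-using module entry point. The boundary function only passes a slice and an integer, so no float value crosses a call boundary.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-in for the kernel's "FPU registers are persisted for
// this thread" flag. A real kernel would track this per thread.
static FPU_ENABLED: AtomicBool = AtomicBool::new(false);

/// Hypothetical RAII guard: the FPU counts as usable while it is alive.
pub struct FpuGuard;

impl FpuGuard {
    pub fn begin() -> Self {
        FPU_ENABLED.store(true, Ordering::SeqCst);
        FpuGuard
    }
}

impl Drop for FpuGuard {
    fn drop(&mut self) {
        FPU_ENABLED.store(false, Ordering::SeqCst);
    }
}

/// Reports whether the (hypothetical) flag is currently set.
pub fn fpu_enabled() -> bool {
    FPU_ENABLED.load(Ordering::SeqCst)
}

/// Boundary function: only a slice reference and an integer cross the call,
/// so the choice of float ABI does not matter at this boundary.
pub fn checksum_with_fpu(data: &[u8]) -> u32 {
    let _guard = FpuGuard::begin();
    // Float work stays local to this function body.
    let sum: f32 = data.iter().map(|&b| f32::from(b)).sum();
    sum as u32
}
```

This discipline has to be maintained by hand at every boundary, and it does not cover calls the backend generates on its own, which is why it does not scale.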

@nikic
Contributor

nikic commented Dec 16, 2024

> On every other target (I looked at x86-32/64, arm-32, riscv-32/64), LLVM allows independently controlling the float ABI and the availability of hardware float support. It seems like a mere oversight that aarch64 doesn't allow this, and it'd be great to see this fixed.

On 32-bit arm this is controlled via -float-abi, on RISC-V this is controlled by -target-abi. Can you share how this is possible to do on x86? If I try something like +soft-float,+sse2 it becomes effectively the same as +soft-float, because hardware float is not used for the calculations either: https://llvm.godbolt.org/z/d3xon3oao

IMHO the way this should be working is that +soft-float is about the generated instructions, while float-abi/target-abi should control the ABI. The problem is that we don't support explicit float ABI specification for most targets. (Things like +soft-float,+neon are a contradiction in terms under this model.)

@RalfJung
Contributor Author

RalfJung commented Dec 27, 2024

> Can you share how this is possible to do on x86? If I try something like +soft-float,+sse2 it becomes effectively the same as +soft-float, because hardware float is not used for the calculations either: https://llvm.godbolt.org/z/d3xon3oao

I can only say what the Rust targets do -- they set +soft-float.

> IMHO the way this should be working is that +soft-float is about the generated instructions, while float-abi/target-abi should control the ABI. The problem is that we don't support explicit float ABI specification for most targets. (Things like +soft-float,+neon are a contradiction in terms under this model.)

ISTM that on x86, +soft-float is meant to indicate the ABI, since otherwise there'd be no reason to even have this target feature alongside sse etc. On arm-32, there's a similar soft-float feature. Rust seems to always set TargetOptions::abi to "eabi" (not sure what that corresponds to in LLVM, I assume -target-abi?) and +soft-float for softfloat ARM targets. Are we setting -float-abi anywhere?

If +soft-float,+sse2 just ignores the SSE part, then what is even the point of the soft-float x86 target feature?

> On 32-bit arm this is controlled via -float-abi,

What is going on in https://godbolt.org/z/v4f1hsqjh ? This is a target that explicitly sets the ABI to eabi. And yet it seems like -soft-float,+fpregs makes it use a different ABI than when no target features are set? I can't read assembly so I can't tell if there's a meaningful difference here. It definitely still uses soft-floats despite those being disabled.

Is there any target where LLVM supports the combination of softfloat ABI and hardfloat instructions?

@RalfJung

This comment has been minimized.

@nikic
Contributor

nikic commented Dec 28, 2024

Weird, I don't get notifications from this thread either.

Here are examples for using soft-float ABI with hard-float instructions:
ARM: https://llvm.godbolt.org/z/Eav1EG1qe (using -float-abi)
RISCV: https://llvm.godbolt.org/z/srMEq714n (using -target-abi)

@RalfJung
Contributor Author

RalfJung commented Dec 31, 2024

> Here are examples for using soft-float ABI with hard-float instructions:

Thanks for pointing me towards -float-abi, I have now completely redone how rustc handles float ABIs on ARM targets based on that. I don't understand why the soft-float target feature exists on ARM (LLVM will also fall back to softfloats when I set -mattr=-fpregs) but 🤷

So the summary is that for ARM-32 and for RISCV, LLVM provides reasonably clean and explicit ways to control the ABI. RISCV even warns when the ABI cannot be implemented; sadly the ARM backend does not do that but that seems like a separate issue.

However, for aarch64, the equivalent functionality is missing. LLVM does have a de-facto soft-float ABI that it will use on those targets when -neon,-fp-armv8 is set, but there is no way to properly use that ABI in a codebase where some parts are compiled with use of the FPU and some parts without. That's what this issue is about; this is functionality that we'd like to provide for our aarch64 softfloat target in Rust.

Alternatively if "returning floats without having the requisite target features" is not supported by LLVM on aarch64, I would expect at least a suitable warning indicating as much. However unfortunately Rust now already claims to support this combination (after all, LLVM seemed to support this) and this could be non-trivial for us to change/roll back...

(On X86, there is the option of setting +soft-float, which... I guess at least lets me use SSE intrinsics without affecting the float ABI? I am not entirely sure.)

@RalfJung RalfJung changed the title from "aarch64: there is no way to use the FPU on a softfloat target without creating an ABI mess" to "aarch64: there is no way to use the FPU/SIMD on a softfloat target without creating an ABI mess" Feb 4, 2025