Our x86-32 target names are inconsistent #136495

Open
5 tasks done
RalfJung opened this issue Feb 3, 2025 · 51 comments
Labels
A-targets Area: Concerning the implications of different compiler targets C-discussion Category: Discussion or questions that doesn't represent real issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@RalfJung
Member

RalfJung commented Feb 3, 2025

The typical naming scheme we use for x86-32 targets is:

  • i686 means Pentium 4 (yes, that makes no sense, but it's too disruptive to change now and i786 didn't catch on as a name anywhere), which in particular has SSE2
  • i586 means "original Pentium" (no SSE)
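The convention above can be written down as a small lookup. This is only an illustrative sketch: `Baseline` and `rust_baseline` are hypothetical names, not anything that exists in rustc, and the function encodes only the two prefixes described in this issue.

```rust
/// CPU baseline implied by the architecture prefix of a Rust x86-32 target
/// name, following the convention described in this issue (an assumption of
/// this sketch, not an API of rustc).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Baseline {
    /// "original Pentium": no SSE.
    OriginalPentium,
    /// Pentium 4: SSE2 is available.
    Pentium4,
}

impl Baseline {
    /// Whether code for this baseline may assume SSE2.
    fn has_sse2(self) -> bool {
        matches!(self, Baseline::Pentium4)
    }
}

/// Looks only at the text before the first '-' of the target name.
/// Prefixes outside the convention (e.g. "i386") yield `None`.
fn rust_baseline(target: &str) -> Option<Baseline> {
    match target.split('-').next()? {
        "i586" => Some(Baseline::OriginalPentium),
        "i686" => Some(Baseline::Pentium4),
        _ => None,
    }
}

fn main() {
    for t in ["i686-unknown-linux-gnu", "i586-unknown-linux-gnu", "i386-apple-ios"] {
        let baseline = rust_baseline(t);
        let sse2 = baseline.map(Baseline::has_sse2);
        println!("{t}: baseline = {baseline:?}, sse2 = {sse2:?}");
    }
}
```

A target "violates" the convention when its actual baseline CPU differs from what this lookup would predict from the name alone.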

We have some targets that violate this:

If we want to establish the pattern that "i686 has SSE and the rest does not", then the last four of these should be renamed. (These are all tier 3 targets.) I wonder if there is a specific reason that these names were picked diverging from our usual naming scheme, or is it just an oversight because our naming scheme is admittedly not very self-explaining?

  • The Apple target is ancient and I assume was picked for consistency with how Apple calls this -- not sure if that should overwrite our own naming scheme.
  • For Hurd and Redox, we have no other targets that use PentiumPro without SSE as baseline, so it's a bit unclear what one would even use -- they are somewhere between i586 (original Pentium) and what we call i686 (Pentium 4). The most consistent outcome here would be to use Pentium 4 as the baseline like we use for all other OSes; not sure why Hurd and Redox should be special.
  • The i586-pc-nto-qnx700 one however should almost certainly be called i686.

Pinging the listed target maintainers and some other folks:
Cc @bjorn3 @workingjubilee @badboy @deg4uss3r @madsmtm @sthibaul @jackpot51 @flba-eb @gh-tr @jonathanpallant @japaric

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Feb 3, 2025
@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-discussion Category: Discussion or questions that doesn't represent real issues. A-targets Area: Concerning the implications of different compiler targets and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Feb 3, 2025
@jonathanpallant
Contributor

For QNX Neutrino 7.0.0, that is the name that Blackberry QNX use for their C toolchain:

~/qnx700/host/darwin/x86_64/usr/bin/i586-pc-nto-qnx7.0.0-ld

I think the primary question you all need to ask yourselves is:

  • Is it better for targets to be internally consistent (i.e. within rustc), or is it better for each target to be consistent with other toolchains for that target?

I've already had this argument about arm64e-apple-darwin (which in my view should have been aarch64-apple-darwin-something), and I lost. But maybe you'll have another ticket for the Arm 64-bit targets.

@RalfJung
Member Author

RalfJung commented Feb 3, 2025

I was wondering about the arm64e prefix... but I didn't plan to check all our targets, now. I am too scared of what I may find. ;)

@bjorn3
Member

bjorn3 commented Feb 3, 2025

arm64ec-pc-windows-msvc is similar in weird naming, though its behavior is much less cursed than in C, where it literally makes target detection using macros think it is running on x86, with x86 vendor intrinsics emulated by the compiler.

@sthibaul
Contributor

sthibaul commented Feb 3, 2025

For the last 2, we have no other targets that use PentiumPro without SSE as baseline

I don't remember the details, but basically we targeted i686 as defined in Debian:

https://wiki.debian.org/ArchitectureSpecificsMemo#i386-1

(and also gcc)

The most consistent outcome here would be to use Pentium 4 as the baseline like we use for all other OSes

? That would assume SSE2, which is wrong? #82435 (only pentium-M has SSE2)

Adding MMX and SSE should however be fine enough, even if the very original pentiumpro didn't have them.

@workingjubilee
Member

Debian also advises that you not use a CPU less than Pentium 4.

@sthibaul
Contributor

sthibaul commented Feb 3, 2025

Debian also advises that you not use a CPU less than Pentium 4.

This is just a recommendation.

Debian can actually be used on way less-powered i686 systems if you stick to a minimal system. But you cannot expect to be able to do what you used to do 20 years ago with the Debian distribution of the time: code has gotten much larger.

@workingjubilee
Member

For note: Currently, the Debian-distributed rustc patches our i686-unknown-linux-gnu to use Pentium Pros instead of Pentium 4s. This results in them getting bugs and test failures in Rust code because the code generation doesn't use SSE2 registers, thus is unsound due to various bugs in LLVM (and while I am told GCC is better, I'm not entirely sure). They then report this to crate maintainers. The crate maintainers, having deliberately developed against i686-unknown-linux-gnu as per rustc's definition, get very confused. They then get annoyed, when they find out this is entirely caused by Debian's insistence on divergence from rustc's definitions.

Unfortunately using real CPU names in targets like perhaps pentium4-unknown-linux-gnu didn't catch on, and neither did tuples like i786-unknown-linux-gnu (even if they are technically recognized by Autotools).

@sthibaul
Contributor

sthibaul commented Feb 3, 2025

this is entirely caused by Debian's insistence on divergence from rustc's definitions

You mean caused by rust's insistence on assuming that i686 has SSE2.

@workingjubilee
Member

Debian calls this architecture i386, but Debian will not run on an i386 CPU.

If you prefer, I could say, "Debian's insistence on making code unsound".

@sthibaul
Contributor

sthibaul commented Feb 3, 2025

Debian calls this architecture i386

Yes, because changing the name would have meant people having to reinstall their system. The meaning of the "i386" port has changed with the reasonable hardware that you can really run indeed.

If you prefer, I could say, "Debian's insistence on making code unsound".

Where "unsound" is wholly defined by rust, which wrongly insists on i686 having SSE2.

Anyway, let's not just rehash #82435

@RalfJung
Member Author

RalfJung commented Feb 3, 2025

Yes, Rust's name "i686" is inconsistent with how other parts of the ecosystem use that term. That's not great, but indeed that's #82435. But i686 consistently means "Pentium 4" for almost all Rust targets, so unless there are good reasons IMO Hurd and Redox should follow suit. Debian's use of the term is not as relevant here as precedent within Rust itself.

@sthibaul
Contributor

sthibaul commented Feb 3, 2025

unless there are good reasons IMO Hurd and Redox should follow suit.

I exposed what I believe are good reasons. Then it's up to rust maintainers to consistently remain inconsistent.

@madsmtm
Contributor

madsmtm commented Feb 3, 2025

i386-apple-ios
The Apple target is ancient and I assume was picked for consistency with how Apple calls this -- not sure if that should overwrite our own naming scheme.

I wasn't there when it was introduced, but I'd guess that it was for consistency with Clang, yeah. The target is mostly legacy nowadays though, and might be gone the next time we bump supported versions, so I don't think there is that much value in renaming it.

(I'd rather rename aarch64-apple-darwin to aarch64-apple-macos, or even better aarch64-macos, but that's much more difficult).

@RalfJung
Member Author

RalfJung commented Feb 3, 2025

I exposed what I believe are good reasons.

So the reason is "because Debian defines i686 that way"? It wasn't clear to me whether that was meant to be informational, or justification for how Rust defines its built-in targets. It also doesn't give a reason for why Hurd/Redox should be different than Linux/FreeBSD/NetBSD/OpenBSD.

@sthibaul
Contributor

sthibaul commented Feb 3, 2025

So the reason is "because Debian defines i686 that way"?

No. "because i686 normally doesn't have SSE2". I mentioned Debian's target as an explanation why it was done so originally. And then brought arguments why one could want to remain so.

why Hurd/Redox should be different than Linux/FreeBSD/NetBSD/OpenBSD.

Because it's technically wrong to say that i686 has SSE2.

But again, I'm not a maincore rust maintainer. I have no will to fight over such a thing. I'm asked my opinion for the reason, I give it, that's all.

@workingjubilee
Member

Where "unsound" is wholly defined by rust, which wrongly insists on i686 having SSE2.

No.

@sthibaul: Rust defines f32 and f64 to be IEEE754 binary32 and binary64 floating point numbers, and with SSE2 registers, this is correctly implemented. But LLVM does not compile code using the x87 float registers correctly. LLVM does not even compile such code in a way consistent with any sort of use of an "extended precision" scheme that might be justifiable in C, since it introduces various classic errors like double rounding or spontaneous truncation that corrupts the floating point value, which should simply not occur. We have gone to great lengths to try to eliminate the usage of any of the floating point registers for 32-bit x86 targets which do not have SSE2, at least when Rust calls Rust.

In other words #82435 is not a random bug, it is intimately related to #114479 existing. Fixes for this situation are underway but implementing them will not help if Debian maintainers pretend their choices are merely about nomenclature and patch them out, just like they patched the target's target_cpu string. If you want to insist on a certain naming scheme, that's one thing, but it needs to be done with full awareness that what you are tampering with can affect program correctness negatively.
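The rounding problem described above can be illustrated with a single probe. This is a sketch only, not a test used by rustc or LLVM, and `rounds_like_binary64` is a hypothetical name. In binary64, `1.0 + 2^-53` is an exact tie and rounds back to `1.0`; if the intermediate sum were instead held in a wider x87 register, the subtraction could yield a nonzero result. Whether a particular non-SSE2 build actually shows this depends on the code the compiler happens to generate.

```rust
use std::hint::black_box;

/// Returns true if `1.0 + 2^-53` was rounded to binary64 before being used,
/// which is what IEEE 754 binary64 arithmetic requires.
fn rounds_like_binary64() -> bool {
    // black_box discourages the compiler from folding the arithmetic
    // at compile time, so the operations are performed at run time.
    let one = black_box(1.0_f64);
    let half_ulp = black_box(f64::EPSILON / 2.0); // 2^-53

    // Exact tie between 1.0 and the next f64; ties-to-even gives 1.0.
    let sum = one + half_ulp;

    // With correct binary64 rounding this difference is exactly zero.
    sum - one == 0.0
}

fn main() {
    println!("binary64 rounding observed: {}", rounds_like_binary64());
}
```

On targets whose baseline includes SSE2, the addition is expected to be performed in a 64-bit register and the probe reports `true`.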

@sthibaul
Contributor

sthibaul commented Feb 3, 2025

Where "unsound" is wholly defined by rust, which wrongly insists on i686 having SSE2.

No.

Well, "yes". Various i686 processors do not have SSE2, and that can pose a real problem: your program just crashes.

I completely understand that not having SSE2 can pose floating point issues, and that it becomes reasonable for rust to just require SSE2. Simply, technically it's not i686 any more.

Again, I'm just asked my opinion on this, so I give it. I can understand that there are more important problems that assuming SSE2 solves, and using i786 would pose other issues, and I'm fine with bumping to SSE2 due to this. But don't make me eat that i686 always has SSE2. It doesn't.

Debian maintainers pretend their choices are merely about nomenclature

It's not about nomenclature. It's about not seeing gnome-shell starting to crash inside librsvg just after a mere software upgrade.

@jackpot51
Contributor

jackpot51 commented Feb 3, 2025

As the maintainer of Redox and an owner of a Pentium II - I must stress that i686 identifies the P6 microarchitecture, used by the Pentium Pro, Pentium II, and Pentium III, which did not come with SSE of any kind until the Pentium III. This is the current baseline of the Redox operating system, and I'd prefer it remains that way.

@beetrees
Contributor

beetrees commented Feb 4, 2025

I think the most important thing is for Rust to be consistent with itself: if other tools/distributions have different ideas of what i[34567]86 means, then they should be able to have a single mapping from their meanings to Rust's meanings, rather than having to depend on different mappings per target. Given that all tier 1/2 targets use the definitions of i586/i686 as described in the OP (and it was decided not to change this in #82435), I think the inconsistent tier 3 targets should be renamed to be consistent. Given the longstanding miscompilations with non-SSE2 floating point numbers, I think it's also useful to be able to say "only i586-* targets have the unsound miscompilations tracked in #114479".

(As a side note, does anyone know of any reason why Debian doesn't use the i586-unknown-linux-gnu target instead of patching the compiler?)

@RalfJung
Member Author

RalfJung commented Feb 4, 2025

(As a side note, does anyone know of any reason why Debian doesn't use the i586-unknown-linux-gnu target instead of patching the compiler?)

That lowers the baseline down to the original Pentium, which I think is lowering it further than they want to.

I must stress that i686 identifies the P6 microarchitecture, used by the Pentium Pro, Pentium II, and Pentium III, which did not come with SSE of any kind until the Pentium III. This is the current baseline of the Redox operating system, and I'd prefer it remains that way.

Understood. However, Redox doesn't get to define the baseline supported by the Rust compiler. Everything older than the Pentium 4 is considered unsupported (tier 3 at most) by the Rust project, simply because the x87 FPU is so poorly behaved that it'd be a lot of work to make LLVM work correctly there, and nobody seems to be willing to put in that work.

Now, Redox already is a tier 3 target, so the only immediate consequence of this is that the target has no way of becoming a tier 2 target in its current form. However, it's also not great when tier 3 targets cause confusion by using terminology in a different way than what we established for our tier 1 & 2 targets. If I had a time machine I'd propose renaming the i686 targets to i786, though I imagine that would also cause enough problems to not be an obvious win, so for better or worse we use i686 to refer to Pentium 4. (This kind of mismatch is not unique to Rust; i386 is used by Debian and Apple to refer to things way newer than the original i386 CPU, and apparently QNX uses i586 to refer to a Pentium 4. Clang ignores the N in "iN86" entirely and always assumes a Pentium 4. So it's just not correct to claim that there is a coherent idea across the ecosystem of what i686 would mean, or what "at least Pentium 4" should be called.)

So IMO either the redox target should be renamed, or it should use the Pentium 4 baseline. (And same for the hurd target.) You seem to prefer a rename; the main challenge here is finding a good name.

@RalfJung RalfJung added the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Feb 4, 2025
@RalfJung
Member Author

RalfJung commented Feb 4, 2025

Regarding the Android target, @maurer do you know if there is a reason that it uses pentiumpro rather than pentium4 as the baseline CPU? The features it then enables go well beyond what even the Pentium 4 had.

@workingjubilee
Member

To restate my point another way: When anyone involved in this mess insists on a particular definition for a term already in use elsewhere, citing an imagined consensus that the definition is correct according to some convention, they expose themselves to others insisting on a different definition. It doesn't matter which is right or wrong, because when it then comes to a third party, that party cannot conform to two definitions at once and still make sense. And at the risk of sounding overly cynical, as long as there are two people who claim they get to define the term... which includes "every Linux distro"... then we will be wrong according to one.

Some of the decisions we have made have been an attempt to be consistent with external toolchains. However, I don't believe it's very useful to try to be consistent with these external toolchains when the inevitability is that eventually we become inconsistent with them anyways. Any scheme we might imagine to be "consistent" can be patched into inconsistency by others, because it makes a program "work", even if the resulting program is not actually sound. We cannot rely on them to tell us, either, when they patch rustc.

In other words, I agree that we must consider ourselves the final authority on what a target tuple's name means. If we happen to find it useful to match the meanings of someone else, then we should only do so as long as it preserves our own self-consistency. Due to various patches and inconsistencies "in the wild", we cannot really save people from the need to map tuples in various ways.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 8, 2025
i686-unknown-hurd-gnu: bump baseline CPU to Pentium 4

See rust-lang#136495 for context. ``@sthibaul`` (the only listed target maintainer) said they would be [fine](rust-lang#136495 (comment)) with this change.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 8, 2025
Rollup merge of rust-lang#136700 - RalfJung:hurd, r=Noratrieb

i686-unknown-hurd-gnu: bump baseline CPU to Pentium 4

See rust-lang#136495 for context. ``@sthibaul`` (the only listed target maintainer) said they would be [fine](rust-lang#136495 (comment)) with this change.
@maurer
Contributor

maurer commented Feb 10, 2025

Regarding the Android target, @maurer do you know if there is a reason that it uses pentiumpro rather than pentium4 as the baseline CPU? The features it then enables go well beyond what even the Pentium 4 had.

tl;dr: Using pentium4 + sse3 should be safe for Android.

Sorry for the late reply. The line here for Android is that it's barebones x86, plus "MMX, SSE, SSE2, SSE3, and SSSE3". It's phrased this way rather than to a specific CPU baseline because we're trying to describe what is guaranteed to be available to an app when it runs, not a specific CPU, and "pentium 4" is not an instruction set. Pentium 4s only ever had mmx/sse/sse2/sse3, so any interpretation of this will only select instructions available on all Android devices.

The platform's current minimum architecture is prescott, so the platform does not consider CPUs older than the Pentium 4. The last time it's possible anything earlier than prescott was considered was older than 2015 - I can do more digging if you want to check for sure, but I'm pretty sure that prescott and atom are the minimum we've ever built for.
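To see which of these extensions a given build actually assumes, one option is to query `cfg!(target_feature = "...")`, which reflects the features enabled at compile time for the current target and flags. A minimal sketch follows; the function name `baseline_simd_features` is made up here, and MMX is omitted because it may not be exposed as a `target_feature` value.

```rust
/// Lists which SSE-family features from the Android x86 ABI description
/// are enabled at compile time for the target this code is built for.
fn baseline_simd_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    // Each cfg! is evaluated by the compiler, not detected at run time.
    if cfg!(target_feature = "sse") {
        enabled.push("sse");
    }
    if cfg!(target_feature = "sse2") {
        enabled.push("sse2");
    }
    if cfg!(target_feature = "sse3") {
        enabled.push("sse3");
    }
    if cfg!(target_feature = "ssse3") {
        enabled.push("ssse3");
    }
    enabled
}

fn main() {
    println!("compile-time SIMD baseline: {:?}", baseline_simd_features());
}
```

Building the same program for different targets, or with `-C target-feature=+ssse3`, changes the reported list, which makes it a quick way to compare what two target definitions assume.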

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 11, 2025
…=jieyouxu

i686-linux-android: increase CPU baseline to Pentium 4 (without an actual change

As per `@maurer's` [comment](rust-lang#136495 (comment)), this shouldn't actually change anything since we anyway add a bunch of extensions that bump things up way beyond Pentium 4. But Pentium 4 is consistent with the other i686 targets and I don't know enough about the exact sequence of CPU generations to be confident with more than this. ;)
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 12, 2025
Rollup merge of rust-lang#136885 - RalfJung:linux-android-base-cpu, r=jieyouxu

i686-linux-android: increase CPU baseline to Pentium 4 (without an actual change

As per ``@maurer's`` [comment](rust-lang#136495 (comment)), this shouldn't actually change anything since we anyway add a bunch of extensions that bump things up way beyond Pentium 4. But Pentium 4 is consistent with the other i686 targets and I don't know enough about the exact sequence of CPU generations to be confident with more than this. ;)
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Feb 12, 2025
Replace i686-unknown-redox target with i586-unknown-redox

This change is related to rust-lang#136495
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 12, 2025
Rollup merge of rust-lang#136698 - jackpot51:i586-redox, r=RalfJung

Replace i686-unknown-redox target with i586-unknown-redox

This change is related to rust-lang#136495
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this issue Feb 13, 2025
Replace i686-unknown-redox target with i586-unknown-redox

This change is related to rust-lang/rust#136495
@apiraino
Contributor

T-compiler had a look at this discussion thread last week (on Zulip). TL;DR is that we're fine proceeding with improving naming consistency, where it makes it more aligned with the vendor idea of that target naming.

@rustbot label -I-compiler-nominated

@rustbot rustbot removed the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Feb 13, 2025
@RalfJung
Member Author

@apiraino that doesn't really answer the main remaining question, which is what to do with targets where the vendor naming directly conflicts with our own naming conventions.

@apiraino
Contributor

apiraino commented Feb 13, 2025

sorry Ralf. Here's perhaps a better excerpt:

[...] when the vendor has made a clear decision regarding target naming, it's more important to be consistent with that (and thus the wider software ecosystem).

About i686:

I'm not sure that applies in the i686 case since so many different vendors have very different ideas about what that means. (It might be interesting to see what Intel's compiler does when asked to produce i686 code)

@RalfJung
Member Author

RalfJung commented Feb 13, 2025 via email

@RalfJung RalfJung added the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Feb 13, 2025
@RalfJung
Member Author

I re-added the nomination, to answer the very concrete question of what to do with i586-pc-nto-qnx700, a target with a Pentium 4 baseline. We typically use i586 to indicate "older than i686" (with i686 meaning "Pentium 4"), so there is a direct conflict here between the vendor convention and our own convention, which will easily be confusing for Rust users.

@apiraino
Contributor

For QNX Neutrino 7.0.0, that is the name that Blackberry QNX use for their C toolchain:

~/qnx700/host/darwin/x86_64/usr/bin/i586-pc-nto-qnx7.0.0-ld

I think the primary question you all need to ask yourselves is:

* Is it better for targets to be internally consistent (i.e. within `rustc`), or is it better for each target to be consistent with other toolchains for that target?

@jonathanpallant it seems that @flba-eb suggests that i586-pc-nto-qnx700 could be a "good enough" candidate. Am I reading correctly that you don't have a different opinion or alternate suggestions?

@RalfJung
Member Author

To be clear, the new name would be i686-pc-nto-qnx700; that would be consistent with our other i*86 targets.

workingjubilee added a commit to workingjubilee/rustc that referenced this issue Feb 20, 2025
…6, r=workingjubilee

Make x86 QNX target name consistent with other Rust targets

Rename target to be consistent with other Rust targets: Use `i686` instead of `i586`
See also
- rust-lang#136495
- rust-lang#109173

CC: `@jonathanpallant` `@japaric` `@gh-tr` `@samkearney`
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 21, 2025
Rollup merge of rust-lang#137324 - flba-eb:rename_qnx_target_name_i586, r=workingjubilee

Make x86 QNX target name consistent with other Rust targets

Rename target to be consistent with other Rust targets: Use `i686` instead of `i586`
See also
- rust-lang#136495
- rust-lang#109173

CC: `@jonathanpallant` `@japaric` `@gh-tr` `@samkearney`
@jonathanpallant
Contributor

It was merged before I had a chance to comment (unlike the PR that added it, which I recall took months to approve and merge). But yes, I have always said Rust should prioritise internal consistency over external; this change does that and I support it.

@RalfJung
Member Author

Ah, I hadn't even seen #137324, nice. :)

That makes i386-apple-ios the only remaining culprit. The vibes given by @madsmtm were that this target will anyway be removed. Is there a timeline for that?

@madsmtm
Contributor

madsmtm commented Feb 24, 2025

That makes i386-apple-ios the only remaining culprit. The vibes given by @madsmtm were that this target will anyway be removed. Is there a timeline for that?

Not really a timeline for it, no. If it's important to you, I'd be fine with renaming the target in the meantime (if so, it should be renamed to i686-apple-ios-sim).

@RalfJung
Member Author

Not very important, no.

@apiraino
Contributor

apiraino commented Feb 27, 2025

Last question about target names should be addressed in comment (iiuc)

Comment also provides a rename suggestion for i386-apple-ios.

@rustbot label -I-compiler-nominated

@rustbot rustbot removed the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Feb 27, 2025