Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

stdlib: add modulemap for Musl, add Musl platform module #62248

Merged
merged 3 commits into from
Jul 6, 2023

Conversation

MaxDesiatov
Copy link
Contributor

This change is necessary to allow building Musl platform module when targeting Alpine Linux.

@MaxDesiatov MaxDesiatov added Linux Platform: Linux standard library Area: Standard library umbrella static stdlib platform support labels Nov 25, 2022
@finagolfin
Copy link
Member

@egorzhdan, weren't you planning on removing these libc headers altogether now that you're using VFS to find them? Could we try that for the Musl libc and see if it works with this pull, or will it require more work?

@stevapple
Copy link
Contributor

Due to the long missing standardization of system module, Swift libraries now depend on Glibc heavily on almost every non-Darwin POSIX target. Using the name Musl will add extra burden and confusion to library authors.

We’re now hiding all the POSIX system libraries behind Glibc — and many of them are not Glibc at all. I don’t mind making Musl one of them before we come to the real solution by introducing CStdlib and re-introducing a variety of real system modules to provide extended functionalities.

@MaxDesiatov
Copy link
Contributor Author

MaxDesiatov commented Nov 29, 2022

Using the name Musl will add extra burden and confusion to library authors.

I'm not convinced that's the case. Musl is substantially different from Glibc, including its behavior. It's totally fine for Musl declarations to become available on import CStdlib, but for import Glibc to import something other than Glibc causes much more confusion for sure.

By making this a distinct module we're signaling to users that they need to audit their platform-specific code for any reliance on Glibc-specific behavior. Users not understanding or reporting bugs related to why their code behaves differently when not linked to Glibc, while they're led to believe they're somehow using Glibc, is something I'd like to avoid.

The fact that some other platforms came to a place where they unexpectedly hide their differences in behavior under a single Glibc module is unfortunate. This does not mean we should exacerbate the situation. What you mention as a burden should be resolved with something like import CStdlib. In the meantime, library and application authors supporting both Darwin and Glibc already conditionally import appropriate modules and tweak their code for platform-specific behavior. Doing the same for Musl is consistent with that approach.

@etcwilde
Copy link
Contributor

I 100% agree with the statement that @MaxDesiatov is saying. We shouldn't name Musl "Glibc", it's not. We have Swift-System, which is supposed to be a wrapper over the system libraries. It might be a good idea to point the corelibs at that instead of the raw LibC implementations.

@stevapple
Copy link
Contributor

In most cases developers are just using C standard APIs by import Glibc. If we really want to force the correct handling, the right thing to do is to first give a solution for the noise made by importing C standard headers on different platforms. Every system module we add will make 2 additional lines of code in such boilerplate — which is already ~7 lines now.

Another reason to postpone the introduction of Musl module is that we need to carefully design and structure system modules in a modern way, and this is possibly source breaking, or at least has great impact on call site. We could let Musl skip the premature stage, and directly jump into the modern era with other system/libcs.

@MaxDesiatov
Copy link
Contributor Author

MaxDesiatov commented Nov 29, 2022

Every system module we add will make 2 additional lines of code in such boilerplate — which is already ~7 lines now.

This is a very small price to pay for correctness of code and its behavior.

this is possibly source breaking

Could you elaborate on this? There's no Swift code targeting Musl in existence yet, an addition of a module will not have an impact on anyone not using it. This is purely an additive change.

Another reason to postpone the introduction of Musl module is that we need to carefully design and structure system modules in a modern way

That's out of scope for this module. I also don't see how its addition could somehow conflict with what you're describing, which is already going on in the Swift System repository, if I understand correctly? Those who need Musl ASAP can import it freely and adapt their code to its behavior. Users who don't need Musl can just ignore its existence altogether, in the same way they did before, they won't be impacted in any way.

@stevapple
Copy link
Contributor

stevapple commented Nov 29, 2022

This is a very small price to pay for correctness of code and its behavior.

I don’t think that’s a small price because we need 2 additional lines in nearly every file where C standard library is involved. And we’re not doing anything more correct by specializing Musl in addition to Glibc, because many libraries are not relying explicitly on either of them — they just need a working libc.

When to import Musl will also become a problem because nowadays many libraries just import Glibc on #if os(Linux). Adapting for Musl Linux could be a big problem and may require updating #if checks.

There's no Swift code targeting Musl in existence yet, an addition of a module will not have an impact on anyone not using it. This is purely an additive change.

There are motivations for a bunch of libraries to update and support Musl. There’re a variety of "Linux-compatible" Swift packages and by introducing Musl as system library they just suddenly fail to support all Linux flavors. Musl is very different from Glibc, but making Glibc the name an alias won’t make anything different to anyone not using it. Instead, naming it differently will certainly stop some of existing codes to gain effortless support for Musl Linux.

Those who need Musl ASAP can import it freely and adapt their code to its behavior. Users who don't need Musl can just ignore its existence altogether, in the same way they did before, they won't be impacted in any way.

No they won’t. An end user will have to wait for all dependent packages to update for Musl, or they’ll need to maintain their own forks. This is super unfortunate too, and could take even longer than CStdlib is worked out.

@stevapple
Copy link
Contributor

stevapple commented Nov 29, 2022

That's out of scope for this module. I also don't see how its addition could somehow conflict with what you're describing, which is already going on in the Swift System repository, if I understand correctly?

Swift System is only abstracting some popular, key functionalities from each system library, and this is not enough in some specific cases where the user explicitly depends on a libc implementation. That is, ideally, the only place we need to import Musl, and that’s where different libcs have real differences. The system module can be nicely structured — Darwin already sets an example, and eventually we could have such modulation on every platform for more fine-grained import surface control.

@MaxDesiatov
Copy link
Contributor Author

I don't think there's anything here that prevents future modularization. If anything changes on that front, the Musl module would in turn re-export those more granular modules. But that's a concern separate from the actual naming of the "root" Musl module.

If you have a concrete proposal for how the Musl module itself could be structured, please add inline comments to the respective lines of the diff.

@finagolfin
Copy link
Member

While @stevapple makes some valid points, I think splitting off Musl like this is better, as we should ideally do so for all the system C libraries. I would like to do so for Bionic one day- that would require proper deprecations and hopefully an easy future-proof solution like import CStdlib at the same time for current Android users- but since nobody is using Musl with Swift today, I think it is better to make this additive change, even though it currently doesn't allow reusing unmodified Swift packages as @stevapple points out.

All those packages will be modified one day anyway, nothing wrong with the few beginning Musl users starting to do so after this pull.

@stevapple
Copy link
Contributor

stevapple commented Nov 30, 2022

Can we introduce some (though ugly and hardcoded) compiler logic to downgrade “using import Glibc when Musl is available” to a warning?


I would prefer to either rush CStdlib into Swift 5.8 or delay Musl to Swift 5.9, so most codebases can directly update to the ideal solution.

@finagolfin
Copy link
Member

I would prefer to either rush CStdlib into Swift 5.8 or delay Musl to Swift 5.9, so most codebases can directly update to the ideal solution.

AFAIK, nobody is working on CStdlib atm, so I'd rather not delay this for a cross-platform solution that may not be here for awhile. Ideally, somebody using Swift on many platforms, say @mstokercricut or @ephemer, would pick up or sponsor the initiative for CStdlib but I don't see it happening yet.

//===----------------------------------------------------------------------===//

module SwiftMusl [system] {
header "SwiftMusl.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to inject this modulemap file into the Musl directory using LLVM VFS.
We already use this mechanism to inject glibc.modulemap into the Glibc installation (swift::getClangInvocationFileMapping). That way the Musl headers could be referenced directly from the modulemap instead of using an additional SwiftMusl.h header, preventing another module from hijacking the system headers: with the current approach another C module that includes the same system header might take ownership over the header, which would cause compile errors in some cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any plans to remove the SwiftGlibc.h header soon?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused here, is the approach suggested in https://forums.swift.org/t/problems-with-swiftglibc-and-proposed-fix/37594 no longer applicable? Wasn't a single header referenced from the modulemap, importing the rest of the headers the preferred solution?

Copy link
Contributor

@egorzhdan egorzhdan Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That approach has a flaw that might cause a compile error related to the modularization in the following scenario:

Consider this modulemap:

module B {
  header "b.h"
}
module C {
  header "c.h"
}

and three headers:

// a.h
class Decl {}
// b.h
#include "a.h"
// c.h
#include "a.h"

Notice that a.h is not referenced from the modulemap. As the forum post mentions, this is not an issue for Clang:

If the platform does not provide a module map for a given header, it will be included textually, both into the SwiftGlibc module and into any imported C / Objective-C / C++ modules that include the header. All of these modules will therefore contain the definitions from this header.
This is fine under Clang’s rules for modules; Clang has logic to deduplicate definitions caused by textual inclusion of the same header into multiple modules.

However, an important part that the forum post misses is that while Clang can handle this, Swift cannot. What Swift does in this case is it assumes that the first header to include a.h owns the header, in a sense that a.h is assigned to the same module as either b.h or c.h, but we can't really say which one is it.

This situation might occur in the system header scenario: assume we're compiling a Swift program that imports two C modules: SwiftGlibc/SwiftMusl and some other user-provided module. If the latter also contains an #include <some_stdlib_header.h>, Swift might treat some_stdlib_header.h as a part of the user-provided module. This will cause a compile error for a perfectly valid code:

import SwiftMusl
someFunctionFromTheStdlib()
// error: user-provided module not imported

The injection approach avoids both of the issues described in the forum post: symbols such as pid_t are available, and there is no cyclic dependency between libc++ and SwiftGlibc. I can explain this in a bit more detail if you'd like.

At the moment libstdc++ modulemap relies on the new injection approach, while SwiftGlibc still has a SwiftGlibc.h header with a list of #include<>-s. The idea is to eventually get rid of SwiftGlibc.h and refer to all of the glibc headers from the modulemap directly.

Any plans to remove the SwiftGlibc.h header soon?

@buttaface no specific timeline as of now, but contributions are welcome 🙂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, I don't know if it's strictly necessary to use the injection approach for SwiftMusl right now, I'm just pointing out an issue that might arise in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaxDesiatov, I brought this up so that you could try out this new approach with Musl too, if interested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to inject this modulemap file into the Musl directory using LLVM VFS.
We already use this mechanism to inject glibc.modulemap into the Glibc installation (swift::getClangInvocationFileMapping).

Slightly off-topic here, but does this mechanism work on Windows? Looks like it could solve a big pain on Windows installation. Nowadays Swift on Windows installer will copy the modulemaps to UCRT/WinSDK/MSVC accordingly. This is somehow fragile because we need to repeat this process every time after the environment is updated, and since the detection is hard-coded we cannot support any version of Visual Studio at the time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(On the Windows front, I believe this is actually what the Windows build does now.)

@bscothern-Cricut
Copy link

bscothern-Cricut commented Nov 30, 2022

@buttaface So I just finished talking with @mstokercricut a little about this. We don't know that we have the expertise with the compiler to do the CStdLib work but are interested. So I am hoping to look into it a little bit in a couple of weeks, I have to wrap up some other stuff first. I'll start spending some time looking on the forums and here on GitHub to get up to speed on previous discussions on the idea before looking at actually implementing it.

@finagolfin
Copy link
Member

@bscothern-Cricut, I was just looking into this and I don't think it will require any compiler expertise to add CStdlib, as it's just a Swift module that picks the right libc for you. It's more a matter of testing it out on all the platforms, which highly cross-platform devs like you, @ephemer, or @tristanlabelle are better situated to do, as I only use Swift on Android, then getting it approved in Swift Evolution to ship with all toolchains.

I'm looking into splitting off Bionic as its own module, which doesn't appear to require much either and given the current Musl work, shouldn't require going through Evolution either. But ideally we would switch all Swift packages to import CStdlib once and for all first, rather than adding another #elseif canImport(Bionic) everywhere.

@MaxDesiatov
Copy link
Contributor Author

we would switch all Swift packages to import CStdlib once and for all first, rather than adding another #elseif canImport(Bionic) everywhere

If adding support for a new libc amounted to adding only #if canImport(YourLibc) in a few places, such CStdlib or Libc module re-exporting a corresponding libc on each platform would've been created long ago. But it's much more complicated than that, these platform-specific libcs differ substantially from each other. They provide different APIs and different types, writing code against them needs additional #if compile-time checks to adjust client code for each specific API. I've created close to a dozen patches across different repositories to support Musl and significant proportion of them required these libc-specific adjustments.

Creating a single CStdlib module simply re-exporting existing libc modules would do more harm than good and wouldn't be much different from the existing Glibc mess we're in right now. It would create a false illusion that one can use the same API from the same module on all platforms, which is very far from the actual reality.

@finagolfin
Copy link
Member

writing code against them needs additional #if compile-time checks to adjust client code for each specific API. I've created close to a dozen patches across different repositories to support Musl and significant proportion of them required these libc-specific adjustments.

That's irrelevant: the only benefit of CStdlib in that pitch is to replace the boilerplate import block for multiple libcs, that we see in all these Swift packages today, with a single import CStdlib. Nobody is suggesting that the additional modifications for specific libc APIs that you point out would not also be needed.

Creating a single CStdlib module simply re-exporting existing libc modules would do more harm than good and wouldn't be much different from the existing Glibc mess we're in right now.

I disagree. For one, there is no way to specialize on Bionic right now, whereas adding CStdlib and Bionic and OpenBSDLibc would allow us to use the first to simplify importing, while using the specific libc names to specialize where needed.

It would create a false illusion that one can use the same API from the same module on all platforms, which is very far from the actual reality.

The benefit of CStdlib is very limited, simply replacing the 12+ line import block in that linked pitch with a single line. It would in no way guarantee availability of any libc APIs and specializations like you're adding for Musl would still be needed.

We would just need to be careful to communicate that, but since such cross-platform support is an advanced feature, I suspect that most calling such C APIs would understand that.

@ephemer
Copy link

ephemer commented Jul 3, 2023

FWIW I don't consider myself anywhere near experienced enough with libc to want to direct or shape this effort in one way or another. I'd always appreciate reduced boilerplate, but tbh I rarely use import Glibc etc anyway. My uses normally boil down to some low-level file reading/writing in order to avoid Foundation.

I appreciate the move to support Musl and Alpine @MaxDesiatov

This change is necessary to allow building Musl platform module when targeting Alpine Linux.
@MaxDesiatov MaxDesiatov force-pushed the maxd/alpine-musl-modulemap branch from 57e342e to 0f5617e Compare July 5, 2023 21:27
@MaxDesiatov MaxDesiatov marked this pull request as ready for review July 5, 2023 21:40
@MaxDesiatov MaxDesiatov requested a review from egorzhdan July 5, 2023 21:40
@MaxDesiatov MaxDesiatov requested review from etcwilde and al45tair July 5, 2023 21:40
@MaxDesiatov
Copy link
Contributor Author

@swift-ci test

@MaxDesiatov
Copy link
Contributor Author

@swift-ci build toolchain

@MaxDesiatov MaxDesiatov self-assigned this Jul 5, 2023
@MaxDesiatov
Copy link
Contributor Author

@swift-ci test macos

@al45tair
Copy link
Contributor

al45tair commented Jul 6, 2023

The benefit of CStdlib is very limited, simply replacing the 12+ line import block in that linked pitch with a single line. It would in no way guarantee availability of any libc APIs and specializations like you're adding for Musl would still be needed.

I don't think we should do a "minimal" CStdlib that "just" includes the headers for the relevant C library. IMO that would be a mistake. There are a number of areas where your code will end up festooned with #ifs anyway if you do that, because of issues like the exact type of a pthread_t, whether dirent contains d_namlen, and so on. If we're having CStdlib I think we should be a little more ambitious.

In particular, I think we should aim to provide a uniform set of functionality that can be used anywhere, with reasonable affordances where functionality is missing on some platform or other. In particular, I don't think it'd be much to ask to make the types from the C standard library and (where supported) POSIX library import into Swift in a uniform manner. They're standardised in C, but for reasons to do with the way the Swift importer works we sometimes end up with variations on them. There are also a few areas, like stdio, where typical C libraries use macros to provide quite substantial acceleration for some functions, which right now doesn't happen if you use the same function from Swift because we don't import complex macros.

(Note: there is some precedent for this kind of thing on the C side, in particular GNU's libiberty. I don't think we have the same set of problems — e.g. we aren't going to need to supply a missing implementation of memcpy() or strnlen() — but it's a similar idea.)

I have some ideas in that area and want to investigate it and write up a proper proposal for what I think we should do, but I think getting Musl support basically working first is a useful thing for us to do for other reasons. I'm aware that that, temporarily, means that various projects are going to gain yet another #if at the top.

${SWIFT_STANDARD_LIBRARY_SWIFT_FLAGS}
${swift_platform_compile_flags}
LINK_FLAGS "${SWIFT_RUNTIME_SWIFT_LINK_FLAGS}"
TARGET_SDKS ALPINE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this patch can go in like that, because ALPINE isn't an SDK. I also don't think ALPINE should be an SDK. I'll talk to you offline about this part.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that Alpine shouldn't be an SDK and suggested so a couple weeks ago, including removing Android as an SDK, but got no response.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the differences here are to do with libc, but I think it's not unreasonable for Swift to refer to that as an SDK. I just think it should probably be called MUSL rather than ALPINE, since the latter is just one distribution that happens to use musl in some configuration (I guess ANDROID should really be BIONIC on that basis, as it's probably using the same libc as Fuscia?).

But perhaps more pertinently, I'd like to be in a situation where we can build a compiler that has both the LINUX and MUSL SDKs available, where right now we tend to just configure a single SDK for the UNIX variant we're building on.

Copy link
Contributor Author

@MaxDesiatov MaxDesiatov Jul 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just think it should probably be called MUSL rather than ALPINE, since the latter is just one distribution that happens to use musl in some configuration

That's addressed in the latest commit of this PR. I'm not defining the SDK itself yet to make this PR smaller for easier review and for overall efforts to be incremental, so technically this PR is an NFC for all other platforms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to be in a situation where we can build a compiler that has both the LINUX and MUSL SDKs available, where right now we tend to just configure a single SDK for the UNIX variant we're building on.

@al45tair, my suggestion in that prior comment was that we just use the LINUX SDK for regular glibc distros, Android, and Alpine, but then provide a way to choose the libc we want to build against, similar to how you added choosing a threading library now.

I don't know what libc Fuchsia uses, but then switching the libc for other OS's should be easy, eg if one wants to use glibc on Windows someday.

@finagolfin
Copy link
Member

I don't think we should do a "minimal" CStdlib that "just" includes the headers for the relevant C library.

That's what we do now for each libc, ie it is the status quo.

If we're having CStdlib I think we should be a little more ambitious. In particular, I think we should aim to provide a uniform set of functionality that can be used anywhere, with reasonable affordances where functionality is missing on some platform or other.

That much less ambitious pitch has been sitting there for two years now and nothing has been done, so naturally I wasn't expecting something more to be done.

I have some ideas in that area and want to investigate it and write up a proper proposal for what I think we should do

Great, hope it goes through.

@MaxDesiatov MaxDesiatov requested a review from al45tair July 6, 2023 14:45
@MaxDesiatov
Copy link
Contributor Author

@swift-ci test

@MaxDesiatov
Copy link
Contributor Author

@swift-ci build toolchain

@MaxDesiatov MaxDesiatov merged commit e8b1a2b into main Jul 6, 2023
@MaxDesiatov MaxDesiatov deleted the maxd/alpine-musl-modulemap branch July 6, 2023 21:33
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Linux Platform: Linux platform support standard library Area: Standard library umbrella static stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants