coverage 6.1.1 musllinux wheel performance issues #1268
Comments
This will no longer reproduce with the above instructions because we landed pyca/cryptography#6511 as a workaround. To see the problem now you'll need to edit …
I've now tested this by installing the wheel on Alpine 3.12 (which runs musl 1.1 and is what the musllinux image uses as a base), Alpine 3.13 (the first with musl 1.2), and Alpine 3.14 (which has musl 1.2.2, same as 3.13). The performance issue only occurs on Alpine 3.14, which is... baffling (and shouldn't be considered settled yet: I've run a lot of permutations and could have made a mistake). Happy to debug this further, although I'll need to figure out a reasonable profiling strategy for timing calls to the tracing function.
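Not the profiling strategy alluded to above, just a crude way to compare environments: time a small CPU-bound workload under coverage. The pure-Python tracer makes this dramatically slower than the C tracer. A minimal sketch (the workload and iteration count are made up for illustration):

```python
# Crude timing sketch (illustrative only; not the profiling approach mentioned
# above): run a CPU-bound function under coverage and report the wall time.
import time
import coverage

def workload():
    total = 0
    for i in range(1_000_000):
        total += i % 7
    return total

cov = coverage.Coverage()
cov.start()
start = time.perf_counter()
workload()
elapsed = time.perf_counter() - start
cov.stop()
print(f"workload under coverage: {elapsed:.2f}s")
```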
I know nothing about the Linux underpinnings here, so I'm not sure how to help diagnose this. If there's something I should be doing differently in building distributions, I'm happy to make a change. One difference between 6.1 and 6.1.1: the versions of cibuildwheel and packaging changed.
Okay, I've figured out the performance issue. What's happening is that the CTracer fails to import, but this code makes it silently fall back to the pure-Python tracer: coveragepy/coverage/collector.py, lines 19 to 33 at e4c3d30.
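For reference, the silent fallback can be made visible from the outside; a minimal sketch, assuming the internal `coverage.collector` module keeps exposing `CTracer` the way the snippet referenced above shows:

```python
# Minimal check: CTracer is None here when the C extension failed to import
# and coverage fell back to the pure-Python tracer.
import coverage.collector

if coverage.collector.CTracer is None:
    print("C tracer unavailable -- using the slow pure-Python tracer")
else:
    print("C tracer loaded:", coverage.collector.CTracer)
```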
Digging deeper, the issue appears to be with the naming of the tracer `.so` file: the one shipped in the musl wheel carries a SOABI tag that Python on Alpine 3.14 doesn't look for. I'm actually not familiar with the naming requirements for traditional CPython extensions (as all my work is done with …).
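One way to see the mismatch directly is to compare the extension file names shipped in the installed package with the suffix the running interpreter expects; a sketch (not commands from this thread):

```python
# Sketch: list the shared objects shipped with the installed coverage package
# and the extension-module suffix this interpreter would search for.
import pathlib
import sysconfig
import coverage

expected_suffix = sysconfig.get_config_var("EXT_SUFFIX")
package_dir = pathlib.Path(coverage.__file__).parent
shipped = sorted(p.name for p in package_dir.glob("*.so"))

print("interpreter expects suffix:", expected_suffix)
print("shipped extension files:", shipped)
```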
Oh, hmm, 6.1 didn't have musl wheels at all.
Some further info: Python 3.8 on Alpine 3.12 and 3.13 does successfully load the CTracer.
Seems like this should be reported upstream somewhere, but where: cibuildwheel? Alpine? Any thoughts?
The wrongly-named extension file inside the wheel is a problem in its own right. It would be good for auditwheel to check for this; that's an easy thing to do and likely to catch other buggy wheels, so it'd be worth opening an issue there. I'm not sure who's responsible for the wrong name in the first place. If I'm right that CPython fixed a bug here, then it might be that you need to build with the CPython that has the bug fix. (CPython keeps some internal metadata tables about stuff like "how should extension modules be named", so my hypothesis is that old CPython had the wrong metadata on Alpine, which leads to both creating the wrong name at build time and searching for the wrong name at import time.)
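One place that metadata shows up at runtime is `sysconfig` and the import machinery; a minimal sketch (not from the thread) showing the tag used to name extensions at build time and the suffixes searched at import time:

```python
# Sketch: SOABI/EXT_SUFFIX drive how extension modules are named when built,
# and EXTENSION_SUFFIXES is what the import system looks for at import time.
import sysconfig
import importlib.machinery

print("SOABI:", sysconfig.get_config_var("SOABI"))
print("EXT_SUFFIX:", sysconfig.get_config_var("EXT_SUFFIX"))
print("import suffixes:", importlib.machinery.EXTENSION_SUFFIXES)
```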
OK, that's two more candidates for upstream bug reports: auditwheel and CPython. I don't know how to untangle this.
The auditwheel bug report is orthogonal to everything else -- it won't fix the problem, but it should be done in any case to stop the problem from spreading or coming back. It looks like I was right about CPython having the wrong metadata. With the official musllinux build image, I get a gnu tag.
And with Alpine 3.13 I get a gnu tag.
But with Alpine 3.14 I get a musl tag.
The Docker-maintained "official" Python images also have a gnu tag.
So either CPython fixed something, or else Alpine is hacking their Python package to fix it and that hack should go upstream. And also probably most musl-based Python Docker images need to be revved to get the new tags, including the official musllinux build image, the official Docker "python" images that use Alpine, the cryptography build images, ... And maybe pip should detect when it's running on a Python with a broken SOABI tag, and refuse to install musllinux wheels?
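The comparison above can be repeated with one-liners like the following; the image names and interpreter path are assumptions about the environments being discussed, not the exact commands used:

```sh
# Print the SOABI tag reported by Python in each environment.
docker run --rm quay.io/pypa/musllinux_1_1_x86_64 sh -c \
  '/opt/python/cp39-cp39/bin/python -m sysconfig | grep -i soabi'
docker run --rm alpine:3.13 sh -c \
  'apk add --quiet python3 && python3 -m sysconfig | grep -i soabi'
docker run --rm alpine:3.14 sh -c \
  'apk add --quiet python3 && python3 -m sysconfig | grep -i soabi'
docker run --rm python:3.9-alpine sh -c \
  'python -m sysconfig | grep -i soabi'
```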
I've continued the discussion on discuss.python.org, but @nedbat I'd suggest deleting the musllinux wheels for at least 3.9 for the moment. As is, users on Alpine 3.14 (the current latest) who install the distribution Python will experience this problem. Deleting the wheel will require everyone using 3.9 on Alpine to compile it themselves, but that was the status quo until a day ago anyway 😄 Other versions of the wheel actually work fine, so while the SOABI naming is wrong you could leave them up for now if you wanted.
I appreciate the detailed guidance. I've deleted the three 3.9 musl wheels.
I would like to try to test for these sorts of problems, but it seems impractical. I can add a simple check to cibuildwheel, but it wouldn't catch this particular problem unless it was run on Alpine, and on a specific version of Alpine. Any advice?
This is, unfortunately, not the sort of case that I would expect to be caught even by a very rigorous CI, since it's about the interaction of wheels with distro versions they weren't originally built on, and musllinux/manylinux exist precisely to allow that. To catch this type of bug you'd need to modify your wheel build process so that you store the artifacts, fetch them, and import them (with …).
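A rough sketch of the kind of post-build check described above (the `wheelhouse` directory is an assumption, and it relies on the internal `coverage.collector` module):

```sh
# Install the freshly built wheel (binary only, from stored artifacts)
# and fail if coverage fell back to the pure-Python tracer.
pip install --only-binary=:all: --no-index --find-links=./wheelhouse coverage
python -c "import coverage.collector as c, sys; sys.exit(c.CTracer is None)"
```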
OK, that's what I thought. I guess for now I will simply avoid 3.9 musllinux wheels. |
Commit d538c55 removes the 3.9 musllinux wheels during building. I don't know how I'll know when it's safe to put them back... |
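One way to express that in a cibuildwheel-based setup (a sketch only; not necessarily what commit d538c55 actually does):

```sh
# Skip CPython 3.9 musllinux builds until Alpine ships a fixed Python.
export CIBW_SKIP="cp39-musllinux_*"
```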
This has been fixed: Alpine 3.14 and yesterday's 3.15 now load musllinux wheels, so it should be safe to include them again. This was due to an unsupported patch to Python on Alpine, and was not an issue with musllinux. Anyone who has been running Alpine for a while will need to upgrade to the latest patched Python package.
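For existing Alpine 3.14 installs, picking up the repaired distro package should be enough once it is in the package index (a sketch, assuming the fix ships as a normal `python3` package update):

```sh
# Pull the patched python3 package from the Alpine repositories.
apk update && apk upgrade python3
```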
I've restored those wheels in commit cfe14c2.
With the release of coverage 6.1.1, coverage now ships musllinux_1_1 wheels so that users don't need to compile on musl-derived distributions. This is great! However, the cryptography project has seen a massive performance regression with these wheels in our CI.

coverage 6.1 compiled in an alpine-latest container running our test suite: 215.89s (0:03:35)
coverage 6.1.1 using the musllinux_1_1 wheel in the same container: 842.11s (0:14:02)

While the exact timings, of course, vary, the magnitude of the slowdown is consistent. Locally I tested this by installing 6.1.1 with `--no-binary coverage` and the performance problem went away.

To Reproduce
This should be reproducible with alpine-latest (which has Python 3.9 at this time) and any project with a reasonably large test suite, but you can directly reproduce it with cryptography's own runner if you'd like.
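The cryptography runner invocation isn't reproduced above; as a rough stand-in (an assumption, and it only demonstrates the problem while the affected 6.1.1 musllinux wheel is still on PyPI, i.e. before the deletion mentioned in the comments):

```sh
# Install coverage 6.1.1 on Alpine 3.14's distro Python and check which tracer
# it ended up with; "None" means the slow pure-Python fallback is in use.
docker run --rm alpine:3.14 sh -c '
  apk add --quiet python3 py3-pip &&
  pip3 install --quiet coverage==6.1.1 &&
  python3 -c "import coverage.collector as c; print(c.CTracer)"
'
```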