[CI] Build wheels for macOS, Linux and Windows #79

Closed
bauerj wants to merge 1 commit into jalan:master from bauerj:master

Conversation


@bauerj bauerj commented Feb 6, 2021

Hey,

this creates binary wheels including dependencies for the major operating systems.

For Windows: Currently, the CI will only build wheels for 64-bit Python (amd64). This is due to the libraries bundled with conda being 64-bit as well. This can be fixed by installing a 32-bit distribution of poppler. delvewheel is used to ensure that all non-system DLLs are bundled. This will not work on systems older than Windows 7 but I guess we can ignore that. I have tested this on my Windows 10 machine.

For Linux: The wheel has manylinux1 compatibility so it supports even the most ancient operating systems like CentOS 6. The latest poppler is compiled from source. I have tested this on an Ubuntu 20.04 system.

For macOS: I've used the cibuildwheel example and added the dependencies like for the test jobs. It manages to create a wheel but it's only a few kilobytes so I'm kind of sceptical about this. However, I don't have a macOS system to test this on. It would be nice if someone with a Mac could test this.
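One way to check a suspiciously small wheel without a Mac is to look inside it, since a wheel is just a zip archive. The following is a minimal sketch, not part of this PR: the function name `shared_libraries` and the suffix list are assumptions made for illustration. It lists every shared-library-like file in a wheel with its uncompressed size, so a wheel that contains only the extension module and no bundled poppler libraries is easy to spot.

```python
import sys
import zipfile
from pathlib import PurePosixPath

# Suffixes treated as "shared library" here. This list is an assumption,
# not something defined by the wheel format.
LIB_SUFFIXES = {".so", ".dylib", ".dll", ".pyd"}


def shared_libraries(wheel_path):
    """Return sorted (archive name, uncompressed size) pairs for shared libraries."""
    found = []
    with zipfile.ZipFile(wheel_path) as wheel:
        for info in wheel.infolist():
            # Checking all suffixes also matches versioned names like libz.so.1
            suffixes = PurePosixPath(info.filename).suffixes
            if any(s in LIB_SUFFIXES for s in suffixes):
                found.append((info.filename, info.file_size))
    return sorted(found)


if __name__ == "__main__":
    # Usage: python inspect_wheel.py path/to/some.whl
    for name, size in shared_libraries(sys.argv[1]):
        print(f"{size:>10}  {name}")
```

If the output shows only the `pdftotext` extension module and none of the poppler libraries, the dependencies were probably not bundled.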

The generated wheels can be downloaded here: https://dev.azure.com/jhnnbr/pdftotext/_build/results?buildId=36&view=artifacts&pathAsName=false&type=publishedArtifacts

Closes: #29

@coveralls

Coverage Status

Coverage remained the same at 97.368% when pulling 21b5776 on bauerj:master into b4b8831 on jalan:master.

@bauerj
Author

bauerj commented Mar 30, 2021

@jalan Did you have time to look at this?

@jalan
Owner

jalan commented Apr 4, 2021

Thanks for looking into this. Sorry for the slow reply!

Yeah, the mac wheels are clearly missing the dependencies. I don't have any macOS systems to debug further.

But aside from that, I'm not sure how I would maintain and version this package if I included wheels like these. I wouldn't want to clone poppler from master directly and distribute that, because who knows if every commit is something that should go to users. I imagine users would prefer actual poppler releases. Okay, so I could follow poppler releases, and every time they do a release, I bump my own version and do a release. That doesn't sound so bad.

But... what about all the other libs I'd be distributing? Looks like the full list is

  • libexpat
  • libfontconfig
  • libfreetype
  • libjpeg
  • libopenjp2
  • libpoppler
  • libpoppler-cpp
  • libz

So would I also need to do a release every time one of those gets an update? That's not feasible. Maybe not every time, maybe just for security updates, because some of those libs have CVEs pretty often? But how would I keep track of all those?

Seems awfully complicated and requires me to make releases very often. I would definitely fall behind 😭

@bauerj
Author

bauerj commented Apr 4, 2021

Hey,

indeed, that's something I did not consider. I guess I could create a GitHub Action or Azure Pipeline that builds the wheels weekly and checks whether the included libraries have changed (e.g. by comparing hashes). Then it could either open an issue to let you know or even release a new version automatically.

What do you think?
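The hash comparison could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names `library_hashes` and `changed_libraries` and the suffix list are hypothetical, and it assumes the build is reproducible enough that an unchanged library produces identical bytes, which may not hold for every toolchain. It hashes each shared library inside two wheels and reports the names whose content was added, removed, or changed.

```python
import hashlib
import zipfile
from pathlib import PurePosixPath

# Suffixes treated as "shared library"; an assumption for this sketch.
LIB_SUFFIXES = {".so", ".dylib", ".dll", ".pyd"}


def library_hashes(wheel_path):
    """Map each shared library in the wheel to the SHA-256 of its contents."""
    hashes = {}
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            suffixes = PurePosixPath(name).suffixes
            if any(s in LIB_SUFFIXES for s in suffixes):
                hashes[name] = hashlib.sha256(wheel.read(name)).hexdigest()
    return hashes


def changed_libraries(old, new):
    """Return sorted names that were added, removed, or whose hash differs."""
    return sorted(
        name for name in old.keys() | new.keys()
        if old.get(name) != new.get(name)
    )
```

A scheduled job could store the result of `library_hashes` for the last released wheel, run it again on the fresh build, and open an issue only when `changed_libraries` returns a non-empty list.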

@jalan
Owner

jalan commented Apr 9, 2021

That sounds like a good idea to investigate 👍

@adamjanovsky

adamjanovsky commented Apr 26, 2022

@jalan I was wondering, is there any progress on this matter?

While pdftotext is awesome, it hurts that we can't get wheels with properly shipped binaries. The problem is that when I try to build wheels for my own package, they are not self-contained, because the package depends on pdftotext, which does not ship the poppler binaries and other system-level dependencies.

Effectively, when one depends on pdftotext and wants to create self-contained wheels, one must redeclare the external modules that you define in your setup, which is quite unfortunate.

Any chance that we may see self-contained wheels in the future for pdftotext?

@jalan
Owner

jalan commented May 5, 2022

> @jalan I was wondering, is there any progress on this matter?

Not at this time, no.

> Effectively, when one depends on pdftotext and wants to create self-contained wheels, one must redeclare the external modules that you define in your setup, which is quite unfortunate.

Do you have an example of such a project that you've made wheels for? I'm curious to see what all libraries are included in it and how updates are managed.

@adamjanovsky

I'm not sure I follow. I haven't built wheels for my project yet, because right now it seems overly complicated to transitively track dependencies that I believe others should care about.

I understand that it is difficult to keep track of the latest versions of the involved libraries w.r.t. releases. However, consider the current state: this burden is left to the end users of your library. Right now, they must:

  • Make sure that they have a working version of poppler (and other stuff) installed.
  • If a CVE is announced, should they update to the new version of the affected system package? Are they guaranteed it will work with pdftotext?
  • If they want to keep updating to :latest, they must check whether that version works with pdftotext as well.

What I'm trying to say is that the current state is not easier for the community, but it surely is for the maintainer (you). But I want to highlight that I appreciate your effort and the hurdles that you would face with delivering wheels with binaries.

I was thinking of suggesting Dependabot for automating updates, but that is easier said than done, see dependabot/dependabot-core#2129. What @bauerj suggested above may well work for your case actually.

@jalan
Owner

jalan commented May 13, 2022

Closing, since there's nothing productive happening here.

@jalan jalan closed this May 13, 2022
Successfully merging this pull request may close these issues.

Prebuilt binaries
4 participants