publish wheel for apache-flink-libraries #26844

Merged: 1 commit, Aug 15, 2025

dimbleby
Contributor

What is the purpose of the change

Publish a wheel for apache-flink-libraries

Installation in Python is always from a wheel. If maintainers do not publish one, every install has to build it from the sdist, which is slower and can go wrong. Better to build it once and publish it.

Also direct execution of setup.py is deprecated. I see that you are using uv, so I have used uv build to build distributions.
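A minimal sketch of what that build step could look like when driven from Python. It assumes `uv` is on `PATH`; the helper names and the example project path are hypothetical and are not taken from this PR's diff.

```python
import shutil
import subprocess
from pathlib import Path


def uv_build_command(project_dir: str, out_dir: str = "dist") -> list[str]:
    """Return the argv that asks uv to build both an sdist and a wheel."""
    # --sdist and --wheel select the outputs; --out-dir chooses where they land.
    return ["uv", "build", "--sdist", "--wheel", "--out-dir", out_dir, project_dir]


def build_distributions(project_dir: str, out_dir: str = "dist") -> list[Path]:
    """Run the build if uv is available and return the files it produced."""
    if shutil.which("uv") is None:
        raise RuntimeError("uv is not installed or not on PATH")
    # Resolve to an absolute path so the command and the glob agree on the location.
    out = Path(out_dir).resolve()
    subprocess.run(uv_build_command(project_dir, str(out)), check=True)
    return sorted(out.glob("*"))


if __name__ == "__main__":
    # Hypothetical project location, used only for illustration.
    print(" ".join(uv_build_command("flink-python/apache-flink-libraries")))
```

This replaces a direct `python setup.py sdist bdist_wheel` style invocation with a single front-end command that produces both artifacts.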

Brief change log

Publish a wheel for apache-flink-libraries

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

no to all

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

no

@flinkbot
Collaborator

flinkbot commented Jul 29, 2025

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@davidradl
Contributor

@dimbleby Please could you follow https://flink.apache.org/how-to-contribute/contribute-code/ and start the title of the PR with either [Jira number] or [hotfix].

@dimbleby
Contributor Author

looks like too much trouble, sorry.

obvs I think you should be publishing wheels, but I am not invested enough to be sinking time into it

@dianfu
Contributor

dianfu commented Aug 14, 2025

Hey, apache-flink-libraries only holds a few JAR packages, do you really find it slow to install? Since it doesn't contain cython files, I suspect wheel package is faster than sdist.

@dianfu
Contributor

dianfu commented Aug 14, 2025

Here you can find more background on why it doesn't publish wheel for apache-flink-libraries: #26897 (comment)

@dimbleby
Contributor Author

dimbleby commented Aug 14, 2025

Since it doesn't contain cython files, I suspect wheel package is faster than sdist.

Did you mean to write that the other way round? Or perhaps you are agreeing with me?

The contents are essentially irrelevant, even pure python packages should publish wheels. Installation in python is always from wheel, so it is more or less impossible for sdist to be faster than wheel: installing from sdist means first building the wheel, and then installing from wheel.

Another good reason to publish wheels is so as not to expose users to possible errors in that sdist->wheel build process. Eg in recent times both setuptools and wheel have published breaking changes such that existing sdists failed to build. For projects that publish wheels that's mostly a non-issue - or anyway an issue only for maintainers - because users will successfully install from wheel anyway.
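The sdist-versus-wheel point can be sketched as a deliberately simplified model of what an installer has to do. This is not pip's actual resolution logic: it ignores compatibility tags and version selection, and the function name and step descriptions are made up for illustration.

```python
def plan_install(available_files: list[str]) -> list[str]:
    """Return the steps needed to install, given the files an index offers."""
    wheels = [f for f in available_files if f.endswith(".whl")]
    sdists = [f for f in available_files if f.endswith((".tar.gz", ".zip"))]
    if wheels:
        # A published wheel is unpacked directly; nothing is built on the user's machine.
        return [f"download {wheels[0]}", f"install {wheels[0]}"]
    if sdists:
        # Without a wheel, the build runs on every user's machine, every time.
        return [
            f"download {sdists[0]}",
            "set up a build environment (e.g. setuptools, wheel)",
            "build a wheel from the sdist",
            "install the freshly built wheel",
        ]
    raise ValueError("no installable distribution found")


if __name__ == "__main__":
    for step in plan_install(["example_pkg-1.0.tar.gz"]):
        print(step)
```

In this model the sdist path is a strict superset of the wheel path, which is why it cannot be the faster one, and the extra build steps are where a breaking setuptools or wheel release would surface for users.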

Edit: in the linked comment you wrote

wheel packages are usually for cython files

that's just not correct and perhaps this is the misunderstanding

@dianfu
Contributor

dianfu commented Aug 15, 2025

@dimbleby Your points are generally correct. However, it may not hold for the package apache-flink-libraries. Just as explained in #26897 (comment), the main aim of apache-flink-libraries is to reduce the package size.

Besides, I found that pyspark doesn't publish wheel package (it contains only pure Python code), see https://pypi.org/project/pyspark/4.0.0/#files for more details. So I guess it may not be that serious in practice.

@dimbleby
Contributor Author

Not sure what pyspark has to do with this but it is an outlier - eg see https://pythonwheels.com/ - perhaps I will raise an issue at their repository too.

I really think you are making this harder than it needs to be! Your package is not unusually big, you do not publish unusually often, including a wheel is simply the normal and expected thing to do.

If you absolutely insist on publishing only one distribution, it would be better for users if that were the wheel, rather than the sdist. You could make the sdist available eg through github. But publishing both should be fine.

@dianfu
Contributor

dianfu commented Aug 15, 2025

Your package is not unusually big, you do not publish unusually often, including a wheel is simply the normal and expected thing to do.

Oh, this is not true. There are currently 12 wheel packages for apache-flink, see https://pypi.org/project/apache-flink/#files for more details. The size of apache-flink-libraries is 309 MB, this means that if we publish wheel packages for it, it will take about 3.6GB (309 MB * 12) for each release.
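The arithmetic behind that estimate, as a small sketch. It takes the comment's own premises as given: one 309 MB artifact per existing apache-flink wheel, with each wheel assumed to be about the same size as the current distribution. The function name is hypothetical.

```python
def release_footprint_mb(artifact_size_mb: float, artifact_count: int) -> float:
    """Total storage for one release if every artifact has the same size."""
    return artifact_size_mb * artifact_count


if __name__ == "__main__":
    total_mb = release_footprint_mb(309, 12)
    # 309 * 12 = 3708 MB, which is about 3.6 GB when dividing by 1024.
    print(f"{total_mb:.0f} MB, about {total_mb / 1024:.1f} GB per release")
```

The result scales linearly with the artifact count, so the estimate depends entirely on how many wheels the package actually needs.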

Regarding the release frequency, major releases come about every 5 months; however, we should also consider bugfix releases, which come about every one to two months.

@dianfu
Contributor

dianfu commented Aug 15, 2025

I don't want to change anything unless it's clear in which case users will encounter problems.

@dimbleby
Contributor Author

There would be exactly one wheel for apache-flink-libraries, not twelve, as you do not have anything platform-specific in it.
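Why one wheel suffices can be read off the wheel filename, whose last three dash-separated fields are the Python, ABI and platform tags. The sketch below parses those fields; the filenames and version numbers in it are illustrative, not actual published artifacts.

```python
def parse_wheel_tags(filename: str) -> tuple[str, str, str]:
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    # name-version[-build]-python-abi-platform, so 5 or 6 dash-separated parts.
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def is_platform_independent(filename: str) -> bool:
    """True when the wheel declares no ABI and no platform requirement."""
    _, abi_tag, platform_tag = parse_wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


if __name__ == "__main__":
    # Illustrative filenames only.
    print(is_platform_independent("apache_flink_libraries-2.1.0-py3-none-any.whl"))
    print(is_platform_independent(
        "apache_flink-2.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
    ))
```

A package with compiled extensions needs one wheel per interpreter and platform combination, whereas a package that only ships data files such as JARs can be tagged `py3-none-any` and installed everywhere from a single file.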

"I don't want to" is not something I can disagree with!

@dianfu
Contributor

dianfu commented Aug 15, 2025

There would be exactly one wheel for apache-flink-libraries, not twelve, as you do not have anything platform-specific in it.

OK. Then it's fine for me~

@dianfu merged commit 4cdc679 into apache:master on Aug 15, 2025

4 participants