Draft release feature on main archive to allow testing a release before it goes live #726

Open
njsmith opened this issue Oct 17, 2015 · 63 comments
Labels
feature request; help needed (We'd love volunteers to advise on or help fix/implement this.); testing (Test infrastructure and individual tests); UX/UI (design, user experience, user interface)

Comments

@njsmith
Contributor

njsmith commented Oct 17, 2015

As per this discussion thread:

https://mail.python.org/pipermail/distutils-sig/2015-October/thread.html#27234

it would be very nice if there were better ergonomics around package uploads -- in particular some way to upload a new release, and then take a look over it to double-check that everything is correct before you -- as a second step -- hit the button to make it "go live". Glyph suggests that in particular he'd like to be able to actually run a test install against the uploaded data as an end-to-end test:

https://mail.python.org/pipermail/distutils-sig/2015-October/027259.html

which indeed sounds glorious, and I think the super-slick way to do this would be to provide a unique private index URL which gives a view of what pypi would look like after your release goes live, and could be used like

pip install --index-url https://pypi.python.org/tmp/SOMEPKG/acd1538afe267/ SOMEPKG

(https://mail.python.org/pipermail/distutils-sig/2015-October/027263.html)

The idea would be basically that any request to /tmp/SOMEPKG/acd1538afe267/WHATEVER would return the same thing as a request to /WHATEVER, except that requests that would be affected by the addition of the new release would act as if the release had already been made.

The use of a unique URL for each trial upload means that this still plays nicely with caching/CDNs. The inclusion of the package name in the tmp URL allows people to double-check that if they see a URL like this, then they know that the files there were actually uploaded by someone who is trusted to upload that package. You'd want to expire them after some short-but-reasonable time (a few days?) to prevent them being abused as private indices by unscrupulous people, and also just for general hygiene, but that's fine.
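A minimal sketch of the URL mapping described above, assuming the `/tmp/<package>/<token>/` prefix from the example command. The names `StagedRequest` and `parse_staged_path` are hypothetical and are not part of PyPI or Warehouse; the sketch only shows how a staging request could be split into the package, the trial-upload token, and the path to serve as if the release were live.

```python
from typing import NamedTuple, Optional


class StagedRequest(NamedTuple):
    package: str     # project the staging area belongs to
    token: str       # unique identifier of the trial upload
    inner_path: str  # path to serve as if the release were already live


def parse_staged_path(path: str) -> Optional[StagedRequest]:
    """Split /tmp/<package>/<token>/<rest> into its parts.

    Returns None for paths outside the staging prefix; those would be
    served by the regular index unchanged.
    """
    parts = path.split("/", 4)
    # Expected shape: ["", "tmp", package, token, rest]
    if len(parts) < 5 or parts[0] != "" or parts[1] != "tmp":
        return None
    _, _, package, token, rest = parts
    if not package or not token:
        return None
    return StagedRequest(package, token, "/" + rest)


staged = parse_staged_path("/tmp/SOMEPKG/acd1538afe267/simple/SOMEPKG/")
regular = parse_staged_path("/simple/SOMEPKG/")
```

Because the package name is part of the prefix, a server following this shape could check that the token really belongs to that package before serving anything.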

Obviously this is very much a post-"become PyPI" wishlist priority request.

@nlhkabu added the UX/UI (design, user experience, user interface) label Oct 18, 2015
@njsmith
Contributor Author

njsmith commented Jan 7, 2016

Another advantage of this feature would be that it would provide a way for distributed teams to coordinate releases. E.g. one could use a workflow like:

  • main release manager uploads (hopefully) final source release to one of these staging areas
  • the volunteer who provides osx builds fetches the source from this staging area, builds wheels, and then uploads the wheels to the staging area. (This could even be done via an automated script, or even a fully automated build bot).
  • once all the wheels have been uploaded, the main release manager double-checks things, and then hits the switch to make the release live
  • the source and binary releases appear on pypi in a single atomic transaction

We've had problems in the past with numpy where we used a similar workflow but directly on the real pypi, so that there was a period between steps one and two where the source release was live and the binary release was not. Since pip prefers a 1.10.3 sdist to a 1.10.2 wheel, any users who try to install during this gap suddenly revert from the wheel experience to the far far inferior sdist experience. The solution of course is to coordinate everything offline and like email wheels back and forth to collect them all on a single person's machine before doing the upload in one go. But this is difficult and unpleasant.
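The gap described above can be illustrated with a deliberately simplified model of the stated preference (newest version wins; a wheel is preferred only within the same version). This is not pip's actual resolver, and `Artifact` and `pick` are made-up names for the sketch.

```python
from typing import List, NamedTuple, Tuple


class Artifact(NamedTuple):
    version: Tuple[int, ...]
    kind: str  # "wheel" or "sdist"


def pick(candidates: List[Artifact]) -> Artifact:
    """Pick the newest version; within one version prefer a wheel."""
    return max(candidates, key=lambda a: (a.version, a.kind == "wheel"))


live = [Artifact((1, 10, 2), "sdist"), Artifact((1, 10, 2), "wheel")]
before_release = pick(live)   # the 1.10.2 wheel

# Step one: the new sdist goes live before any matching wheel exists.
live.append(Artifact((1, 10, 3), "sdist"))
during_gap = pick(live)       # the 1.10.3 sdist, so users build from source

# Step two: the wheel for the new version is finally uploaded.
live.append(Artifact((1, 10, 3), "wheel"))
after_wheels = pick(live)     # the 1.10.3 wheel
```

With a staging area, steps one and two would both happen out of public view, so installers would never observe the `during_gap` state.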

@dstufft
Member

dstufft commented Jan 7, 2016

You don't need to collect onto a single user's machine. You just need to upload all of the wheels first, but that's still obviously not as nice of an experience. I'm not opposed to this feature though. It's just not a priority at the moment.


@sigmavirus24
Contributor

It also means there's an extra step on top of twine upload

@njsmith
Contributor Author

njsmith commented Jan 7, 2016

We hesitated on just uploading wheels before the sdist first because we had no idea what pip would do if it saw that a 1.10.3 release was available, but only in formats that can't be used by the current machine. I'm actually not sure whether in this case I'd prefer pip to error out or to silently fall back to an earlier version, though obviously the fallback is better for handling this particular situation. And you also still have the problem of how to distribute the sdist to the build volunteers if not via pypi.

But yeah, it's manageable. The main advantage of this approach is the testing use case described in the OP; I just wanted to make a note that it also has other advantages. (And of course these are complementary -- you can use the same staging area to first assemble the wheels and then to test the whole assemblage.) Numpy's had to skip release numbers twice in the last few months due to issues interacting with pypi. I'm not saying this is pypi's fault -- I think one was user error and one was a network error -- but rather just, there's a lot of advantages to reducing the cost of errors.

@dstufft
Member

dstufft commented Jan 7, 2016

That's easy enough to manage. Just have it default to immediate publish with a flag to enable the two step upload.

On Jan 7, 2016, at 4:38 PM, Ian Cordasco notifications@github.com wrote:

It also means there's an extra step on top of twine upload



@dstufft
Member

dstufft commented Jan 7, 2016

It'll install the older sdist.

On Jan 7, 2016, at 5:45 PM, Nathaniel J. Smith notifications@github.com wrote:

We hesitated on just uploading wheels before the sdist first because we had no idea what pip would do if it saw that a 1.10.3 release was available, but only in formats that can't be used by the current machine.
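A rough sketch of the fallback described in this exchange: wheels that the current machine cannot use are dropped first, and the newest remaining candidate wins, which is why an older sdist gets installed. The platform strings and the names `Artifact` and `pick_usable` are simplifications assumed for illustration; real wheel compatibility tags are more detailed.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Artifact(NamedTuple):
    version: Tuple[int, ...]
    kind: str      # "wheel" or "sdist"
    platform: str  # simplified platform label; "any" for sdists here


def pick_usable(candidates: List[Artifact], supported: Set[str]) -> Optional[Artifact]:
    """Drop wheels for unsupported platforms, then take the newest candidate."""
    usable = [a for a in candidates if a.kind == "sdist" or a.platform in supported]
    if not usable:
        return None
    return max(usable, key=lambda a: (a.version, a.kind == "wheel"))


# Wheels for 1.10.3 were uploaded first; the 1.10.3 sdist is not up yet.
index = [
    Artifact((1, 10, 2), "sdist", "any"),
    Artifact((1, 10, 3), "wheel", "macosx"),
]
on_linux = pick_usable(index, supported={"linux"})
on_macos = pick_usable(index, supported={"macosx"})
```

In this model a Linux machine keeps getting the 1.10.2 sdist until a usable 1.10.3 artifact appears, while a macOS machine picks up the new wheel right away.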

@ncoghlan
Contributor

@dstufft, from a twine UX perspective, I take it you mean something like twine upload --no-publish <artifacts>? I think that would work well, especially if it wrote out the private URL so you could subsequently do twine publish <unpublished release URL>
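To make the proposed command shape concrete, here is a sketch of a command-line parser for it. The `--no-publish` flag and the `publish` subcommand come from the comment above and are not an existing twine interface; the program name `twine-sketch` and the function `build_parser` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for the two-step flow: upload without publishing, then publish."""
    parser = argparse.ArgumentParser(prog="twine-sketch")
    sub = parser.add_subparsers(dest="command", required=True)

    upload = sub.add_parser("upload", help="upload artifacts")
    upload.add_argument(
        "--no-publish",
        action="store_true",
        help="stage the upload instead of publishing it immediately",
    )
    upload.add_argument("artifacts", nargs="+")

    publish = sub.add_parser("publish", help="publish a staged release")
    publish.add_argument("release_url", help="private URL printed by the staged upload")
    return parser


staged = build_parser().parse_args(["upload", "--no-publish", "dist/pkg-1.0.tar.gz"])
default = build_parser().parse_args(["upload", "dist/pkg-1.0.tar.gz"])
```

Leaving `--no-publish` off keeps immediate publication as the default, matching the opt-in behaviour discussed in the thread.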

@dstufft
Member

dstufft commented Jan 22, 2016

@ncoghlan Yea something like that. We'll want to consider what it means if/when we move twine into pip too. But either way I think it's reasonably doable in a way that doesn't change the default behavior but makes it easy to opt in.

@takluyver
Contributor

I saw a link to this in the context of replacing TestPyPI. It's an awesome idea which I'd love to see implemented, but I'd like to note that it's not the only use case for TestPyPI. It's also useful for:

  • Demonstrating and learning about packaging, to fully publish a sample package without taking up a name on PyPI for a package with no utility.
  • Testing packaging tools and techniques.

So I hope that there will continue to be a test server after this happens. :-)

@dstufft
Member

dstufft commented Aug 5, 2017

Those are interesting use cases, which I feel like the existing TestPyPI does a pretty poor job of handling right now as well. The key difference I think is that both of those things would be best served by something with ephemeral names that automatically expired after some period of time. So I think they're still going to require some broad changes in how TestPyPI functions, and it may make sense to roll those into PyPI at large, but I can see a use for them.

@ncoghlan
Contributor

ncoghlan commented Aug 7, 2017

I filed #2286 as a separate issue to cover the learning & interoperability testing use cases for Test PyPI.

@ncoghlan
Contributor

There's a fair bit of overlap between this issue and #720. See #720 (comment) for a more specific design proposal I put together that would give us 3 potential states for a release:

  • partially mutable (status quo): appears in the main index, components can be added or removed, but not replaced with something different
  • staging/unpublished: components can be freely added, removed, and replaced, but the release doesn't appear in the main index
  • published/immutable: appears in the main index, and requires PyPI admin intervention to make any further changes

The choice between which kind of release process to use (immediate publication with partially mutable releases vs staged publication with immutable releases) would be made on a per-project basis.
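The three states listed above could be modelled roughly as follows. The enum, the action names, and the `publish` transition are assumptions made for this sketch; the linked design proposal may define them differently.

```python
from enum import Enum
from typing import Dict, Set


class ReleaseState(Enum):
    PARTIALLY_MUTABLE = "partially mutable"  # status quo
    STAGING = "staging/unpublished"
    PUBLISHED = "published/immutable"


# Which component operations each state permits without admin intervention.
ALLOWED_ACTIONS: Dict[ReleaseState, Set[str]] = {
    ReleaseState.PARTIALLY_MUTABLE: {"add", "remove"},
    ReleaseState.STAGING: {"add", "remove", "replace"},
    ReleaseState.PUBLISHED: set(),
}

# Whether a release in this state appears in the main index.
IN_MAIN_INDEX: Dict[ReleaseState, bool] = {
    ReleaseState.PARTIALLY_MUTABLE: True,
    ReleaseState.STAGING: False,
    ReleaseState.PUBLISHED: True,
}


def can(state: ReleaseState, action: str) -> bool:
    """Return True if `action` is permitted on a release in `state`."""
    return action in ALLOWED_ACTIONS[state]


def publish(state: ReleaseState) -> ReleaseState:
    """Assumed transition: only a staged release can be published."""
    if state is not ReleaseState.STAGING:
        raise ValueError(f"cannot publish a release in state {state.value!r}")
    return ReleaseState.PUBLISHED
```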

@taion

taion commented Nov 30, 2017

It sounds like it'd be simpler to just allow uploading multiple artifacts simultaneously. That's somewhat more work when cross-compiling, but it reduces extra state tracking. At some level perhaps it's just a tooling thing. For the average package that's just running something like:

$ twine upload dist/*

It seems like it should be possible to instrument the client to upload things in a different way.

@ncoghlan
Contributor

We already support uploading multiple artifacts simultaneously. There are a few reasons it doesn't fully solve the problem:

  1. It doesn't help with testing whether or not the upload worked properly before making a release available to users. When combined with PyPI's prohibition on replacing published artifacts, a glitch in the artifact transmission can currently be a real pain (turning off the legacy upload server and requiring all uploads to go through Warehouse made this less frequent, but it's still technically possible).
  2. It doesn't allow for workflows where only the sdist is uploaded to the release by a human maintainer (or project-specific CI system), and then the wheels are automatically generated from that by a separate online service
  3. For non-automated build and upload processes, it doesn't readily allow for collaborative processes where different people are responsible for different artifacts (e.g. having a dedicated maintainer for a project's Windows wheel builds)

@njsmith
Contributor Author

njsmith commented Nov 30, 2017

There probably would be some value in making multi-file upload an atomic transaction, though, so e.g. wheels and sdists appear simultaneously and if one upload fails the whole thing is rolled back to try again. I don't think that's what twine upload dist/* currently does. It's definitely not a replacement for real two-phase upload, but it has the nice property that it could be implemented without changing people's workflows at all; twine would just become a tiny bit more robust. I don't know that it's a high priority though, unless it's really easy to implement.
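A sketch of the all-or-nothing behaviour described above, written against two caller-supplied callbacks. `upload` and `delete` are placeholders, not twine or PyPI functions, and a client-side rollback like this only approximates atomicity: other users could still see a partial release while the loop runs, so a real implementation would likely need server-side support.

```python
from typing import Callable, Iterable, List


def upload_all_or_nothing(
    files: Iterable[str],
    upload: Callable[[str], None],
    delete: Callable[[str], None],
) -> List[str]:
    """Upload every file; if any upload fails, undo the ones already done."""
    done: List[str] = []
    try:
        for name in files:
            upload(name)
            done.append(name)
    except Exception:
        # Roll back in reverse order, then re-raise so the caller sees the failure.
        for name in reversed(done):
            delete(name)
        raise
    return done
```

The rollback step assumes the server allows removing what was just uploaded so that the same files can be sent again on retry.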

@taion

taion commented Nov 30, 2017

  1. I mean, it happens, but numbers are cheap. It's not the end of the world to bump the patch level to fix a broken publish.
  2. Is that a common workflow? Given that CI is so often tied to a Git repo, I'd expect the more common workflow to be the entirety of the build running after a new tag is pushed to the Git repo, in which case the sdist wouldn't be staged on PyPI.
  3. How common of an occurrence is this? I guess it just seems like it'd be a lot easier to route those through the package maintainer and ask the maintainer to upload all the artifacts simultaneously, rather than coordinating via PyPI for everyone to upload to a release sitting in "staging".

@ncoghlan
Contributor

@taion It's not an either/or situation: we can do both, and several of the technical building blocks will be shared. I do agree that if running twine upload for multiple artifacts isn't already an atomic operation, enabling that would be the place to start.

The main concrete benefit I personally see to going further and actually exposing a separate staging index is that it would offer us a path towards truly immutable releases that doesn't require unilaterally changing the rules for existing package publishers.

@taion

taion commented Nov 30, 2017

That makes sense.

I will also note, though, that as a user, I don't really care that much about immediate, hard immutability.

To offer a strawman, if, instead of "immutable", you had "immutable after 24 hours", your flows (2) and (3) above for publishers would still generally work. And while it's certainly possible that, as a user, I'll upgrade immediately to a new release of a dependency and then get bitten by mutability, in practice I'm much more likely to get bitten by bugs in the release itself.

The really frustrating thing is instead if I have some set of dependencies that I haven't touched in a while that suddenly starts breaking because something changed.
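The "immutable after 24 hours" strawman amounts to a simple time check. The sketch below assumes the window starts at publication time; `GRACE_PERIOD` and `is_mutable` are illustrative names, not anything PyPI implements.

```python
from datetime import datetime, timedelta, timezone

# Assumed grace window taken from the strawman above.
GRACE_PERIOD = timedelta(hours=24)


def is_mutable(published_at: datetime, now: datetime,
               grace: timedelta = GRACE_PERIOD) -> bool:
    """A release may still be changed only while the grace window is open."""
    return now - published_at < grace


published = datetime(2017, 11, 30, 12, 0, tzinfo=timezone.utc)
still_open = is_mutable(published, published + timedelta(hours=23))
now_frozen = is_mutable(published, published + timedelta(hours=24))
```

Under this rule a dependency set that has gone untouched for longer than the window can no longer change underneath its users.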

@brainwane
Contributor

@alanbato would like comments on this proposal by 30 April (10 days from now). I posted some background context on distutils-sig.

@brainwane
Contributor

@alanbato How's this going?

@alanbato
Contributor

I haven't had the chance to work on the proof of concept, as I'm going through the last weeks of getting my degree.
Once that's done I'll spin up a PR with a PoC so people can comment on the implementation rather than the idea (since nobody had anything else to say).
For the unanswered questions, I'll make my own decisions and assumptions, and once I have something to show I'm sure it'll be easier to make a decision.

@brainwane
Contributor

@alanbato Congrats on the final weeks of getting your degree! Should we expect to hear from you again around, say, 7 July?

@alanbato
Contributor

alanbato commented Jun 2, 2020

@brainwane Yes, I'll have something to show by then :)

@brainwane
Contributor

Hi, @alanbato! How is the feature coming along?

@di
Member

di commented Jul 17, 2020

how installing an unpublished release should work with pip (obfuscated simple index and --index-url?)

After experimenting with @alanbato on this today, we determined this should be --extra-index-url instead, so the "draft"/obfuscated index only needs to include the draft release of the draft package, and doesn't need to include every other package available.

(@ewdurbin originally said this here)

@arigo

arigo commented Sep 17, 2020

I think that PyPI is getting closer to this feature nowadays with the ability to "yank" and "un-yank" releases. You can make your release but "yank" it as quickly as possible, which keeps it from being picked up by a default pip install myproject. It will still be used by an explicit pip install myproject==1.3. When everything seems to work, "un-yank" it. There is still a window at the start during which a high-volume project is likely to get a few downloads from unsuspecting users---which will then be stuck with your release candidate even if you yank it as long as they are using the same virtualenv, as far as I could test---but that's still progress.

If only we could create a new release initially in the "yanked" state, we would actually have a way to do atomic PyPI releases.
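The yank behaviour described above can be sketched as a selection rule: a yanked release is skipped by an unpinned install but can still be chosen by an exact pin. The code is a simplified model with made-up names (`Release`, `select`) and only handles purely numeric dotted versions; it is not pip's implementation.

```python
from typing import List, NamedTuple, Optional, Tuple


class Release(NamedTuple):
    version: str
    yanked: bool


def _key(version: str) -> Tuple[int, ...]:
    # Simplification: assumes versions like "1.3" made of integers only.
    return tuple(int(part) for part in version.split("."))


def select(releases: List[Release], pinned: Optional[str] = None) -> Optional[Release]:
    """Choose a release; `pinned` models an exact `==` requirement."""
    if pinned is not None:
        # An exact pin may select a yanked release.
        for release in releases:
            if release.version == pinned:
                return release
        return None
    visible = [r for r in releases if not r.yanked]
    if not visible:
        return None
    return max(visible, key=lambda r: _key(r.version))


releases = [Release("1.2", yanked=False), Release("1.3", yanked=True)]
default_choice = select(releases)           # 1.2, because 1.3 is yanked
pinned_choice = select(releases, "1.3")     # 1.3, via the explicit pin
```

Creating a release directly in the yanked state, as suggested above, would remove the initial window in which `select(releases)` could return the untested version.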

@brainwane
Contributor

@alanbato How's this going?

@alanbato
Contributor

alanbato commented Nov 3, 2020

Hey @brainwane! So, the current status is this:

  • The proof of concept is feature-complete and it works locally.
  • I wrote some tests but there are some missing in order to get 100% coverage.
  • The write up, including the rationale and the decisions we took is still to be written (to be included in the PR with these changes)
  • I have a PR against my fork with the changes, waiting on a pre-review from @di to check that everything looks good before asking for more eyes to look at it. The PR with the Warehouse code is here.
  • There's also the accompanying PR to my fork of Twine where in a very non-elegant manner I got it to work with this new feature, thus testing manually that this works "end to end". I do not expect to submit this PR, as a Twine contributor would have a better idea on how to actually implement it, and this was just for testing purposes.

Once @di has time to review, I'll make the necessary changes and get it ready for a formal PR to this repo :)

@brainwane
Contributor

#8941 is now ready for review!

@dstufft
Member

dstufft commented Jun 28, 2022

There is now PEP 694: Upload 2.0 API for Python Package Repositories, with discussion on discuss.python.org, which is relevant to this issue.

@pradyunsg
Contributor

[from first post]

pip install --index-url https://pypi.python.org/tmp/SOMEPKG/acd1538afe267/ SOMEPKG

One thing I should note here: it would be neat to have the custom index pages only contain the draft releases and rely on installation tools' functionality (or middleware) to tack that onto regular PyPI via:

pip install --extra-index-url https://pypi.python.org/draft-release/SOMEPKG/acd1538afe267/ SOMEPKG

That way, if you're testing a project that depends on a draft release of scipy and numpy, you could do:

pip install --extra-index-url https://pypi.python.org/draft-release/scipy/8344rg4nfx83z7gnf2/ --extra-index-url https://pypi.python.org/draft-release/numpy/fgsghhmq39x4gx31ql/ awesome-package
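For scripted testing, the command above could be assembled programmatically. The helper name `pip_install_args` is hypothetical, and the draft-release URLs are the illustrative ones from this comment rather than real endpoints; `--extra-index-url` itself is an existing pip option that can be repeated.

```python
from typing import List, Sequence


def pip_install_args(requirements: Sequence[str],
                     draft_indexes: Sequence[str]) -> List[str]:
    """Build a pip command that layers draft indexes on top of the default index."""
    args = ["pip", "install"]
    for url in draft_indexes:
        args += ["--extra-index-url", url]
    args += list(requirements)
    return args


command = pip_install_args(
    ["awesome-package"],
    [
        "https://pypi.python.org/draft-release/scipy/8344rg4nfx83z7gnf2/",
        "https://pypi.python.org/draft-release/numpy/fgsghhmq39x4gx31ql/",
    ],
)
```

The resulting list can be passed to `subprocess.run` as-is, which avoids shell quoting issues with the URLs.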
