feature: add new "lazy-wheel" config option #8815

radoering · 2023-12-21T18:29:18Z

Pull Request Check List

Added tests for changed code.
Updated documentation for changed code.

With #5509, this is only relevant for indexes that do not support PEP 658 (probably most indexes other than PyPI) and for old wheels without PEP 658 metadata.

This PR adds a config option solver.lazy-wheel. If active and a server supports HTTP range requests, we do not have to download entire wheels to fetch metadata but can just download the METADATA file from the remote wheels with a couple of range requests. Especially with slow network connections this setting can speed up dependency resolution significantly. (If the cache has already been filled or the server does not support HTTP range requests, this setting makes no difference.)

This PR is based on pypa/pip#12208, which I learned about at PackagingCon, so all credit goes to the authors of that pip PR. 👏 I just had to adapt the interface to Poetry, write some tests and fix some special cases I encountered.

I measured how long poetry lock takes for a largish project, once from a machine with a slow network connection to the package index (1.5 MB/s, 40 ms ping) and once from a machine with a fast one (95 MB/s, 1 ms ping).

| operation | slow network | fast network |
| --- | --- | --- |
| cold cache without PR | 825 s | 75 s |
| cold cache with PR | 190 s | 60 s |
| warm cache without/with PR | 24 s | 16 s |

What lazy-wheel basically does is:

1. Try a range request with a negative offset to fetch the central directory of the wheel's zip file.
2. If negative range requests are not supported, send a HEAD request to get the size of the wheel, then send a normal range request to fetch the last bytes of the wheel.
3. If the central directory is quite large, an additional range request may be needed to fetch the rest of it.
4. Look up the position of the METADATA file in the central directory and download it in one range request.
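
To illustrate the idea (this is a minimal sketch, not the code in this PR), the standard library's zipfile module can read a single member through any seekable file-like object. The `LazyRangeFile` class, the `read_metadata` helper and the `fetch_range(start, end)` callable are hypothetical names; `fetch_range` stands in for an HTTP range request, and the wheel size is assumed to be known already (for example from a HEAD request). Negative offsets and the server quirks are deliberately left out.

```python
import io
import zipfile
from typing import Callable

# Hypothetical stand-in for an HTTP range request: returns bytes [start, end).
FetchRange = Callable[[int, int], bytes]


class LazyRangeFile:
    """Read-only, seekable file object that fetches bytes only when read."""

    def __init__(self, size: int, fetch_range: FetchRange) -> None:
        self._size = size
        self._fetch_range = fetch_range
        self._pos = 0

    def seekable(self) -> bool:
        return True

    def tell(self) -> int:
        return self._pos

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        bases = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos, io.SEEK_END: self._size}
        # Clamp to the file bounds instead of raising, to keep the sketch short.
        self._pos = max(0, min(self._size, bases[whence] + offset))
        return self._pos

    def read(self, size: int = -1) -> bytes:
        if size is None or size < 0:
            end = self._size
        else:
            end = min(self._size, self._pos + size)
        if end <= self._pos:
            return b""
        # No caching here: every read triggers one "range request".
        data = self._fetch_range(self._pos, end)
        self._pos += len(data)
        return data


def read_metadata(size: int, fetch_range: FetchRange) -> bytes:
    """Return the wheel's METADATA, fetching only the ranges zipfile asks for."""
    with zipfile.ZipFile(LazyRangeFile(size, fetch_range)) as wheel:
        name = next(n for n in wheel.namelist() if n.endswith(".dist-info/METADATA"))
        return wheel.read(name)
```

With a `fetch_range` that slices an in-memory wheel and records the requested lengths, only the end-of-central-directory record, the central directory and the METADATA member are requested, which is a small fraction of a large wheel.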

To avoid retrying range requests in vain, we keep track of whether a domain supports range requests at all and, in particular, whether it supports negative offsets.
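
Such bookkeeping could look roughly like the following sketch. `RangeSupportTracker` and its method names are invented for illustration and are not the names used in this PR; the only assumption is that support is remembered per network location of the wheel URL.

```python
from urllib.parse import urlparse


class RangeSupportTracker:
    """Remember per domain which kinds of range request have already failed."""

    def __init__(self) -> None:
        self._no_ranges = set()  # domains without any range support
        self._no_negative_offsets = set()  # domains rejecting negative offsets

    @staticmethod
    def _domain(url: str) -> str:
        return urlparse(url).netloc

    def mark_no_ranges(self, url: str) -> None:
        self._no_ranges.add(self._domain(url))

    def mark_no_negative_offsets(self, url: str) -> None:
        self._no_negative_offsets.add(self._domain(url))

    def may_use_ranges(self, url: str) -> bool:
        return self._domain(url) not in self._no_ranges

    def may_use_negative_offsets(self, url: str) -> bool:
        domain = self._domain(url)
        return domain not in self._no_ranges and domain not in self._no_negative_offsets
```

After one failed negative-offset request to a domain, later wheels from the same domain can skip straight to the HEAD request plus a normal range request.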

In contrast to the pip PR, we only fetch the METADATA file instead of the entire .dist-info directory. On the machine with a slow network connection this made a difference of 190 s (METADATA only) versus 330 s (entire .dist-info directory).

Further, I added some handling for special cases I encountered while trying range requests with different servers:

| Server | Accept-Ranges | Negative Offset | Negative Offset > entire wheel |
| --- | --- | --- | --- |
| pypi.org | bytes | 501 Unsupported client range | 501 Unsupported client range |
| Internal devpi server 1 | bytes¹ | 206 Partial content | ⚠️ 200 OK, entire wheel |
| Internal devpi server 2 | - | 200 OK, entire wheel | 200 OK, entire wheel |
| Internal Artifactory | bytes | 206 Partial content, ⚠️ data from start of file | 206 Partial content, entire wheel |
| Internal GitLab | bytes | 206 Partial content | 206 Partial content, entire wheel |
| piwheels.org | bytes | 206 Partial content | 206 Partial content, entire wheel |
| download.pytorch.org | bytes | 206 Partial content | 206 Partial content, entire wheel |

¹ "Devpi server 1" hosts several indexes including a PyPI mirror. I noticed that range requests were supported for wheels that had been downloaded before, but not for wheels that had not. In other words, a server might support range requests for some but not all wheels. (This is handled in HTTPRepository.)
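
A per-wheel fallback of this kind might be sketched as follows. The exception class and the two callables are hypothetical placeholders, not the actual HTTPRepository code: `lazy_fetch` is assumed to raise the error when a server ignores the range request, and `full_fetch` is assumed to download the whole wheel and extract its metadata.

```python
from typing import Callable


class RangeRequestUnsupported(Exception):
    """Hypothetical error raised when a server ignores a range request."""


def fetch_metadata(
    url: str,
    lazy_fetch: Callable[[str], bytes],
    full_fetch: Callable[[str], bytes],
) -> bytes:
    """Try the lazy path first and fall back to a full download for this wheel."""
    try:
        return lazy_fetch(url)
    except RangeRequestUnsupported:
        # Only this wheel falls back: another wheel on the same server
        # may still support range requests.
        return full_fetch(url)
```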

The behavior of the different servers can be interpreted as follows:

- The internal GitLab, piwheels.org and download.pytorch.org behave as you would expect from a server that supports range requests with negative offsets.
- PyPI behaves as you would expect from a server that supports range requests but not negative offsets.
- The internal devpi server 2 behaves as you would expect from a server that does not support range requests.
- The internal devpi server 1 has a quirk when the negative offset is greater than the entire wheel: it returns 200 OK instead of 206 Partial content, so you have to inspect the Accept-Ranges header to determine whether it supports range requests.
- The internal Artifactory has the most unexpected behavior. It does not support negative offsets but interprets them as positive, e.g. -100 is handled as 0-100.
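
One way to turn these observations into a decision is sketched below. This is an illustration under assumptions, not the logic in this PR: it assumes servers answering 206 send an accurate Content-Range header of the form `bytes start-end/total`, and the `Outcome` enum and `classify` function are made-up names.

```python
import re
from enum import Enum
from typing import Mapping


class Outcome(Enum):
    TAIL = "requested tail bytes returned"
    ENTIRE_FILE = "entire wheel returned, range requests supported"
    NO_NEGATIVE_OFFSET = "range requests supported, negative offsets not"
    NO_RANGE_SUPPORT = "range requests not supported"


_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+)")


def classify(status: int, headers: Mapping[str, str]) -> Outcome:
    """Interpret the response to a range request with a negative offset."""
    lowered = {key.lower(): value for key, value in headers.items()}
    accepts_ranges = lowered.get("accept-ranges", "").strip().lower() == "bytes"

    if status == 501:
        # Like pypi.org: ranges work, negative offsets are rejected.
        return Outcome.NO_NEGATIVE_OFFSET
    if status == 200:
        # Devpi server 1 quirk: 200 with the entire wheel although ranges work,
        # so the Accept-Ranges header has to decide.
        return Outcome.ENTIRE_FILE if accepts_ranges else Outcome.NO_RANGE_SUPPORT
    if status == 206:
        match = _CONTENT_RANGE.fullmatch(lowered.get("content-range", "").strip())
        if match is None:
            return Outcome.NO_NEGATIVE_OFFSET  # cannot verify, be conservative
        start, end, total = (int(group) for group in match.groups())
        if start == 0 and end + 1 == total:
            return Outcome.ENTIRE_FILE
        if end + 1 != total:
            # Artifactory quirk: data from the start of the file, not the end.
            return Outcome.NO_NEGATIVE_OFFSET
        return Outcome.TAIL
    return Outcome.NO_RANGE_SUPPORT
```

For instance, `classify(206, {"Content-Range": "bytes 0-100/5000"})` yields `Outcome.NO_NEGATIVE_OFFSET`, where the header value is invented to mimic the Artifactory behavior.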

In case you were wondering at the beginning why we should introduce a config option and not just always use range requests if possible, it's probably clearer now: Although various servers were tested and special cases were implemented, it's not unlikely that there are still servers with unexpected behavior that is not handled well.

github-actions · 2023-12-23T10:51:56Z

Deploy preview for website ready!

✅ Preview
https://website-5yh7exrea-python-poetry.vercel.app

Built with commit 48e0b57.
This pull request is being automatically deployed with vercel-action

Secrus

LGTM, one small nitpick left.

github-actions · 2024-03-03T18:45:52Z

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

radoering force-pushed the lazy-wheel branch from f474a8a to 83706ba Compare December 21, 2023 18:33

This was referenced Dec 21, 2023

perform 1-3 HTTP requests for each wheel using fast-deps pypa/pip#12208

Open

1.8.0 Release #8770

Closed

Secrus reviewed Dec 22, 2023

src/poetry/inspection/lazy_wheel.py (outdated, resolved)

radoering added the impact/docs label (Contains or requires documentation changes) Dec 23, 2023

radoering mentioned this pull request Dec 23, 2023

repositories: add support for PEP 658 #5509

Merged

radoering requested a review from a team December 30, 2023 16:32

radoering force-pushed the lazy-wheel branch from 296761d to 2e797c7 Compare January 4, 2024 20:34

trag1c suggested changes Jan 7, 2024

radoering force-pushed the lazy-wheel branch from ae6b9a5 to b70ac4a Compare January 10, 2024 05:38

trag1c approved these changes Jan 10, 2024

src/poetry/console/commands/config.py (outdated, resolved)

radoering force-pushed the lazy-wheel branch 2 times, most recently from 7e35c2b to be15429 Compare January 13, 2024 06:15

Secrus reviewed Jan 20, 2024

src/poetry/inspection/lazy_wheel.py (outdated, resolved)

radoering added 8 commits January 20, 2024 17:40

feature: add new "lazy-wheel" config option

df62035

If active and a server supports range requests, we do not have to download entire wheels to fetch metadata but can just download the METADATA file from the remote wheel with a couple of range requests.

use packaging.metadata instead of pkginfo

f10e006

refactor: rename from_metadata to from_metadata_directory, rename `from_wheel_metadata` to `from_metadata`

346b7df

apply review feedback

f25832f

remove one level of inheritance: merge LazyRemoteResource and FixedSizeLazyResource to LazyRemoteFile

31b5c06

remove one level of inheritance: merge LazyRemoteFile and LazyHTTPFile to LazyFileOverHTTP

44a2570

consistent order of methods: magic methods, public methods, internal methods

ec2d0f3

use type vars to avoid override only done for type checking

8309ad2

radoering force-pushed the lazy-wheel branch from 46ed988 to 8309ad2 Compare January 20, 2024 16:43

Secrus previously approved these changes Jan 20, 2024

src/poetry/inspection/lazy_wheel.py (outdated, resolved)

remove redundant information

48e0b57

radoering dismissed Secrus’s stale review via 48e0b57 January 20, 2024 18:31

Secrus approved these changes Jan 20, 2024

radoering merged commit 50a7723 into python-poetry:master Jan 20, 2024
33 checks passed

dimbleby mentioned this pull request Jan 30, 2024

Instructions for installing PyTorch #6409

Open

1 task

radoering mentioned this pull request Feb 17, 2024

release: bump version to 1.8.0 #8985

Merged

github-actions bot locked as resolved and limited conversation to collaborators Mar 3, 2024
