Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Avoid zip extract racing condition by using read+write instead extract #5707

Merged
merged 1 commit into from
Jul 7, 2021
Merged

Avoid zip extract racing condition by using read+write instead extract #5707

merged 1 commit into from
Jul 7, 2021

Conversation

gaborbernat
Copy link
Contributor

Extract also creates the folder hierarchy, however we do not need that,
the file itself being extracted to a temporary folder is good enough.
Instead we read the content of the zip and then write it. The write is
not locked but it's OK to update the same file multiple times given the
update operation will not alter the content of the file. By not creating
the folder hierarchy (default via extract) we no longer can run into the
problem of two parallel extracts both trying to create the folder
hierarchy without exists ok flag, and one must fail.

Resolves #5223.

Signed-off-by: Bernát Gábor bgabor8@bloomberg.net

Extract also creates the folder hierarchy, however we do not need that,
the file itself being extracted to a temporary folder is good enough.
Instead we read the content of the zip and then write it. The write is
not locked but it's OK to update the same file multiple times given the
update operation will not alter the content of the file. By not creating
the folder hierarchy (default via extract) we no longer can run into the
problem of two parallel extracts both trying to create the folder
hierarchy without exists ok flag, and one must fail.

Resolves #5223.

Signed-off-by: Bernát Gábor <bgabor8@bloomberg.net>
return extracted_path


@contextlib.contextmanager
def atomic_open(filename):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to handle a further race condition that can happen between opening and writing the file.

@gaborbernat
Copy link
Contributor Author

@sigmavirus24 @nateprewitt would you be able to accept this, so we can fix a long outstanding pip issue? Thanks!

@gaborbernat
Copy link
Contributor Author

@sigmavirus24 @nateprewitt do you have any plan to have a look at this?

@nateprewitt nateprewitt added this to the 2.26.0 milestone May 20, 2021
@sigmavirus24 sigmavirus24 merged commit 2463074 into psf:master Jul 7, 2021
@nateprewitt nateprewitt mentioned this pull request Jul 9, 2021
jonfung-scale pushed a commit to jonfung-scale/jonfung-requests that referenced this pull request Jul 14, 2021
psf#5707)

Extract also creates the folder hierarchy, however we do not need that,
the file itself being extracted to a temporary folder is good enough.
Instead we read the content of the zip and then write it. The write is
not locked but it's OK to update the same file multiple times given the
update operation will not alter the content of the file. By not creating
the folder hierarchy (default via extract) we no longer can run into the
problem of two parallel extracts both trying to create the folder
hierarchy without exists ok flag, and one must fail.

Resolves psf#5223.

Signed-off-by: Bernát Gábor <bgabor8@bloomberg.net>
jonfung-scale pushed a commit to jonfung-scale/jonfung-requests that referenced this pull request Jul 14, 2021
psf#5707)

Extract also creates the folder hierarchy, however we do not need that,
the file itself being extracted to a temporary folder is good enough.
Instead we read the content of the zip and then write it. The write is
not locked but it's OK to update the same file multiple times given the
update operation will not alter the content of the file. By not creating
the folder hierarchy (default via extract) we no longer can run into the
problem of two parallel extracts both trying to create the folder
hierarchy without exists ok flag, and one must fail.

Resolves psf#5223.

Signed-off-by: Bernát Gábor <bgabor8@bloomberg.net>
jonfung-scale pushed a commit to jonfung-scale/jonfung-requests that referenced this pull request Jul 15, 2021
psf#5707)

Extract also creates the folder hierarchy, however we do not need that,
the file itself being extracted to a temporary folder is good enough.
Instead we read the content of the zip and then write it. The write is
not locked but it's OK to update the same file multiple times given the
update operation will not alter the content of the file. By not creating
the folder hierarchy (default via extract) we no longer can run into the
problem of two parallel extracts both trying to create the folder
hierarchy without exists ok flag, and one must fail.

Resolves psf#5223.

Signed-off-by: Bernát Gábor <bgabor8@bloomberg.net>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 5, 2021
@gaborbernat gaborbernat deleted the 5223 branch December 28, 2021 19:23
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Running requests in parallel from a zip archive can create race condition when unpacking the cacerts.pem file
3 participants