Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

glob() recursively matches on * #98

Closed
nh2 opened this issue Jul 9, 2023 · 4 comments · Fixed by #99
Closed

glob() recursively matches on * #98

nh2 opened this issue Jul 9, 2023 · 4 comments · Fixed by #99

Comments

@nh2
Copy link
Contributor

nh2 commented Jul 9, 2023

This a bug made it into the Python 3.12 stdlib.

Issue

When you have a .zip containing a/b/c.txt, and on the top-level do .glob('*.txt'), it produces a match when it shouldn't.

.glob() is accidentally implemented to always behave like the recursive rglob() if a * is used.

Repro

Using the released Python 3.12 zipfile stdlib; the same happens with current zipp:

#! /usr/bin/env bash
set -e

mkdir -p zipp-testdir/a/b/
touch zipp-testdir/a/b/c.txt

rm -f zipp-testdir.zip
zip -r zipp.zip zipp-testdir

python3 --version
python3 -c 'import zipfile; print(list( zipfile.Path(zipfile.ZipFile("zipp.zip")).glob("*.txt") ))'

Prints

[Path('zipp.zip', 'zipp-testdir/a/b/c.txt')]

when it shouldn't.

Reason

This code

zipp/zipp/__init__.py

Lines 370 to 375 in ee6d711

matches = re.compile(fnmatch.translate(pattern)).fullmatch
return (
child
for child in self._descendants()
if matches(str(child.relative_to(self)))
)

uses fnmatch.translate() when the fnmatch docs say

Note that the filename separator ('/' on Unix) is not special to this module. See module glob for pathname expansion

@nh2

This comment was marked as off-topic.

@nh2
Copy link
Contributor Author

nh2 commented Jul 9, 2023

Unfortunately the test suite for .glob() is extremely sparse and does not check such things.

I added a (failing) test for this in PR #99.

@nh2

This comment was marked as off-topic.

@jaraco
Copy link
Owner

jaraco commented Jul 12, 2023

Thanks for the report. This functionality was added in #85 (zipp 3.11) and Python 3.12 (not a regression). The early implementation attempted to re-use the selection functionality from pathlib, which was messy because it required constructing a Path object with all of the platform-specific behaviors that brought. I re-wrote the implementation to have a native implementation, but clearly missed some important cases.

It sure would be nice to have fixes for these. I'm not entirely confident a proper solution can be developed in short order, though I plan to work on it.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants