In V2, creating an array with compressor=None assumes Zstd #2708

Closed
martindurant opened this issue Jan 14, 2025 · 5 comments · Fixed by #2709
Labels
bug Potential issues with the zarr-python library

Comments

martindurant commented Jan 14, 2025

Zarr version

3.0.1.dev9+g168999ce

Numcodecs version

Python Version

3.12

Operating System

linux

Installation

manual from source

Description

In zarr V2, creating an array would assume the default compression (blosc); to get uncompressed data, you would pass compressor=None.

In v3, if writing zarr_version=2 data, compressor=None results in Zstd compression. The kwarg compressor is marked as deprecated in favour of compressors=, but the latter is not allowed in v2.

Steps to reproduce

import fsspec
import fsspec.implementations.asyn_wrapper
import zarr

m = fsspec.filesystem("memory")
store = zarr.storage.FsspecStore(fsspec.implementations.asyn_wrapper.AsyncFileSystemWrapper(m))
g = zarr.open(store, mode="w", zarr_version=2)
g.create_array("name", dtype="i8", shape=(1,), chunks=(1,), compressor=None, overwrite=True).compressor

results in Zstd(level=0).

Additional output

No response

martindurant added the bug label on Jan 14, 2025
d-v-b commented Jan 14, 2025

looks like a bug. None should definitely mean "no compression", not "choose a compressor for me"

martindurant commented

I should have noted that None is the default kwarg value, so we can't tell if it's been passed or not. I believe V2 had **kwargs on these methods, so that you can tell the difference.
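
To make the ambiguity concrete, here is a minimal sketch in plain Python. It is not zarr's code: `create_with_default`, `create_with_kwargs`, and the returned strings are hypothetical stand-ins for illustration only.

```python
def create_with_default(compressor=None):
    # With None as the declared default, an omitted argument and an explicit
    # compressor=None look identical inside the function.
    return "default codec" if compressor is None else compressor


def create_with_kwargs(**kwargs):
    # With **kwargs, the presence of the key records whether the caller
    # actually passed a value.
    if "compressor" not in kwargs:
        return "default codec"
    if kwargs["compressor"] is None:
        return "uncompressed"
    return kwargs["compressor"]
```

In the first form both call styles fall through to the default, which matches the behaviour reported above; the second form can tell them apart, at the cost of a less explicit signature.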

d-v-b commented Jan 14, 2025

we moved away from **kwargs because it hides the true signature of the function. We started using auto as a value to convey "use the default", so I think we could just change the default value of compressor to auto here, and also ensure that we translate compressor=None to the right value.
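
A rough sketch of that idea, under stated assumptions: `resolve_compressor` and `DEFAULT_V2_COMPRESSOR` are hypothetical names, codecs are represented as plain strings, and none of this is the actual zarr-python implementation.

```python
from typing import Optional

# Placeholder for whatever codec the library would choose by default.
DEFAULT_V2_COMPRESSOR = "default-codec"


def resolve_compressor(compressor: Optional[str] = "auto") -> Optional[str]:
    # "auto" is the sentinel meaning "use the default"; it stays visible in
    # the signature, unlike a value hidden behind **kwargs.
    if compressor == "auto":
        return DEFAULT_V2_COMPRESSOR
    # An explicit None passes through unchanged and means "no compression".
    return compressor
```

With a string sentinel as the default, `None` is free to carry its own meaning, and the function signature still documents every accepted value.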

martindurant commented

OK, that makes sense, especially if V3 will tend to use compressors= (with the s), which indeed has default value "auto".

Should I do this one?

d-v-b commented Jan 14, 2025

if you like, that would be great!
