Metadata comparison fails for NaN fill_values #2929

Open
TomNicholas opened this issue Mar 24, 2025 · 0 comments
Labels
bug Potential issues with the zarr-python library

Zarr version

main

Numcodecs version

n/a

Python Version

3.12

Operating System

linux

Installation

pip editable

Description

Two Metadata objects with identical attributes compare as not equal if both have NaN as their fill_value. This is because the __eq__ check recurses into the fields until it reaches e.g. the np.float32(nan) value, and NaN does not compare equal to itself:

In [4]: bool(np.float32('nan') == np.float32('nan'))
Out[4]: False

(See https://stackoverflow.com/a/10059796 for why numpy NaNs behave like this.)
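The same behaviour can be reproduced without zarr or numpy. The sketch below uses a hypothetical `ToyMetadata` dataclass (not part of zarr-python) and plain Python floats, assuming only that the auto-generated dataclass `__eq__` ends up comparing the fill values with `==`:

```python
import math
from dataclasses import dataclass


# Hypothetical stand-in for a metadata class; not zarr's ArrayV3Metadata.
@dataclass(frozen=True)
class ToyMetadata:
    shape: tuple[int, ...]
    fill_value: float


# Two separately constructed NaN objects, as in the report.
a = ToyMetadata(shape=(2,), fill_value=float("nan"))
b = ToyMetadata(shape=(2,), fill_value=float("nan"))

# IEEE 754: NaN is not equal to anything, including another NaN.
nan_equals_nan = float("nan") == float("nan")

# The generated __eq__ compares the fields with ==, so it inherits that result.
toy_metadata_equal = a == b

print(nan_equals_nan, toy_metadata_equal)
```

Every other field matches, so the NaN fill value alone decides the outcome of the comparison.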

The fix is to compare two Metadata objects with a dedicated __eq__ implementation that handles NaN fill values, rather than trusting the __eq__ method that Python's dataclasses module generates automatically.
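One possible shape for such a dedicated comparison is sketched below. This is only an illustration under stated assumptions, not the zarr-python implementation: `ToyMetadata` and `values_equal` are hypothetical names, and the NaN check relies on numpy floating scalars being registered as `numbers.Real` (plain Python floats are).

```python
import numbers
from dataclasses import dataclass, fields
from typing import Any


def _is_nan(value: Any) -> bool:
    # NaN is the only real number that is not equal to itself.
    return isinstance(value, numbers.Real) and bool(value != value)


def values_equal(a: Any, b: Any) -> bool:
    """Compare two field values, treating NaN as equal to NaN."""
    if _is_nan(a) and _is_nan(b):
        return True
    return bool(a == b)


# eq=False stops the dataclass decorator from generating its own __eq__.
# Hashing is left out of this sketch, so instances are unhashable.
@dataclass(eq=False)
class ToyMetadata:
    shape: tuple[int, ...]
    fill_value: Any

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ToyMetadata):
            return NotImplemented
        return all(
            values_equal(getattr(self, f.name), getattr(other, f.name))
            for f in fields(self)
        )


m1 = ToyMetadata(shape=(2,), fill_value=float("nan"))
m2 = ToyMetadata(shape=(2,), fill_value=float("nan"))
m3 = ToyMetadata(shape=(2,), fill_value=0.0)

print(m1 == m2, m1 == m3)
```

A real fix would also need to decide how to treat NaN values nested inside containers and whether two NaNs with different bit patterns should count as equal; this sketch only handles scalar fields.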

xref zarr-developers/VirtualiZarr#501

Steps to reproduce

In [12]: metadata1 = ArrayV3Metadata(
    ...:     shape=(2,),
    ...:     data_type=np.float32,
    ...:     chunk_grid={
    ...:         "name": "regular",
    ...:         "configuration": {"chunk_shape": (2,)},
    ...:     },
    ...:     chunk_key_encoding={"name": "default"},
    ...:     fill_value=np.float32('nan'),
    ...:     codecs=({'name': 'bytes', 'configuration': {'endian': 'little'}},),
    ...:     attributes={},
    ...:     dimension_names=None,
    ...:     storage_transformers=None,
    ...: )

In [13]: metadata2 = ArrayV3Metadata(
    ...:     shape=(2,),
    ...:     data_type=np.float32,
    ...:     chunk_grid={
    ...:         "name": "regular",
    ...:         "configuration": {"chunk_shape": (2,)},
    ...:     },
    ...:     chunk_key_encoding={"name": "default"},
    ...:     fill_value=np.float32('nan'),
    ...:     codecs=({'name': 'bytes', 'configuration': {'endian': 'little'}},),
    ...:     attributes={},
    ...:     dimension_names=None,
    ...:     storage_transformers=None,
    ...: )

In [14]: bool(metadata1 == metadata2)
Out[14]: False

Additional output

No response
