Re-evaluate the default normalize_keys for FSStore #739

Closed
joshmoore opened this issue May 12, 2021 · 4 comments · Fixed by #755

Comments

@joshmoore
Member

For nested filesets using FSStore, it's currently a requirement to set normalize_keys to False on construction. @shoyer pointed out in #546 (comment) that False is the default for the other stores.

cc: @will-moore @martindurant

@joshmoore joshmoore mentioned this issue May 12, 2021
@martindurant
Member

As I said in the linked PR, I'm happy to have the default change for the sake of consistency.

@jakirkham
Member

cc @shoyer

@d70-t
Contributor

d70-t commented May 19, 2021

I just stumbled across this behavior as well.
To create datasets on IPFS, I currently create a dataset locally first and then add it to IPFS. This means that the datasets are written without normalization and read back with normalization.
While I might have to think a bit about how the need for normalization affects my use case, this difference in behavior definitely surprised me.

One thing I found particularly interesting relates to consolidated metadata. In my consolidated metadata, the variables showed up with capital letters, and so they also showed up with capital letters in the listing of my dataset (via xarray). But when I accessed a variable, all values were read back as nan, because none of the chunk files could be reached due to path normalization on access. I wonder whether the metadata of a dataset should (could?) tell the reader that the dataset must be accessed without normalization, but maybe that's not the right way of thinking about the issue either.
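
The mismatch described above can be reproduced with a small stand-in. This is only a sketch: `ToyStore` is a hypothetical dict-backed class, not zarr's FSStore, and it assumes that key normalization amounts to lowercasing the key, which is what the capital-letter symptom suggests. The key `Temperature/0.0` is likewise made up for illustration.

```python
class ToyStore:
    """Hypothetical dict-backed store used for illustration; not zarr's FSStore."""

    def __init__(self, backing, normalize_keys=False):
        self.backing = backing
        self.normalize_keys = normalize_keys

    def _key(self, key):
        # Assumption: normalization means lowercasing the key.
        return key.lower() if self.normalize_keys else key

    def __setitem__(self, key, value):
        self.backing[self._key(key)] = value

    def __getitem__(self, key):
        return self.backing[self._key(key)]

    def __contains__(self, key):
        return self._key(key) in self.backing


# Stands in for the files that end up on disk (or on IPFS).
files = {}

# Written without normalization: the key keeps its capital letter.
writer = ToyStore(files, normalize_keys=False)
writer["Temperature/0.0"] = b"chunk-bytes"

# Read back with normalization: the lookup uses "temperature/0.0" and misses.
reader = ToyStore(files, normalize_keys=True)
found_by_normalizing_reader = "Temperature/0.0" in reader

# Read back without normalization: the same key is found.
found_by_plain_reader = "Temperature/0.0" in ToyStore(files)
```

In this sketch the normalizing reader never finds the chunk even though it exists under its original name. In zarr, a chunk that cannot be found is filled with the array's fill value, which would be consistent with the nan values seen here.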

@joshmoore
Member Author

Opened #755, let's see how the tests do.
