Skip to content

The DCPL branch #140

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 103 commits into from
Apr 7, 2021
Merged

The DCPL branch #140

merged 103 commits into from
Apr 7, 2021

Conversation

mulimoen
Copy link
Collaborator

@mulimoen mulimoen commented Feb 25, 2021

This is a continued effort on the dcpl-branch of @aldanor. This adds a lot more functionality and makes the API much nicer to use for selections.

See mulimoen#2 for some earlier comments and discussions regarding this branch.

Fixes #45
Fixes #88
Fixes #126

licenses/lzf.txt Outdated
@@ -0,0 +1,33 @@
Copyright Notice and Statement for LZF filter
Copy link
Owner

@aldanor aldanor Apr 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder now if we have to include licenses for lzf/blosc/h5py: because we no longer include C sources for lzf/blosc (which, I think, was the case early on) - as we rely on blosc-sys and lzf-sys. So I think the entire licenses/ can be now probably removed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the rework of automatic chunking also means we now no longer have any direct code from h5py. We also comply with the hdf5 license by including this in the hdf5-src archive. We should be safe to remove licenses (but I am not a lawyer).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think they are safe to remove now.

}
}

macro_rules! impl_builder_stuff {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder - are the generated docs ok with these macros?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs are replicated on all relevant types, this is duplication, but it all shows as a separate impl block. I've added a small comment on the impl blocks to describe how they are all common between dataset-builders

@aldanor
Copy link
Owner

aldanor commented Apr 6, 2021

@mulimoen I've went through the code once again and (the lack of docs and docstrings aside, and also the complete lack of tests aside) it's pretty good to go. I left a few minor comments here and there.

(I really think we should write some docs and tests after this is released and is considered stable api-wise.)

@mulimoen mulimoen force-pushed the feature/dcpl-2021 branch from 949a274 to 092bab3 Compare April 7, 2021 15:16
@mulimoen mulimoen mentioned this pull request Apr 7, 2021
@mulimoen
Copy link
Collaborator Author

mulimoen commented Apr 7, 2021

I've opened some new GH issues for some of the remaining tasks. This PR is enourmous as it is, and the other issues can be merged and reviewed separately.

@aldanor
Copy link
Owner

aldanor commented Apr 7, 2021

Good point re: separate tickets (array trait, docs, avoiding panics), they can be handled separately. Maybe also a ticket for adding full tests for dataset and all related builders etc? A lot of the code in this PR is fully covered by tests (independent parts like extents, selection, dyn value, etc). What needs to be thoroughly tested is all of the new builders and the new Dataset - also lzf/blosc compression and all the new features.

@aldanor
Copy link
Owner

aldanor commented Apr 7, 2021

At this point I'm somewhat scared to press the green button because this PR is so big my browser sometimes hangs when viewing diffs :) Great work @mulimoen in resurrecting it and finishing all the unfinished parts!

(Before the release, as I've said, I think we should consider merging at least the ownership/IDs PR)

@aldanor
Copy link
Owner

aldanor commented Apr 7, 2021

(Added a note that this fixes #47 as well)

@mulimoen
Copy link
Collaborator Author

mulimoen commented Apr 7, 2021

(Added a note that this fixes #47 as well)

Only halfway, one would need to dispatch based on some fixed sizes.

@aldanor
Copy link
Owner

aldanor commented Apr 7, 2021

(Added a note that this fixes #47 as well)

Only halfway, one would need to dispatch based on some fixed sizes.

Hm? The question was to create a dataset with runtime-known string size. This is now doable via empty_as() and with_data_as()?

@mulimoen
Copy link
Collaborator Author

mulimoen commented Apr 7, 2021

I think what @pmarks is trying is creating the type from an n that is only known at runtime (max of all strings), but one still needs to define this type at compile time to use at runtime. The workaround in #47 (comment) is using a giant match to work around this by letting each arm have the compile time fixed value. I'll remove the issue so it's not closed on merge.

@mulimoen mulimoen merged commit 63bb0b1 into aldanor:master Apr 7, 2021
@mulimoen
Copy link
Collaborator Author

mulimoen commented Apr 7, 2021

Thank you very much for reviewing and finishing this @aldanor! Hopefully future PRs won't be as massive as this one!

@aldanor
Copy link
Owner

aldanor commented Apr 7, 2021

Yea - well we solved it halfway by allowing runtime descriptors, that's already something that wasn't previously possible, so that's something.

Also, that 'giant match' can now be implemented via const generics.

Finally, with strings... I have a ticket with a hacky solution for Rust types like strings - maybe we'll get to that at some point as well.

@aldanor
Copy link
Owner

aldanor commented Apr 7, 2021

Thank you very much for reviewing and finishing this @aldanor! Hopefully future PRs won't be as massive as this one!

Yea, let's try to keep it simple from now on :) I may open a few PRs in the nearby future with some docs updates.

@mulimoen mulimoen deleted the feature/dcpl-2021 branch April 8, 2021 16:53
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
3 participants