-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
map raster #588
map raster #588
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #588 +/- ##
==========================================
+ Coverage 91.98% 92.00% +0.01%
==========================================
Files 43 44 +1
Lines 6560 6613 +53
==========================================
+ Hits 6034 6084 +50
- Misses 526 529 +3
|
Before the full review, two quick questions.
So, there is one test ( Update. |
What are your thought on this? I would either go for 1 or 2, and as explained above, I would add an helper function to avoid the need to go for 3. Update: I have implemented option 2. Happy to discuss. |
Another minor comment. I will remove the calls to |
I have reviewed the code, fixed the pre-commits, simplified the APIs while maintaining all the use cases of the tests. @giovp can you also review please? Some minor comments:
|
I addressed all the points that I left open; some tests are failing in the CI but not locally, I think they are due to the new support |
I have added the support for |
Think option 2 is indeed best as a user can still pull out a particular scale and use apply should the user want a different level. Also with the output the user can still create a multiscale again. |
I think this PR will require a notebook tutorial |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see no blockers from my side. Great work!
Yes, indeed, while I see value in being able to provide |
Regarding this part of the code, I have three comments:
If we want to keep the
I am happy to make these changes if you would agree |
I've added a unit test for altering dimension (c,z,y,x) -> (z,y,x), as it was still commented as a todo |
Thanks for the comments, I agree on all the changes, if you could make them it would be great thanks. |
Ok, will do! |
Done. I also changed the name of the parameter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks really great @ArneDefauw , some general questions:
this seems a much more slimmed down version of #407 , in particular, it seems that no calls to dask image are really ever done, but that it just target use case of having became functions to be applied to raster data.
- Does this mean that there will be a separate API/method for calls to dask.image?
- For things like
region_props
, where both a label and an image layer are needed, would there also be a separate API/method or will be added to this function in the future?
Considering answers to this question, I wonder if the name of the function should be renamed?
These are all genuine questions, I think this is definitely ready to get in as is, and it's great, I just wanted to raise once more about the roadmap/use case of similar, yet not exactly the case, functionalities and whether they should/will be added to spatialdata.
f"Depth {depth} is provided for {len(depth)} dimensions. " | ||
f"Please (only) provide depth for {arr.ndim} dimensions." | ||
) | ||
kwargs["depth"] = coerce_depth(arr.ndim, depth) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while corce_depth
is a public function, it is not documented, I wonder if it's better to just be explicit about what to pass with depth (and basically raise error as above) instead . also again here, kwargs
is used to store intermediate arguments, that might already exists. I am not sure this is the best way to do it, could it also just save the result as variables and being passed explicitly later, if the arguments are not already in kwargs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If depth
would be passed in kwargs
, it will appear as an argument, not in kwargs
, so in that respect I do not see a problem.
Instead of putting depth
and chunks
in kwargs
, we could indeed also do the following:
map_func(func, arr, chunks=chunks, depth=depth, **fn_kwargs, **kwargs, dtype=arr.dtype)
We could also remove chunks
and depth
from the list of parameters altogether, and simply let users pass them via kwargs. In that scenario, I would also remove these sanity checks on depth
and delegate them to da.map_overlap
.
Re coerce_depth
. We have three options:
- Remove coerce_depth from our code, and let users pass whatever map_overlap can work with, i.e. we delegate conversion to a dict to map_overlap (which will also call coerce_depth).
- Remove coerce_depth, and only accept e.g. tuples; and add some sanity checks. But then we are a bit too restrictive I think, escpecially for users that are used to work with
map_overlap
and prefer to pass e.g. a dict. - Keep the code as is. Having coerce_depth in our code, makes it a bit more explicit as what is exactly passed to map_overlap, which could help debugging later on.
I have a preferance for 3, but not too strong of an opinion on it
c_coords = None | ||
if transformations is None: | ||
d = get_transformation(data, get_all=True) | ||
assert isinstance(d, dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when would it not be a dict? I thought it should always be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When get_all=False
(default), d
is a BaseTransformation
. The assert
is to avoid mypy
errors.
Co-authored-by: Giovanni Palla <25887487+giovp@users.noreply.github.com>
Co-authored-by: Giovanni Palla <25887487+giovp@users.noreply.github.com>
for more information, see https://pre-commit.ci
Functionality of this PR is pretty similar as #407. The only difference is the removal of support for being able to process different channels/z-stacks with different Callable/fn_kwargs, as it introduces some complexity that is not really necessary, I think the user can simply write a for loop if different Callable/fn_kwargs need to be used, and then concat the DataArray as required. Re API/method for calls to dask.image. I would be happy to implement this, if this would be considered an added value. With the
Re Extracting Anyway, it would be good to have a discussion about the roadmap with respect to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks all good to me, thank you @ArneDefauw ! I've updated the docstrings striving to clarify but would ask you and @LucaMarconato to give it another read cause maybe there is still room for improvement. If they look good, we just need release note and then LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks all good to me, thank you @ArneDefauw ! I've updated the docstrings striving to clarify but would ask you and @LucaMarconato to give it another read cause maybe there is still room for improvement. If they look good, we just need release note and then LGTM!
Looks great! I will update the changelog and the docs and then merge! 🚀 |
looks good! Thanks @giovp only one comment:
If going from In practice, code should work for that use case I think, but need to test, I think we only need to change:
to
|
I enabled this with the line Therefore the lines |
But the model is obtained from the input data So this check is on input data: |
You are right, the check on |
Yes, will do! |
updated 'apply' function for raster data.
Following discussion #407, I decided to remove the 'combine_c', 'combine_z' parameters, and also support for different Callable's/fn_kwargs per channel/z-stack. I think it overcomplicates the code; the user can simply use a for loop if different Callables/fn_kwargs should be passed to different channels/z-stacks.
see unit tests for examples:
https://github.com/ArneDefauw/spatialdata/blob/b3e4487cbdfe34ccf38b5a311de1eb2a4b5427c4/tests/core/operations/test_map.py#L31
Signature is now as follows
I can not see how we can remove
scale_factors
orc_coords
from the signature without loosing some functionality.In order for
map_raster
to be able to alter dimension (e.g. (c,z,y,x)->(c,y,x) ), we would need also need to passdims
I think, see https://github.com/ArneDefauw/spatialdata/blob/b3e4487cbdfe34ccf38b5a311de1eb2a4b5427c4/tests/core/operations/test_map.py#L149