Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[BUG] Specviz: Fix app.py so subset can apply to multiple data #1921

Closed
javerbukh opened this issue Dec 14, 2022 · 7 comments · Fixed by #2087
Closed

[BUG] Specviz: Fix app.py so subset can apply to multiple data #1921

javerbukh opened this issue Dec 14, 2022 · 7 comments · Fixed by #2087
Labels
bug Something isn't working specviz

Comments

@javerbukh
Copy link
Contributor

javerbukh commented Dec 14, 2022

Describe the bug
TLDR: Core methods in app.py assume only one data is loaded at a time, or a subset is applied to only one data at a time.

There are a few issues that this ticket will be pointing out. I will try to link to the code and provide examples where I can

app.get_data_from_viewer() returns all data that is loaded into a particular viewer. The only time this is not the case is when a subset is applied to multiple data. In that case, only the last data the subset applies to (going by layer order) will be returned with data[‘Subset 1’].

data[label] = layer_data

The next issue is that app.get_subset_from_viewer() uses the data that the subset is applied to (line 680 above) to calculate the bounds of that subset

jdaviz/jdaviz/app.py

Lines 775 to 776 in 46f0ae2

for key, value in data.items():
if isinstance(value, Subset):

temp_data = self.get_data_from_viewer(viewer_reference, value.label)

jdaviz/jdaviz/app.py

Lines 826 to 828 in 46f0ae2

subregions_in_subset = _get_all_subregions(
np.where(~temp_data.mask)[0], # noqa
temp_data.spectral_axis)

This means that if Subset 1 actually applies to multiple data, only the bounds that apply to the latest data will be used. I attempted to fix that last point in this PR

https://github.com/spacetelescope/jdaviz/pull/1886/files

But then we realized that these lines

elif label and dimensions not in self.all_subsets_with_subregions[label]["dimensions"]:
self.all_subsets_with_subregions[label]["dimensions"].append(dimensions)

Append subregions to a subset, even if the user is just repositioning the subset. In order to fix this, we would have to include all the glue subset operators

https://jdaviz.readthedocs.io/en/latest/imviz/displayimages.html#defining-spatial-regions

shown at the bottom of those docs so we could arrange the subregions correctly.

We have discussed the possible solution of using a SubsetCollection type object to store all the data a subset is applied to, which would help with this particular problem. We have reached out to @astrofrog if there is something like that that already exists but it may be something we have to create in jdaviz.

Example of a Subset applied to multiple spectra:
screen_shot_2022-11-22_at_3 23 38_pm

@javerbukh javerbukh added bug Something isn't working specviz labels Dec 14, 2022
@pllim
Copy link
Contributor

pllim commented Dec 14, 2022

I might be in the wrong rabbit hole but to some degree, I was able to traverse the tree after I added the following method in CompositeSubsetState over at glue core. Would save us from reverse-engineering the __str__ outputs but as it is posted, the output requires some brain cells to parse.

https://github.com/glue-viz/glue/blob/583699bb13514473371e5059a6386e337677a92f/glue/core/subset.py#L1065

--- a/glue/core/subset.py
+++ b/glue/core/subset.py
@@ -1096,6 +1096,11 @@ class CompositeSubsetState(SubsetState):
         sym = OPSYM.get(self.op, self.op)
         return "(%s %s %s)" % (self.state1, sym, self.state2)

+    def traverse(self):
+        return (self.op,
+                self.state1 if not hasattr(self.state1, 'traverse') else self.state1.traverse(),
+                self.state2 if not hasattr(self.state2, 'traverse') else self.state2.traverse())
+

With the patch above at glue-core, you would get this in Specviz. Here, the composite subset is the second one I created and I used "AND" the first pass with 2 discontinuous regions, and then after that I got back and add XOR to the first sub-region. I don't know why each operator traversal returns in the format of (op, some_state, (op, left, right)) as I was expecting something slightly more straightforward.

>>> specviz.app.data_collection.subset_groups[1].subset_state.traverse()
(<function _operator.xor(a, b, /)>,
 <glue.core.subset.RangeSubsetState at 0x7f3ba140e440>,
 (<function _operator.xor(a, b, /)>,
  <glue.core.subset.RangeSubsetState at 0x7f3ba140ed10>,
  (<function _operator.or_(a, b, /)>,
   <glue.core.subset.RangeSubsetState at 0x7f3ba140e5f0>,
   (<function _operator.or_(a, b, /)>,
    <glue.core.subset.RangeSubsetState at 0x7f3ba14d58a0>,
    <glue.core.subset.RangeSubsetState at 0x7f3ba14d4d30>))))

Even without this patch, you can kinda see the traversal pattern by manually accessing the following:

>>> specviz.app.data_collection.subset_groups[1].subset_state.op
>>> specviz.app.data_collection.subset_groups[1].subset_state.state1
>>> specviz.app.data_collection.subset_groups[1].subset_state.state2
>>> specviz.app.data_collection.subset_groups[1].subset_state.state2.op  # and so on

@kecnry
Copy link
Member

kecnry commented Dec 21, 2022

Also affects the subset component's selected_subset_mask property (which probably should be refactored into a method which also takes the spectrum1d as input, similar to selected_min_max and is part of the issues causing #1910)

subset = self.app.get_data_from_viewer(
viewer_ref, data_label=self.selected
)

@astrofrog
Copy link
Collaborator

as I was expecting something slightly more straightforward

Just to check, what would you have liked to see? Currently you are seeing the result of the very generic way of combining subsets which might not always be e.g. spatial subsets but you could have subsets defined also by e.g. a contour/threshold or even more wild subsets, an each subset in the expression might depend on different data attributes. What I mean is that not all subsets will always be geometrical subsets in the plane of the image being viewed, so not always possible to write in a simpler way. Having said that maybe we could investigate having a way for viewer classes to optionally be able to simplify subset definitions on the fly when they can be simplified.

@pllim
Copy link
Contributor

pllim commented Jan 24, 2023

@astrofrog , did the tag-up just now answer your question above?

@astrofrog
Copy link
Collaborator

Yep!

@javerbukh
Copy link
Contributor Author

Hi @astrofrog here is an example of what we were talking about with the traverse method: glue-viz/glue#2354

@javerbukh
Copy link
Contributor Author

Here are the notes from the meeting that was had for this topic on 01/18/2023:

How do we want subset information returned?

  • as before (applied to each data and returned as a masked spectrum1d)
    • apply subsets keyword in get_data takes list of strings that are subset names and apply those subsets to the data
  • as an astropy region showing all bounds of subset, even if compound?
  • as a glue subset object (not sure how beneficial this would be)
  • something else?
    • define mask component for spectrum
    • do not want spectrum attaached to one particular mask
    • return as just a mask so that a user can use them as they wish in the notebook +
    • remove subsets from this method completely +
    • remove get_viewer from get subsets method (cubeviz case is special since a subset is spatial and also applied in spectrum-viewer as a collapsed function)
    • implement get_data_... at base helper class level so it can be public api
    • force get_data to require a data_label so data is returned along with associated subsets. This avoids returning as a dictionary where keys can be overwritten
    • get_data needs to know if data needs to be collapsed or not (use cls parameter, or overload cubeviz helper to have extra keyword arguement)
  • method for accessing data labels present/visible in jdaviz viewers

Path forward

We ended up deciding to do "something else", specifically implement a get_data method in the base helper class that returns data as type cls with subset subset_to_apply applied. This work is happening in #1984 . Implementing get_subsets will happen in a separate PR. Replacing get_data_from_viewer calls with get_data is also a separate effort.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug Something isn't working specviz
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants