Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Specviz: load a SpectrumList and combine everything #1014

Merged
merged 9 commits into from
Mar 1, 2022
Merged

Specviz: load a SpectrumList and combine everything #1014

merged 9 commits into from
Mar 1, 2022

Conversation

ibusko
Copy link
Contributor

@ibusko ibusko commented Dec 22, 2021

Description

This pull request is to address a potential issue in PR #891. It was suggested that one of the possible default behaviors upon data ingestion is to combine all elements in the input SpectrumList into a single Spectrum1D instance. The current default behavior (in PR #891) is to simultaneously select all separate data elements and display then as separate spectra in a single viewer. It was found that code downstream (e.g. plug-ins) may fail when driven from such a viewer. Combining everything in a single Spectrum1D instance bypasses these issues.

A possible drawback is that one additional element is added to the data collection, which can be confusing to the user.

The code in this PR is actually a combination of the code already in PR #891, with new code that implements the spectrum combination. The code from PR #891 is commented out, so the new behavior is what shows up when installing this branch. Which behavior will ultimately be chosen depends on further decision.

There is a new concepts notebook that exercises the new code.

image

(this fit is scientifically meaningless and just made to show that we can fit a model over several spectrum elements in the data collection)

Checklist for package maintainer(s)

This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.

  • Are two approvals required? Branch protection rule does not check for the second approval. If a second approval is not necessary, please apply the trivial label.
  • Do the proposed changes actually accomplish desired goals? Also manually run the affected example notebooks, if necessary.
  • Do the proposed changes follow the STScI Style Guides?
  • Are tests added/updated as required? If so, do they follow the STScI Style Guides?
  • Are docs added/updated as required? If so, do they follow the STScI Style Guides?
  • Did the CI pass? If not, are the failures related?
  • Is a change log needed? If yes, is it added to CHANGES.rst?
  • Is a milestone set? Milestone is only currently required for PRs related to Imviz MVP.
  • After merge, any internal documentations need updating (e.g., JIRA, Innerspace)?

@github-actions github-actions bot added documentation Explanation of code and concepts specviz labels Dec 22, 2021
@ibusko ibusko marked this pull request as draft December 22, 2021 19:15
@pllim pllim added this to the 2.2 milestone Dec 22, 2021
@codecov
Copy link

codecov bot commented Dec 22, 2021

Codecov Report

Attention: Patch coverage is 94.44444% with 2 lines in your changes missing coverage. Please review.

Project coverage is 76.62%. Comparing base (2010a0b) to head (90dcf68).
Report is 2548 commits behind head on main.

Files Patch % Lines
jdaviz/configs/specviz/plugins/parsers.py 94.44% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1014      +/-   ##
==========================================
+ Coverage   76.51%   76.62%   +0.11%     
==========================================
  Files          82       82              
  Lines        6463     6495      +32     
==========================================
+ Hits         4945     4977      +32     
  Misses       1518     1518              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rosteen rosteen modified the milestones: 2.2, 2.3 Dec 23, 2021
@ibusko ibusko marked this pull request as ready for review February 14, 2022 20:52
@ibusko ibusko requested review from kecnry, rosteen, orifox, PatrickOgle and eteq and removed request for rosteen and kecnry February 14, 2022 20:57
@pllim
Copy link
Contributor

pllim commented Feb 23, 2022

@orifox will have a look and provide his feedback.

@orifox
Copy link
Contributor

orifox commented Feb 23, 2022

This looks great! @ibusko what is the algorithm you use to merge the spectra? The MRS notebook has some sample code already for how they did it manually, so it would be good to have consistency. (Note, try to make this a clear function, because in the future the merging algorithm may change.)

@orifox
Copy link
Contributor

orifox commented Feb 23, 2022

In any case, I leave for vacation tomorrow, so I approve.

@pllim
Copy link
Contributor

pllim commented Feb 23, 2022

@ibusko , does this mean we can delete the commented code and clean this PR up?

@ibusko
Copy link
Contributor Author

ibusko commented Feb 23, 2022

@pllim yes, I think that clears the issue. I deleted the commented-out code.

As for cleaning the extra commits in the PR, again, I can't see then on my local repo. I get just the commits that are meaningful and I want to keep them in there. I am doing a git rebase against upstream/main. I wonder if github is clever enough to know that these commits came from updating the branch, that is, in the sense upstream/main -> branch. If so, then their tags should be already present in upstream/main and won't be duplicated. I am afraid my git knowledge doesn't go that deep :-(

@ibusko
Copy link
Contributor Author

ibusko commented Feb 23, 2022

@orifox the algorithm came from this notebook: https://github.com/spacetelescope/dat_pyinthesky/blob/main/jdat_notebooks/MRS_Mstar_analysis/JWST_Mstar_dataAnalysis_analysis.ipynb

There is a comment in the code with this link.

@pllim
Copy link
Contributor

pllim commented Feb 23, 2022

@ibusko , would you mind if I try to rebase for you?

@orifox
Copy link
Contributor

orifox commented Feb 23, 2022

Great! Thanks @ibusko. I see the algorithm. Perfect.

@ibusko
Copy link
Contributor Author

ibusko commented Feb 23, 2022

@plim please, yes, go ahead and try to rebase it.

Copy link
Contributor

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am approving by proxy since @orifox is happy with this. Another dev approval and this can be merged. Thanks!

@ibusko

This comment was marked as resolved.

@pllim

This comment was marked as resolved.

Copy link
Member

@kecnry kecnry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept notebook works fine on my end. This seems to also load each file as a separate layer, but they're disabled by default - is that by design? Do we want any top-level switches that control whether they're combined and which is enabled by default? (Or should we consider that out-of-scope and for future improvements?)

Copy link
Collaborator

@havok2063 havok2063 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a few comments, and I just looked at the concept notebook. The example files, reduced.tar.gz, look like some custom files, all named combine_dithers_all_exposures_xxxx.fits. Can someone provide an example notebook or code of how to use this to load actual JWST data files, i.e. named pipeline output? Are these level 2 or level 3 files? Which JWST instruments does it support? I have lingering questions on how MAST can use this functionality. I think the documentation should be updated with the answers to these kinds of questions and reflect how downstream users should be using this functionality.

@@ -11,6 +11,8 @@ New Features
- The line analysis plugin now includes logic to account for the background
continuum. [#1060]

- Specviz can load a ``SpectrumList`` and combine all its elements into a single spectrum. [#1014]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update the Sphinx documentation to include some fleshed out examples of the different types of input into load_spectrum and how to load data?

@@ -40,6 +44,8 @@ def specviz_spectrum1d_parser(app, data, data_label=None, format=None, show_in_v
data_label = [data_label]
elif isinstance(data, SpectrumList):
pass
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changelog says Specviz can now accept SpectrumList objects but there is a pass here? So does it support SpectrumList or just a list of files, which is what line 47-49 looks like.

Copy link
Collaborator

@rosteen rosteen Feb 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question, I'll see if it works to just drop in a SpectrumList to fill in this case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loading a SpectrumList does work, I assume the pass here is just a reminder that SpectrumList is an acceptable input, but one that we don't have to do any special processing for.

Copy link
Contributor Author

@ibusko ibusko Feb 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be my fault. The changelog entry should read instead "Specviz can load a list of spectra and combine...". The requirement was (at that time) to read that, and not a SpectrumList instance. I think because SpectrumList was still in some sort of unfinished state. Or, if memory serves well, perhaps because it was already capable of handling a SpectrumList instance. The PR is about reading a bunch of x1d files in a single swoop, given the path to the directory. And, of course, combine all in a single Spectrum1D that won't crash the model fitter (the original motivation for this work).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the regular parser can already handle SpectrumList then in that case, it might be good to add a code comment at line 42 mentioning why there's a pass there, for posterity. The scope of the PR here is fine. I think there should be some more documentation to make it clear for the user what are acceptable inputs, formats, usage, etc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed a few updates that I think address your comments, mind taking a look? The documentation now says you can load a SpectrumList but that giving it a directory name will probably work for JWST files and little else.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the added section is good. I just had a few (optional) suggestions / questions on clarification of a few things. Those could be potential points of confusion for the user.

@@ -50,6 +56,8 @@ def specviz_spectrum1d_parser(app, data, data_label=None, format=None, show_in_v
except IORegistryError:
# Multi-extension files may throw a registry error
data = SpectrumList.read(str(path), format=format)
elif path.is_dir():
data = SpectrumList.read(str(path), format=format)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be some checks here to make sure the files in the directory are valid? What does SpectrumList.read do here when a directory of junk files are passed in?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what kind of directory should this expect? It looks like a flattened directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checks are part of the SpectrumList.read functionality. I can't find in the code how the path is expanded (glob? @rosteen ), but I think typically it will be a flattened directory. Typical use would be to plot together the several x1d pieces associated to a MIRI LRS spectral data set, as done in the notebook.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Judging by the concept notebook and the testing I've done, it looks like this expects a flat directory (or anyway just reads any matching files in the directory level given), in the case of the concept notebook reading in and combining only the x1d files there (I believe the MIRI MRS specutils reader is doing the work, since iirc that was the specific case for which this was requested).

I think there is a check needed - I tried this on a directory with two SDSS spectra in it, and SpectrumList.read() returns an empty list in that case since the SDSS reader isn't set up to handle the directory case with SpectrumList. I can add a warning in that case that the directory-read functionality is currently JWST specific, and add a bit of documentation about what will happen if you load in a pre-created SpectrumList.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll note that the specutils docs on this are pretty sparse, and mostly consist of fairly opaque example code from the JWST readers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this works with level 2 JWST x1d files, but my guess is the loaders only work on level 3 data. I never checked that, just always assumed it only works on L3. I'll also add that this functionality will be useful for MAST as well, but files on disk for MAST have a structured directory system rather than flattened. That can be a future work item.

@ibusko
Copy link
Contributor Author

ibusko commented Feb 28, 2022

@havok2063 the example notebook uses MIRI LRS Level 3 files, if I recall it right.

@rosteen
Copy link
Collaborator

rosteen commented Feb 28, 2022

@ibusko Is it ok for me to push up a couple minor changes + documentation?

@ibusko
Copy link
Contributor Author

ibusko commented Feb 28, 2022

@rosteen please, go ahead and do it. I am out of Indigo tomorrow on, so you'll be doing me a favor!

Comment on lines 71 to 72
In addition to loading single spectra as above, in some cases it may be useful to load multiple related
spectra at once into the Jdaviz application. The :meth:`jdaviz.configs.specviz.helper.Specviz.load_data` accepts
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should be more explicit on what "related spectra" means here. From the concept notebook, it looks like it means 1d spectra from the same observation on a given instrument. But would this work if a user gave a list of spectra of the same target from different instruments, or from different wavelength regimes?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should work for the cases you listed, SpectrumList is a very flexible container that is mostly just a list of spectra. But I'll have to check the stitching algorithm. I'll clarify that the suggested use case is for multiple observations of the same object that cover different wavelength ranges.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good to me. I think once you've updated the docs to your own satisfaction, I unofficially approve this PR.

Comment on lines 74 to 75
objects in the list and additionally attempt to stitch together the spectra into a single data object so that
they can be manipulated and analyzed in the application as a single entity::
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, we might want to add a few sentences on how the stitching works. Or how Jdaviz attempts to combine the spectra? Is the stitching automatic or does the user have control over it from a plugin, layer, or notebook?

rosteen added 2 commits March 1, 2022 10:17
Trying again...
Remove trailing whitespace

add missing whitespace

Last fix?

Hack around inherited ref
Co-authored-by: Kyle Conroy <kyleconroy@gmail.com>
@rosteen rosteen merged commit f7ccdcf into spacetelescope:main Mar 1, 2022
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants