Towards Roman Level 2 ASDF data product support #1822

Merged: 36 commits into spacetelescope:main on Apr 14, 2023

Contributor

@bmorris3 bmorris3 commented Nov 3, 2022

Background

This PR implements basic support for Roman Level 2 data products in Imviz. This effort was begun by @tddesjardins in his prototyping PR #1361. I am further revising and testing a version of that prototype in this branch/PR. Many of the revisions I've made to the implementation by @tddesjardins came from comments in a helpful review by @pllim in June.

Test data

The Roman team is producing an ever-updating set of synthetic data products. I wrote this PR around a specific set of synthetic data, which you can read more about in the collapsed section below:

Old background on test data

Outdated: Test data

I am working with simulated exposures in the Roman ASDF format, which are used by the concept notebook. Some notes on these simulated exposures:

  • The uncalibrated, synthetic, WFI imaging-mode exposures were produced in the SOC Build 22Q4_B7 Test Suite (this link is only for STScI internal folks, sorry others!)
  • The simulated exposures have stars, instrument artifacts, noise, realistic WCS, and more. None of the data is real; even the stars are randomly generated!
  • @tddesjardins then kindly ran romancal on the uncalibrated (Level 1) exposures to make the calibrated versions (Level 2)
  • @tddesjardins confirmed that these files can be shared publicly, so I've put them on Box for ease of experimenting. STScI folks: DM me if you want the shared paths for the files.
  • Each WFI exposure produces 18 ASDF files, one per detector. Each file is 390.3 MB. The Box directory linked above contains "one exposure" across 18 detectors, and weighs 7 GB.

The PR will now run tests on tiny synthetic roman_datamodels.datamodels.ImageModel objects (~20x20 pixels, plus metadata), instead of the gigantic, synthetic data products.

Dependencies

This PR adds roman_datamodels as an optional dependency, and falls back on the asdf package if roman_datamodels is not available.
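
A minimal sketch of the optional-dependency pattern described above, using only the standard library. The helper names `optional_import` and `choose_asdf_reader` are hypothetical and are not taken from the PR; only the `rdd` alias appears in the review diff.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# `rdd` mirrors the alias seen in the review diff; None means "not installed".
rdd = optional_import("roman_datamodels")


def choose_asdf_reader(rdd_module=rdd):
    """Name the reader a parser could dispatch to (labels are illustrative)."""
    return "roman_datamodels" if rdd_module is not None else "asdf"
```

With this shape, parser code only needs an `if rdd is not None:` check at the point where an ASDF file is read.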

Demo

I added a concept notebook in notebooks/concepts/ImvizExample-asdf.ipynb, which is a variation on ImvizExample.ipynb adapted for the Roman ASDF L2 data files. Of course, I plan to add more comprehensive documentation as POs and/or other devs request it. Though I'm not sure how urgent it is to add docs when no one outside of the Roman team knows how to procure the files that are supported by this PR 😅 .

Testing

I have not added any tests yet. I could use some ideas from other devs on how to approach testing. Perhaps it would be wise to make a tiny version of a Roman ASDF file, one that has just a small subset of the data array but all of the usual metadata. That file could be small enough to put on Box and add to the remote-data tests. Thoughts?

Speed

Testing with glue-core v1.6, I find the following load times for one 4k image (from a single WFI detector):

# load a single WFI detector image from ASDF
imviz.load_data:
    1.02 s ± 19.7 ms per loop

# alternate between removing and adding data
imviz.remove_data_from_viewer && imviz.add_data_to_viewer:
    123 ms ± 3 ms per loop
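
For anyone who wants to collect comparable numbers outside a notebook, here is a rough sketch using the standard-library `timeit` module. The `fake_load` function is a hypothetical stand-in for the real `imviz.load_data(...)` call, and the repeat count is an arbitrary assumption.

```python
import statistics
import timeit


def time_call(func, repeat=7, number=1):
    """Return (mean, stdev) in seconds per call of `func`."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    return statistics.mean(per_call), statistics.stdev(per_call)


def fake_load():
    # Hypothetical placeholder; swap in the real loading call when benchmarking.
    sum(range(10_000))


mean_s, std_s = time_call(fake_load)
print(f"{mean_s * 1e3:.3f} ms ± {std_s * 1e3:.3f} ms per loop")
```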

Fixes #1355
Supersedes / closes #1361
🐱

Change log entry

  • Is a change log needed? If yes, is it added to CHANGES.rst? If you want to avoid merge conflicts,
    list the proposed change log here for review and add to CHANGES.rst before merge. If no, maintainer
    should add a no-changelog-entry-needed label.

Checklist for package maintainer(s)

This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.

  • Are two approvals required? Branch protection rule does not check for the second approval. If a second approval is not necessary, please apply the trivial label.
  • Do the proposed changes actually accomplish desired goals? Also manually run the affected example notebooks, if necessary.
  • Do the proposed changes follow the STScI Style Guides?
  • Are tests added/updated as required? If so, do they follow the STScI Style Guides?
  • Are docs added/updated as required? If so, do they follow the STScI Style Guides?
  • Did the CI pass? If not, are the failures related?
  • Is a milestone set? Set this to bugfix milestone if this is a bug fix and needs to be released ASAP; otherwise, set this to the next major release milestone.
  • After merge, any internal documentations need updating (e.g., JIRA, Innerspace)?

@bmorris3 bmorris3 added feature Feature request imviz labels Nov 3, 2022
@github-actions github-actions bot added the documentation Explanation of code and concepts label Nov 3, 2022
@@ -50,9 +57,18 @@ def parse_data(app, file_obj, ext=None, data_label=None):
         pf = rgb2gray(im)
         pf = pf[::-1, :]  # Flip it
         _parse_image(app, pf, data_label, ext=ext)
-    else:  # Assume FITS
+    elif file_obj_lower.endswith('.fits'):
Contributor

This broke the test using download_file because the file comes back as contents without any extension. I think we should keep the else fallback and assume FITS if nothing else matches.

Contributor

Unless you have a better way to figure out the filetype if the filename has no extension.

Contributor Author

@bmorris3 bmorris3 Nov 4, 2022

Good point, @pllim. Do you think it would be appropriate to make a tiny PR to astropy.utils.data that allows a user to discover file extensions of local caches of remote data? I think I have a working example, so I may open an astropy PR today. (update: no PR necessary!)

Contributor Author

I've implemented a possible workaround in b60872b. 🤞🏻

Contributor

I am not sure if that is general enough. The solution needs to account for:

  1. App-wide access. Not only Imviz is affected by this.
  2. Possibility of URL pointing to a valid file but the URL itself has no file extension. (Unless we decide this is not worth supporting.)

I was thinking about adopting format= like https://docs.astropy.org/en/latest/table/io.html#getting-started so user can specify what format to look for if we cannot guess. Is that overkill?

Contributor Author

True, all parsers are affected by this problem. If that's something we want to fix generally, should that effort get its own ticket/issue so it can get prioritized/pointed? I'm not sure it is a pressing problem yet.

Contributor

Yes, we can defer, but you need to put "assume FITS" back in else for this PR.

Contributor Author

Agreed! 👍🏻
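
The outcome of this thread (dispatch on extension, keep "assume FITS" as the final fallback, and optionally let the caller override with `format=`) can be sketched as below. This is an illustration only: `guess_format` is a hypothetical helper, the image extensions are assumed, and the `format=` keyword is the proposal from the discussion rather than something this PR implements.

```python
def guess_format(file_obj, format=None):
    """Pick a parser name for `file_obj`; unknown names fall back to FITS."""
    if format is not None:
        # An explicit user choice wins (proposed in the thread, not implemented here).
        return format.lower()
    name = str(file_obj).lower()
    if name.endswith((".jpg", ".jpeg", ".png")):  # assumed image extensions
        return "image"
    if name.endswith(".asdf"):
        return "asdf"
    # No recognized extension, e.g. a cache path returned by download_file.
    return "fits"
```

Keeping the last branch unconditional is what lets extension-less cache paths continue to load as FITS.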

codecov bot commented Nov 4, 2022

Codecov Report

Patch coverage: 34.83% and project coverage change: -0.32 ⚠️

Comparison is base (3fa6758) 91.94% compared to head (e3d7863) 91.62%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1822      +/-   ##
==========================================
- Coverage   91.94%   91.62%   -0.32%     
==========================================
  Files         146      147       +1     
  Lines       16012    16098      +86     
==========================================
+ Hits        14722    14750      +28     
- Misses       1290     1348      +58     
Impacted Files                                      Coverage Δ
jdaviz/configs/imviz/helper.py                      97.10% <ø> (ø)
jdaviz/configs/imviz/tests/test_parser_roman.py     10.00% <10.00%> (ø)
jdaviz/configs/imviz/tests/utils.py                 84.80% <13.63%> (-15.21%) ⬇️
jdaviz/configs/imviz/plugins/parsers.py             89.34% <47.50%> (-8.22%) ⬇️
jdaviz/configs/imviz/tests/test_parser.py           100.00% <100.00%> (ø)

Contributor

@pllim pllim left a comment

When this is ready to go forward, will need at least one test (possibly multiple to cover all the logic paths) and user facing documentation and declaring new dep in setup.cfg.

Change log can wait for now.

Thanks!

@@ -50,7 +71,14 @@ def parse_data(app, file_obj, ext=None, data_label=None):
         pf = rgb2gray(im)
         pf = pf[::-1, :]  # Flip it
         _parse_image(app, pf, data_label, ext=ext)
-    else:  # Assume FITS
+    elif file_obj_lower.endswith('.asdf'):
+        if rdd is not None:
Contributor

What if someone passes in an ASDF file that isn't Roman? Is there a way to identify that it is actually Roman data before attempting to read with the Roman package?

Contributor Author

The roman_datamodels docs page on the structure of Roman data products suggests that it is a "convention" that data are stored under a roman attribute. I've confirmed this in the L2 products. If you run:

import asdf

with asdf.open(path) as f:
    f.info()

you'll see

root (AsdfObject)
├─asdf_library (Software)
│ ├─author (str): The ASDF Developers
│ ├─homepage (str): http://github.com/asdf-format/asdf
│ ├─name (str): asdf
│ └─version (str): 2.13.0
├─history (dict)
│ └─extensions (list) ...
└─roman (WfiImage) # The schema for WFI Level 2 images.
...

Maybe the right thing to do is (1) open the file with the asdf package, (2) check for a roman attribute and load with roman_datamodels if so, (3) otherwise try passing the ASDF file without roman_datamodels.
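
The three steps above can be sketched as a small decision function. It works on any mapping of top-level keys, so it could be fed the `.tree` of a file opened with `asdf.open`; the function name and the returned labels are hypothetical and only illustrate the branching.

```python
def pick_asdf_loader(tree, have_roman_datamodels):
    """Decide how to read an already-opened ASDF tree.

    `tree` is any mapping of top-level keys. By the convention noted above,
    Roman products keep their contents under a top-level "roman" key.
    """
    if "roman" in tree:
        if have_roman_datamodels:
            return "roman_datamodels"  # step 2: Roman file, Roman reader available
        return "asdf-roman"  # Roman file, but only the asdf package is installed
    return "asdf-generic"  # step 3: not a Roman product


roman_like = {"asdf_library": {}, "history": {}, "roman": {}}
other = {"asdf_library": {}, "history": {}, "data": []}
```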

jdaviz/configs/imviz/plugins/parsers.py Outdated
def _roman_2d_asdf_to_glue_data(file_obj, data_label, ext=None):

    if ext == '*':
        ext_list = ['data', 'dq', 'err', 'var_poisson', 'var_rnoise']
Contributor

Does this account for all the observing modes or whatever that Roman will churn out?

Contributor Author

This line was contributed by @tddesjardins, and the background on the available extensions in the roman_datamodels docs is rather sparse (e.g.: this page doesn't seem finished). Maybe @tddesjardins can comment?


for ext in ext_list:
    comp_label = ext.lower()
    new_data_label = f'{data_label}[{comp_label}]'
Contributor

Should this use Jesse's label generator?

Contributor Author

I looked around the rest of imviz/plugins/parsers.py and found that there are still places where the data label is created by the parser, see for example:

new_data_label = f'{data_label}[{comp_label}]'

Jesse's data label method from #1672 is within app.py:Application, and these parser methods don't take the app as an argument, so we'd need to rework the arguments in each of the parser methods to use the centralized data labeler. Maybe @javerbukh can confirm – is there a way to outsource the data label creation linked above to the app's labeler? I'm not opposed to doing that work, but maybe that's a separate PR?

Contributor

Yeah I did not know what to do with those parser methods when centralizing the label generation. I tried putting app as an argument but that caused a domino effect of issues so I backed out of that approach. I would think that handling that as a separate PR is a good call.

Contributor

Imviz has its own rules since it can load wildcard extensions, has both FITS and ASDF support, and so on. If the function is hidden, it doesn't hurt if you also want to pass in app to them, but be careful not to break current Imviz labeling behavior. 🤪
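
To make the labeling behavior under discussion concrete, here is a small sketch of how a wildcard `ext` could expand into per-component labels of the form `label[component]`. The extension list and the f-string come from the diff; the helper names and the choice to default to `'data'` when `ext` is None are assumptions for illustration.

```python
ROMAN_EXTENSIONS = ("data", "dq", "err", "var_poisson", "var_rnoise")


def expand_ext(ext=None):
    """Turn the `ext` argument into a list of component names."""
    if ext == "*":
        return list(ROMAN_EXTENSIONS)
    if ext is None:
        return ["data"]  # assumed default, not confirmed by the PR
    if isinstance(ext, str):
        return [ext]
    return list(ext)


def iter_component_labels(data_label, ext=None):
    """Yield one label per component, using the same f-string as the parser."""
    for name in expand_ext(ext):
        comp_label = name.lower()
        yield f"{data_label}[{comp_label}]"
```

A centralized labeler would need to reproduce exactly these strings to avoid changing what users see in the data menu.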


yield data, new_data_label

# ---- Functions that handle input from non-JWST and non-Roman files -----
Contributor

Do you mean to add this new empty "section" or did you mean to replace the existing section below?

Contributor Author

Fixed 👍🏻

Contributor Author

bmorris3 commented Nov 4, 2022

declaring new dep in setup.cfg.

The roman_datamodels package is not a strict installation requirement, since we fall back on asdf if rdd is not available (see try: import block here). Should we require it in setup.cfg anyway?

Contributor

pllim commented Nov 4, 2022

Since it is in such early development, what about a new "roman" section here?

[options.extras_require]

@bmorris3 bmorris3 added this to the 3.4 milestone Mar 17, 2023
Comment on lines 20 to 18
try:
    # check for version of roman_datamodels
    import roman_datamodels
    RDM_LT_0_14_2 = Version(roman_datamodels.__version__) < Version('0.14.2.dev')
except ImportError:
    # If roman_datamodels not installed, assume Roman-specific tests can be skipped
    RDM_LT_0_14_2 = True

Contributor Author

roman_datamodels > 0.14.1 will have tools for generating synthetic WFI-like data products on the fly. I've added a switch to allow tests to run when a sufficient version of roman_datamodels is installed.
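
A standard-library-only sketch of the same kind of switch, for illustration. It compares just the leading numeric release segment, which is a simplification of what `packaging.version.Version` does in the snippet above (pre-release tags such as `rc1` are not handled), and all function names here are hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version_string):
    """Leading numeric release segment, e.g. '0.14.2.dev3' gives (0, 14, 2)."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def roman_tests_should_skip(version_string, minimum=(0, 14, 2)):
    """Skip when the package is missing or older than `minimum`."""
    if version_string is None:
        return True
    return release_tuple(version_string) < minimum
```

The resulting boolean could then feed a `pytest.mark.skipif` condition in the same way `RDM_LT_0_14_2` does.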

@@ -327,7 +327,7 @@ def _roman_2d_asdf_to_glue_data(file_obj, data_label, ext=None):
         component = Component.autotyped(np.array(getattr(file_obj, ext)), units=bunit)
         data.add_component(component=component, label=comp_label)
         meta = getattr(file_obj, 'meta')
-        data.coords = getattr(meta, 'wcs')
+        data.coords = getattr(meta, 'wcs', None)
Contributor Author

Currently the dev version of roman_datamodels has a method for generating WFI-like data products, but it doesn't generate artificial WCS yet. I've created an issue over there for requesting this: spacetelescope/roman_datamodels#134.
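
The effect of the one-line change is easy to see in isolation. In this sketch, `SimpleNamespace` objects stand in for the model's `meta` attribute; the attribute contents are placeholders, not real Roman metadata.

```python
from types import SimpleNamespace


def get_wcs(meta):
    """Return meta.wcs if present, otherwise None instead of raising."""
    return getattr(meta, "wcs", None)


# Placeholder metadata objects, with and without a WCS attached.
meta_without_wcs = SimpleNamespace(photometry={})
meta_with_wcs = SimpleNamespace(wcs="placeholder-wcs")
```

Without the default, `getattr(meta_without_wcs, "wcs")` raises `AttributeError`, which is what synthetic models lacking a WCS would trigger.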

Contributor Author

bmorris3 commented Mar 17, 2023

Progress update: this PR was created when roman_datamodels was on v0.14.0, and was designed to successfully load a particular set of synthetic Roman data products – see the "Test data" heading in the PR description.

Since then, roman_datamodels has advanced, and as mentioned in #1822 (comment), there are now some tools for generating synthetic WFI image data models that would be very useful for testing in jdaviz. But since the PR was designed around the old synthetic data products, and they are no longer compatible with roman_datamodels >= 0.14.2, we need to either pin roman_datamodels or drop support for the old synthetic data. Since approximately no one cares at this stage, I'll do the latter.

As of f92fcb2, I'm beginning to develop tests based on the dev version of roman_datamodels, and I will drop support for the outdated test data in the next few days.

Comment on lines 321 to 325
# this filtered warning can be removed after resolution of PR:
# https://github.com/spacetelescope/roman_datamodels/pull/138
@pytest.mark.filterwarnings(
    'ignore:erfa.core.ErfaWarning: ERFA function "d2dtf" yielded 1 of "dubious year (Note 5)"'
)
Contributor Author

This can be removed when spacetelescope/roman_datamodels#138 is addressed.

setup.cfg Outdated
Comment on lines 64 to 66
roman =
    roman_datamodels @ git+https://github.com/spacetelescope/roman_datamodels.git
    rad @ git+https://github.com/spacetelescope/rad.git
Contributor Author

@bmorris3 bmorris3 Mar 21, 2023

@tddesjardins taught me that roman_datamodels==0.14.0, rad==0.14.0 support old synthetic data products, like the ones I originally built this PR to support. Newer synthetic data will require the dev versions (spacetelescope/roman_datamodels#133).

For now, I'm requiring dev versions of roman_datamodels and rad. We can minpin to the next release whenever it's available.

}
raw.update(kwargs)
raw["meta"]["photometry"] = create_photometry()
raw["meta"]["wcs"] = create_example_gwcs(image_shape)
Contributor Author

@bmorris3 bmorris3 Mar 21, 2023

This line is the main difference from the analogous method in roman_datamodels.testing.factories.create_wfi_image. We need a synthetic GWCS object to use in our example image file, so I adapted a photutils method for creating one, and attach it to the image model. See discussion in spacetelescope/roman_datamodels#134.

@@ -1,6 +1,6 @@
 [tox]
 envlist =
-    py{38,39,310,311}-test{,-alldeps,-devdeps,-predeps}{,-cov}
+    py{38,39,310,311}-test{,-alldeps,-devdeps,-predeps}{-romandeps}{,-cov}
Contributor Author

I've added a romandeps testing option, and a CI test that uses it (allowed_fail=True), but I'm not 100% sure I've done this sensibly. Tips welcome!

@bmorris3 bmorris3 marked this pull request as ready for review March 21, 2023 19:23
setup.cfg Outdated
@@ -61,6 +61,8 @@ test =
 docs =
     sphinx-rtd-theme
     sphinx-astropy
+roman =
+    roman_datamodels @ git+https://github.com/spacetelescope/roman_datamodels.git
Contributor

This is temporary, right? I don't think we can release to PyPI with a pin like this.

Contributor Author

Correct.

CHANGES.rst Outdated
@@ -10,6 +10,20 @@ New Features

- Exact-text filtering for metadata plugin. [#2147]

- CLI launchers no longer require data to be specified. [#1890]

Mosviz
Contributor

I'm assuming you just need to rebase, which is why this extra stuff is in here.

Contributor Author

Yikes, I think this resulted from a bad rebase. I'll fix that now. Good catch!

@bmorris3
Contributor Author

Thanks all!

@bmorris3 bmorris3 merged commit 45aed16 into spacetelescope:main Apr 14, 2023
Labels
documentation (Explanation of code and concepts), Extra CI (Run cron jobs in PR), feature (Feature request), imviz, Ready for final review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEAT] Imviz: Add Roman ASDF Support
8 participants