-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
ENH: Imviz parser optimizations #2176
Conversation
Not sure if I can get to this next week. I am okay with this if two other devs approve. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me - just double checking my understanding, this is making the assumption that the only asdf
files that Imviz will see are Roman files, right? And we're ok to make that assumption? I guess JWST uses asdf-in-FITS rather than having actual asdf files for images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, I'm ok assuming asdf
files are Roman data for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small suggestion to the test, but otherwise looks good enough to me now that we've decided on the ASDF assumption question.
Co-authored-by: Kyle Conroy <kyleconroy@gmail.com>
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #2176 +/- ##
==========================================
+ Coverage 91.50% 91.72% +0.21%
==========================================
Files 147 147
Lines 16142 16182 +40
==========================================
+ Hits 14771 14843 +72
+ Misses 1371 1339 -32
☔ View full report in Codecov by Sentry. |
One more follow-up. I made a demo notebook to test how efficiently |
Thanks all! 🎉 |
Description
Imviz, and especially its parser, will do some heavy lifting for Roman files. I'm doing some profiling, and found that the parser tweaks in this PR speed up the parser by a factor of 3 on my machine (the machine matters for reasons I'll explain below).
The speedups come from the following corrections:
contents
. I worked around this earlier by making a reverse-lookup dict to find source URLs for each file in the cache (see below), which is actually quite slow if the cache is heavily used. Since I have 12,000 files in my cache (😱), this lookup took up a lot of time. In this PR, I avoid the lookup unless the filename iscontents
(no extension).jdaviz/jdaviz/configs/imviz/plugins/parsers.py
Lines 55 to 66 in aec6ed1
asdf.open
to check if it looks like a Roman data file. If not it raises and error, and if it's a Roman file it passes the open file object toroman_datamodels.datamodels.open
to extract anImageModel
. This double-open incurs unnecessary overhead and is probably more thorough than it ought to be – I'd prefer a faster parser with a less-informative traceback when it fails to open an ASDF file with unknown provenance. Our docs say we only (partially) support Roman ASDF files.jdaviz/jdaviz/configs/imviz/plugins/parsers.py
Lines 79 to 91 in aec6ed1
Here's a breakdown of the time spent in loading a single Roman file, which decreases from 3 to 1 seconds after this PR on my laptop. The first bullet above describes the
path_to_url_mapping
change, the second bullet results in removing theasdf.open
call. I've also included a note on the one call that gets slower as a result of these changes, which is_parse_image
. Since we had passed an open filestream fromasdf.open
to_parse_image
before, the update in this PR slows down the call to_parse_image
which now must open the file stream.path_to_url_mapping
asdf.open
_parse_image
asdf.open
Change log entry
CHANGES.rst
? If you want to avoid merge conflicts,list the proposed change log here for review and add to
CHANGES.rst
before merge. If no, maintainershould add a
no-changelog-entry-needed
label.Checklist for package maintainer(s)
This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.
trivial
label.