Don't allow restore app to overwrite metadata #307

Merged: 5 commits merged on Oct 25, 2023

landmanbester
Collaborator

This is a partial fix for #297. The remaining cases that are not catered for are:

  • restoring to a non-existent column (i.e. create a new column)
  • restoring to a non-standard column in an MS, which I believe may still fail because such a column needs a schema.

I believe these are both fairly exotic use cases, but let me know if you want me to look into them, @JSKenyon.
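For readers skimming the thread, here is a minimal sketch of the idea behind the fix, using plain dictionaries in place of the real MS and backup datasets. Every name in it (`restore_column`, the `attrs`/`data_vars` layout, the sample values) is a hypothetical stand-in and not the actual restore app code. It only shows why assigning the single restored column onto the MS dataset leaves the metadata alone, and where the non-existent column case falls through.

```python
from copy import deepcopy


def restore_column(ms_ds, backup_ds, backup_column_name, target_column):
    """Return a copy of ms_ds with only target_column replaced from the backup.

    Hypothetical helper: datasets are modelled as dicts with "attrs"
    (metadata) and "data_vars" (columns).
    """
    if target_column not in ms_ds["data_vars"]:
        # Case not catered for in the description above: the target
        # column does not exist yet, so there is nothing to assign onto.
        raise KeyError(f"{target_column} is not present in the MS")

    # Start from the MS dataset so its metadata is carried over untouched.
    restored = deepcopy(ms_ds)
    # Take only the column values from the backup, never its metadata.
    restored["data_vars"][target_column] = deepcopy(
        backup_ds["data_vars"][backup_column_name]
    )
    return restored


ms_ds = {
    "attrs": {"FIELD_ID": 0},
    "data_vars": {"FLAG": [0, 0, 0], "DATA": [1, 2, 3]},
}
backup_ds = {
    "attrs": {"FIELD_ID": 99},  # stale metadata that must not leak into the MS
    "data_vars": {"FLAG": [1, 0, 1]},
}

restored = restore_column(ms_ds, backup_ds, "FLAG", "FLAG")
```

In this sketch only `FLAG` changes; `DATA` and the `attrs` of the MS dataset are carried over as they were, and the backup's own `attrs` are ignored.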

JSKenyon merged commit f8de83e into v0.2.1-dev Oct 25, 2023
JSKenyon deleted the issue297 branch October 25, 2023 09:13
JSKenyon added a commit that referenced this pull request Jan 26, 2024
* assign to ms to avoid over-writing metadata in restore app

* zip datasets in enumerate

* add comment to document failure case

* use backup_column_name in restore app

* Apply OCD.

---------

Co-authored-by: landmanbester <lbester@ska.ac.za>
Co-authored-by: JSKenyon <jonathan.simon.kenyon@gmail.com>
JSKenyon added a commit that referenced this pull request Jan 30, 2024
* Use nearest-neighbour interpolation in regions where extrapolation is required. (#285)

* Fix version drift.

* Bump to 0.2.0

* Use nearest-neighbour interpolation for points requiring extrapolation.

* Utilise environment variable when dask.address is unset. (#288)

* Fix version drift.

* Bump to 0.2.0

* Inspect envvar for scheduler address when one isn't specified.

* Encode environment variable as ascii.

* Simplify.

* Add plotting functionality (#290)

* Fix version drift.

* Bump to 0.2.0

* Initial commit of basic plotting functionality.

* Change naming convention.

* Improve transform argument.

* Simplify transform selection.

* Add rudimentary time and frequency selection.

* Checkpoint plotter changes. Can now handle scans and spws, but is very slow.

* More work on plotter - can now plot datasets in parallel.

* Some tidying.

* Slightly improve plot speed. Dominant cost is still saving the figures.

* Commit some minor changes which speed up figure saving.

* Lots of tiny fixes.

* Tiny cosmetic changes.

* Add custom tick formatter so that plots are the same size regardless.

* Add matplotlib dependency.

* Rework construction of plotting dictionary. Add a few utility functions which will likely be useful in other places in QC.

* Rename variable to avoid confusion.

* Fix bug affecting recursive grouping.

* Avoid copies in grouping code.

* Checkpoint work on extending functionality.

* Make plotter more powerful. Add colourization option. Begin simplifying interface.

* Allow user specification of colourmap.

* Add plotsize parameter.

* Fix #293 - OOB access caused by `output.subtract_directions`  (#294)

* Fix version drift.

* Bump to 0.2.0

* Fix #293.

* Namedbackups (#296)

* Fix version drift.

* Bump to 0.2.0

* Add optional label and single field selection to backup app

* remove item instead of pop@index

* do not .remove() from xds_list

* Simplify using some existing functionality.

---------

Co-authored-by: JSKenyon <jonathan.simon.kenyon@gmail.com>
Co-authored-by: landmanbester <lbester@ska.ac.za>

* Selectively disable MAD flagging criteria (#298)

* Fix version drift.

* Bump to 0.2.0

* Setting MAD threshold to zero will disable flagging on a given statistic.

* Disable mad flagging on off-diagonals by default (#300)

* Fix version drift.

* Bump to 0.2.0

* Disable flagging based on off-diagonal correlations in the mad flagger by default. This should make the mad flagger less aggressive on data with unmodelled polarised emission.

* Fix bug affecting non-standard columns in `input_ms.data_column` (#301)

* Fix version drift.

* Bump to 0.2.0

* Fix a bug affecting the use of non-standard columns in data column input.

* Don't allow restore app to overwrite metadata (#307)

* assign to ms to avoid over-writing metadata in restore app

* zip datasets in enumerate

* add comment to document failure case

* use backup_column_name in restore app

* Apply OCD.

---------

Co-authored-by: landmanbester <lbester@ska.ac.za>
Co-authored-by: JSKenyon <jonathan.simon.kenyon@gmail.com>

* Fix for summary reporting SOURCE_ID as FIELD_ID (#309)

* Fix version drift.

* Bump to 0.2.0

* Make summary correctly report FIELD_ID and SOURCE_ID.

* Fix receptor summary (#310)

* Fix version drift.

* Bump to 0.2.0

* Fix incorrect assumption that FEED substable will always have 2 receptors.

* Fix similar problem affecting parallactic angle construction.

* Update missing column selection for compatibility with upstream changes.

* Fix xarray dims (#318)

* Fix version drift.

* Bump to 0.2.0

* Move all usage of xds.dims[dim] to xds.sizes[dim] in preparation for change of return type in xds.dims.

* Fixes for changes relating to Numba error types. (#319)

* Move now-deprecated graph metrics function into the scheduler plugin code. (#320)

* Make small changes to enable 3.11 compatibility. Requires changes in stimela + a release. (#321)

* Restringify keys in scheduler plugin. (#322)

* Update pyproject.toml. Add poetry.lock. Update docs. (#323)

* Drop 3.8. Commit poetry lock file.

* Update stimela requirement.

* Update docs.

* Set min and max versions in pyproject.toml.

* Remove python3.8 from test matrix.

---------

Co-authored-by: Landman Bester <lbester@sarao.ac.za>
Co-authored-by: landmanbester <lbester@ska.ac.za>
JSKenyon added a commit that referenced this pull request Jan 31, 2024
* Cache NUMBA kernels between CI runs

* Use actions/cache@v3

* Cache per python version

* runner.tmp -> runner.temp

* Debugging

* Fix

* Run entire test suite

* timestamp needed otherwise cache hit occurs and cache not updated

* Fix output

* Add revert_me.txt

* Use nearest-neighbour interpolation in regions where extrapolation is required. (#285)

* Fix version drift.

* Bump to 0.2.0

* Use nearest-neighbour interpolation for points requiring extrapolation.

* Utilise environment variable when dask.address is unset. (#288)

* Fix version drift.

* Bump to 0.2.0

* Inspect envvar for scheduler address when one isn't specified.

* Encode environment variable as ascii.

* Simplify.

* Add plotting functionality (#290)

* Fix version drift.

* Bump to 0.2.0

* Initial commit of basic plotting functionality.

* Change naming convention.

* Improve transform argument.

* Simplify transform selection.

* Add rudimentary time and frequency selection.

* Checkpoint plotter changes. Can now handle scans and spws, but is very slow.

* More work on plotter - can now plot datasets in parallel.

* Some tidying.

* Slightly improve plot speed. Dominant cost is still saving the figures.

* Commit some minor changes which speed up figure saving.

* Lots of tiny fixes.

* Tiny cosmetic changes.

* Add custom tick formatter so that plots are the same size regardless.

* Add matplotlib dependency.

* Rework construction of plotting dictionary. Add a few utility functions which will likely be useful in other places in QC.

* Rename variable to avoid confusion.

* Fix bug affecting recursive grouping.

* Avoid copies in grouping code.

* Checkpoint work on extending functionality.

* Make plotter more powerful. Add colourization option. Begin simplifying interface.

* Allow user specification of colourmap.

* Add plotsize parameter.

* Fix #293 - OOB access caused by `output.subtract_directions`  (#294)

* Fix version drift.

* Bump to 0.2.0

* Fix #293.

* Namedbackups (#296)

* Fix version drift.

* Bump to 0.2.0

* Add optional label and single field selection to backup app

* remove item instead of pop@index

* do not .remove() from xds_list

* Simplify using some existing functionality.

---------

Co-authored-by: JSKenyon <jonathan.simon.kenyon@gmail.com>
Co-authored-by: landmanbester <lbester@ska.ac.za>

* Selectively disable MAD flagging criteria (#298)

* Fix version drift.

* Bump to 0.2.0

* Setting MAD threshold to zero will disable flagging on a given statistic.

* Disable mad flagging on off-diagonals by default (#300)

* Fix version drift.

* Bump to 0.2.0

* Disable flagging based on off-diagonal correlations in the mad flagger by default. This should make the mad flagger less aggressive on data with unmodelled polarised emission.

* Fix bug affecting non-standard columns in `input_ms.data_column` (#301)

* Fix version drift.

* Bump to 0.2.0

* Fix a bug affecting the use of non-standard columns in data column input.

* Don't allow restore app to overwrite metadata (#307)

* assign to ms to avoid over-writing metadata in restore app

* zip datasets in enumerate

* add comment to document failure case

* use backup_column_name in restore app

* Apply OCD.

---------

Co-authored-by: landmanbester <lbester@ska.ac.za>
Co-authored-by: JSKenyon <jonathan.simon.kenyon@gmail.com>

* Fix for summary reporting SOURCE_ID as FIELD_ID (#309)

* Fix version drift.

* Bump to 0.2.0

* Make summary correctly report FIELD_ID and SOURCE_ID.

* Fix receptor summary (#310)

* Fix version drift.

* Bump to 0.2.0

* Fix incorrect assumption that FEED substable will always have 2 receptors.

* Fix similar problem affecting parallactic angle construction.

* Update missing column selection for compatibility with upstream changes.

* Fix xarray dims (#318)

* Fix version drift.

* Bump to 0.2.0

* Move all usage of xds.dims[dim] to xds.sizes[dim] in preparation for change of return type in xds.dims.

* Fixes for changes relating to Numba error types. (#319)

* Move now-deprecated graph metrics function into the scheduler plugin code. (#320)

* Make small changes to enable 3.11 compatibility. Requires changes in stimela + a release. (#321)

* Restringify keys in scheduler plugin. (#322)

* Attempt very dodgy solution to caching problem.

* Look for code in the correct place.

* Update pyproject.toml. Add poetry.lock. Update docs. (#323)

* Drop 3.8. Commit poetry lock file.

* Update stimela requirement.

* Update docs.

* Set min and max versions in pyproject.toml.

* Remove python3.8 from test matrix.

* Some debugging.

* Fix unsaved file.

* More debugging.

* Temporarily make test suite much smaller.

* Fix path.

* Actually fix path.

* Attempt at safer caching.

* More fiddling with paths.

* Fix bad tabbing.

* Try to find out where things are failing.

* More fiddling.

* More fiddling.

* More fiddling.

* Try restore time action.

* Tidy up caching approach. Use action. Restore matrix and test everything.

* Remove tmp file.

* Reword CI step name.

---------

Co-authored-by: JSKenyon <jonosken@gmail.com>
Co-authored-by: Landman Bester <lbester@sarao.ac.za>
Co-authored-by: JSKenyon <jonathan.simon.kenyon@gmail.com>
Co-authored-by: landmanbester <lbester@ska.ac.za>
JSKenyon added a commit that referenced this pull request Feb 2, 2024
* Cache NUMBA kernels between CI runs (#279)

* Cache NUMBA kernels between CI runs

* Use actions/cache@v3

* Cache per python version

* runner.tmp -> runner.temp

* Debugging

* Fix

* Run entire test suite

* timestamp needed otherwise cache hit occurs and cache not updated

* Fix output

* Add revert_me.txt

* Use nearest-neighbour interpolation in regions where extrapolation is required. (#285)

* Fix version drift.

* Bump to 0.2.0

* Use nearest-neighbour interpolation for points requiring extrapolation.

* Utilise environment variable when dask.address is unset. (#288)

* Fix version drift.

* Bump to 0.2.0

* Inspect envvar for scheduler address when one isn't specified.

* Encode environment variable as ascii.

* Simplify.

* Add plotting functionality (#290)

* Fix version drift.

* Bump to 0.2.0

* Initial commit of basic plotting functionality.

* Change naming convention.

* Improve transform argument.

* Simplify transform selection.

* Add rudimentary time and frequency selection.

* Checkpoint plotter changes. Can now handle scans and spws, but is very slow.

* More work on plotter - can now plot datasets in parallel.

* Some tidying.

* Slightly improve plot speed. Dominant cost is still saving the figures.

* Commit some minor changes which speed up figure saving.

* Lots of tiny fixes.

* Tiny cosmetic changes.

* Add custom tick formatter so that plots are the same size regardless.

* Add matplotlib dependency.

* Rework construction of plotting dictionary. Add a few utility functions which will likely be useful in other places in QC.

* Rename variable to avoid confusion.

* Fix bug affecting recursive grouping.

* Avoid copies in grouping code.

* Checkpoint work on extending functionality.

* Make plotter more powerful. Add colourization option. Begin simplifying interface.

* Allow user specification of colourmap.

* Add plotsize parameter.

* Fix #293 - OOB access caused by `output.subtract_directions`  (#294)

* Fix version drift.

* Bump to 0.2.0

* Fix #293.

* Namedbackups (#296)

* Fix version drift.

* Bump to 0.2.0

* Add optional label and single field selection to backup app

* remove item instead of pop@index

* do not .remove() from xds_list

* Simplify using some existing functionality.

---------

Co-authored-by: JSKenyon <jonathan.simon.kenyon@gmail.com>
Co-authored-by: landmanbester <lbester@ska.ac.za>

* Selectively disable MAD flagging criteria (#298)

* Fix version drift.

* Bump to 0.2.0

* Setting MAD threshold to zero will disable flagging on a given statistic.

* Disable mad flagging on off-diagonals by default (#300)

* Fix version drift.

* Bump to 0.2.0

* Disable flagging based on off-diagonal correlations in the mad flagger by default. This should make the mad flagger less aggressive on data with unmodelled polarised emission.

* Fix bug affecting non-standard columns in `input_ms.data_column` (#301)

* Fix version drift.

* Bump to 0.2.0

* Fix a bug affecting the use of non-standard columns in data column input.

* Don't allow restore app to overwrite metadata (#307)

* assign to ms to avoid over-writing metadata in restore app

* zip datasets in enumerate

* add comment to document failure case

* use backup_column_name in restore app

* Apply OCD.

---------

Co-authored-by: landmanbester <lbester@ska.ac.za>
Co-authored-by: JSKenyon <jonathan.simon.kenyon@gmail.com>

* Fix for summary reporting SOURCE_ID as FIELD_ID (#309)

* Fix version drift.

* Bump to 0.2.0

* Make summary correctly report FIELD_ID and SOURCE_ID.

* Fix receptor summary (#310)

* Fix version drift.

* Bump to 0.2.0

* Fix incorrect assumption that FEED substable will always have 2 receptors.

* Fix similar problem affecting parallactic angle construction.

* Update missing column selection for compatibility with upstream changes.

* Fix xarray dims (#318)

* Fix version drift.

* Bump to 0.2.0

* Move all usage of xds.dims[dim] to xds.sizes[dim] in preparation for change of return type in xds.dims.

* Fixes for changes relating to Numba error types. (#319)

* Move now-deprecated graph metrics function into the scheduler plugin code. (#320)

* Make small changes to enable 3.11 compatibility. Requires changes in stimela + a release. (#321)

* Restringify keys in scheduler plugin. (#322)

* Attempt very dodgy solution to caching problem.

* Look for code in the correct place.

* Update pyproject.toml. Add poetry.lock. Update docs. (#323)

* Drop 3.8. Commit poetry lock file.

* Update stimela requirement.

* Update docs.

* Set min and max versions in pyproject.toml.

* Remove python3.8 from test matrix.

* Some debugging.

* Fix unsaved file.

* More debugging.

* Temporarily make test suite much smaller.

* Fix path.

* Actually fix path.

* Attempt at safer caching.

* More fiddling with paths.

* Fix bad tabbing.

* Try to find out where things are failing.

* More fiddling.

* More fiddling.

* More fiddling.

* Try restore time action.

* Tidy up caching approach. Use action. Restore matrix and test everything.

* Remove tmp file.

* Reword CI step name.

---------

Co-authored-by: JSKenyon <jonosken@gmail.com>
Co-authored-by: Landman Bester <lbester@sarao.ac.za>
Co-authored-by: JSKenyon <jonathan.simon.kenyon@gmail.com>
Co-authored-by: landmanbester <lbester@ska.ac.za>

* Bump dask-ms and codex-africanus dependencies. Update lock.

---------

Co-authored-by: Simon Perkins <simon.perkins@gmail.com>
Co-authored-by: Landman Bester <lbester@sarao.ac.za>
Co-authored-by: landmanbester <lbester@ska.ac.za>