Current issues renaming dimensions and coordinates in netCDF4 files #597
Thanks @czender, we were flooded with other PRs and I'm trying to manage those combined with the other open issues. We want to get the next bugfix release out shortly, so I am trying to address the issues you have open right now (in addition to working on some of the PRs Dennis has for compression, etc.).

On Mon, Nov 6, 2017 at 8:52 PM, Charlie Zender wrote:
Old issues with renaming dimensions and coordinates in netCDF4 files are beginning to affect CMIP6 processing so here's a reminder. All recent library versions are affected. See http://nco.sf.net/nco.html#bug_nc4_rename for a chronology. Create a netCDF4 file with a coordinate:

```
netcdf bug_rnm {
dimensions:
	lon = 4 ;
variables:
	float lon(lon) ;
data:
 lon = 0, 90, 180, 270 ;
}
```
Remember to re-create the file after each of the three tests below. Renaming both dimension and coordinate together works. Yay! This surprises me, given the history recounted above:

```
ncrename -d lon,longitude -v lon,longitude ~/bug_rnm.nc # works
```

Renaming dimension then coordinate separately fails:

```
ncrename -d lon,longitude ~/bug_rnm.nc
ncrename -v lon,longitude ~/bug_rnm.nc # broken: "HDF error"
```

Renaming coordinate then dimension separately fails:

```
ncrename -v lon,longitude ~/bug_rnm.nc
ncrename -d lon,longitude ~/bug_rnm.nc # broken: "nc4_reform_coord_var: Assertion `dim_datasetid > 0' failed."
```
Howdy @czender! I found a test written by Quincey that deals with renaming, nc_test4/tst_rename.c. I have added to the existing tests and reproduced your error in C code. Here's the test:

This fails like this:

I will take a look and see what is going on here...
OK, the problem can be seen from the h5dump:

When the code hits this line in NC4_rename_var() it fails:

So the code is trying to rename the dataset with H5Gmove, but can't because there is already a dataset with the new name. It was created to support the dimension name change; it's one of those dreaded secret datasets that support HDF5 dimension scales. I guess the way to proceed is to check before the H5Gmove whether another var of the desired name secretly exists, delete it, and then make the newly renamed dataset into a dimension scale. I will try to take a stab at that...
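The collision Ed describes can be sketched with a toy model of an HDF5 group namespace. This is plain Python with a dict standing in for the group; none of the names below are real netCDF or HDF5 API. The rename fails because a hidden dimension-scale dataset already occupies the target name; deleting the placeholder first lets the move succeed:

```python
# Toy model: a dict stands in for an HDF5 group's namespace.
# "_phony" marks the hidden dataset created to back a renamed dimension.

def rename_var(group, old, new):
    """Naive rename, mirroring a bare H5Gmove: fails on any name clash."""
    if new in group:
        raise RuntimeError("HDF error")  # target name already taken
    group[new] = group.pop(old)

def rename_var_fixed(group, old, new):
    """Check for a secret dimension-scale placeholder and remove it first."""
    if new in group and group[new].get("_phony"):
        del group[new]                   # drop the placeholder dataset
    group[new] = group.pop(old)
    group[new]["_dim_scale"] = True      # re-attach the dimension-scale role

group = {"lon": {"data": [0, 90, 180, 270]},
         "longitude": {"_phony": True}}  # left over from renaming the dim

try:
    rename_var(dict(group), "lon", "longitude")
except RuntimeError as e:
    print("naive rename:", e)            # naive rename: HDF error

rename_var_fixed(group, "lon", "longitude")
print(sorted(group))                     # ['longitude']
```

The point of the sketch is only the ordering: the hidden dataset must be detected and removed before the move, exactly the check Ed proposes adding before the H5Gmove call.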
Thanks, @edhartnett. It sounds like an intricate fix, so I hope you persist.
Thanks, All!
This is happening on this branch: https://github.com/NetCDF-World-Domination-Council/netcdf-c/tree/ejh_zender_coordinate_var_rename. So far I have just added the new test that demonstrates the problem.
@czender I am working on this now as part of my current branch. I have been sneaking up on this by writing tests for lots of simpler code in nc4var.c. Now I have done the easy stuff and am moving on to the rename bugs. To help us understand the urgency of this fix, if any, can you elaborate on how it is being used in CMIP6? And what CMIP6 is? Are we talking world-shatteringly important climate research here? Or what? And why the heck are you renaming vars anyway? Can't you scientists just make up your minds? What is the use-case that is driving this? Can you help me understand the importance of this feature? Thanks! PS I am going to fix it anyway, obviously. You would have to physically restrain me to stop me from fixing it. I am just curious why you need it.
CMIP6 is the large international climate model intercomparison for the next IPCC report. The CMIP6 organizers provided data in netCDF files for all of the different modeling groups to use. However, each climate model typically looks for a particular name for each input variable it needs, which varies from model to model (e.g., sea_surface_temperature, SST, surfTemp), and there are hundreds of these, especially when we have to reprocess our output to match the CMIP6 required variable names. Hence the need to rename the variables. The good news is that I have already worked around this bug (it was too important to wait), but it will be good to get it fixed for the future.
I actually use the NCO utilities, so this is a question for @czender.
Ed, in retrospect, I sure wish you guys had not been seduced
Mea culpa, mea culpa, mea maxima culpa.
@cameronsmith1 thanks for the explanation. What workaround are you using?
I ended up using the NCO utilities of @czender in a different way, so I am not sure how to express it in native HDF commands.
I receive a few bug reports a year from people trying to rename coordinates in netCDF4 files with ncrename. CMIP does this on an industrial scale. Most people end up using the multistep workaround(s) documented in the NCO manual here.
OK, I have a fix for this problem, as well as several other rename problems. I am still working on another one I found. There is a bunch of state code in this which is not helpful. For example, there are these values in NC_VAR_INFO_T:

These are set at various times in nc_rename_var() and nc_rename_dim(). Then, in the sync that is called during nc_enddef(), these values are consulted and various actions taken. The problem is, it's not hard to come up with combinations of rename calls which confuse the crap out of the code. There are so many possible ways that users can name and rename vars and dims, with enddefs/redefs thrown in or not, that it gets very, very confusing.

What would work better would be statelessness. Whenever a rename call completes, the file on disk and the in-memory metadata must be completely adjusted and in a consistent state. There must be nothing left over of the rename process for enddef to do. All work must be done in nc_rename_var() and nc_rename_dim(). If that were true, then the user could call them in any wild order, and it would not matter to us.

I am not attempting to take these state variables out right now. I am patching the existing code so that it works for the test cases that we've identified, including Charlie's. But I suspect there are many more bugs out there in the existing code that will only be eliminated by removing these state variables and dealing with everything in the rename.
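Ed's statelessness argument can be illustrated with a toy model (plain Python; nothing here is actual libnetcdf code or API). If each rename fully updates the file model before returning, with no work deferred to a later enddef pass, then the two call orders from Charlie's reproduction necessarily produce identical results:

```python
# Toy file model: dims and vars are plain dicts keyed by name.
# A "stateless" rename fully updates the model before returning,
# so no later enddef() pass has to reconcile deferred state.

def make_file():
    return {"dims": {"lon": 4}, "vars": {"lon": {"dim": "lon"}}}

def rename_dim(f, old, new):
    f["dims"][new] = f["dims"].pop(old)
    for var in f["vars"].values():      # fix up every reference immediately
        if var["dim"] == old:
            var["dim"] = new

def rename_var(f, old, new):
    f["vars"][new] = f["vars"].pop(old)

# Order 1: dimension first, then coordinate variable.
a = make_file()
rename_dim(a, "lon", "longitude")
rename_var(a, "lon", "longitude")

# Order 2: coordinate variable first, then dimension.
b = make_file()
rename_var(b, "lon", "longitude")
rename_dim(b, "lon", "longitude")

print(a == b)  # True: with no deferred state, call order cannot matter
```

The real library is far more involved (dimension scales, on-disk HDF5 objects), but the invariant is the same one Ed states: if nothing is left for enddef to reconcile, users can interleave renames, enddefs, and redefs in any order.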
OK, these fixes are up as PR #755. It's branch ejh_fill_values on my netcdf clone: git@github.com:NetCDF-World-Domination-Council/netcdf-c.git. @czender, if you could grab the branch and give it a try, that would be great. I don't think I've fixed every possible rename problem, but see if your common cases now work OK. Any problems would be best expressed by adding a test to nc_test4/tst_rename.c. Simply copy an existing test case, then change it to make it fail, and send me the code or post it here.
Thanks for working on this. At the web page that @czender mentioned, there is a long list of problems over the years with renaming variables, which supports your conclusion that the current implementation is fragile against the many possible renaming situations. Interestingly, @czender notes that a robust solution is to convert from netCDF4 to netCDF3, then rename and convert back to netCDF4. Does the implementation of rename for netCDF3 offer a useful template?
Thanks, @edhartnett. I will try to test this soon and report the results here.
@czender you should use the branch currently up as PR #763. This PR contains just the rename fixes from the other PR, which is now closed without merging. In tst_rename.c there are some commented-out lines which show an unfixed bug:
Still working on this one. ;-)
I got as far as re-building and testing the latest netCDF master four days ago. It produced a total of 7 new and unexpected failures with the NCO regression test, half with ncks and half with ncpdq. All new failures involved netCDF4 files. I have not yet figured out the exact cause. Tracking that down has higher priority than testing the rename fix for now. |
Well, it might be a good idea for us to use the NCO tests as another set of testing for netCDF...
NCO is extremely widely used in the climate community for working with netCDF files, so including them in your test suite would be wonderful.
We already do, as part of our testing. Unfortunately there are two sets of tests: ‘make check’, which will return non-zero when there is a failure, and ‘make test’, which requires manual intervention to catch failures. We run the former but not the latter.
@czender are you seeing failures in make check or make test? I am not seeing failures with make check. I will check with make test after my poster session.
As you suspect, it is with "make test", which is always what I mean by "the NCO regression test". For historical reasons, NCO's "make check" is basically just a check that the NCO C++ code library links and runs a single, simple program.
After a little investigation, it turns out that this was a problem with "make test" relying on a file, hdn.nc, that autoconf apparently does not re-create; it is not a netCDF or NCO code issue. Phew. Now I can move on to testing the rename patch...
Excellent! I will hold off; let me know if any regressions pop up. Thank you @czender!
Yes
Hi @edhartnett, I wanted to share a use case where this bug is impeding some of our Vapor users. CM1 is a model for convective systems, particularly hurricanes, tornadoes, and supercells. It has been growing in popularity recently, and our team is seeing an increasing number of requests on how to ingest CM1 data into Vapor. Vapor only supports NetCDF data that follows the CF Conventions. Unfortunately, CM1 outputs its coordinate variables with different names than their corresponding dimensions. Some users are determined enough to get around the bug, but we also see some give up. If renaming coordinate variables worked as intended, it would reduce our efforts in supporting CM1 users and also increase our user base. In my opinion, it would also help to produce more compelling science. Thank you.
Thanks for sharing that use case! Can you not get the CM1 model team to change their output to properly use coordinate vars? I would really like to rewrite the rename code and use enddef/redef after each rename. I believe this could eliminate all the bugs associated with renames, which would be great. However, time, time, time...
Fair point. I'll reach out to them.
I understand. We're in the same boat. I just wanted to raise this point in hopes it would raise the priority.
FYI, I have patched the latest version of CM1 to output VAPOR3-compatible files: https://github.com/leighorf/cm1r19.10
All, I just want to make sure that a bug a user is seeing is the same as this issue. First, the original file:

```
$ ncdump -v lev merra2.airs_aqua.mean3d.201810.nc4
netcdf merra2.airs_aqua.mean3d.201810 {
dimensions:
	lon = 576 ;
	lat = 361 ;
	lev = 117 ;
	time = UNLIMITED ; // (1 currently)
variables:
	double lon(lon) ;
		lon:long_name = "longitude" ;
		lon:units = "degrees_east" ;
	double lat(lat) ;
		lat:long_name = "latitude" ;
		lat:units = "degrees_north" ;
	double lev(lev) ;
		lev:long_name = "vertical level" ;
		lev:units = "hPa" ;
		lev:positive = "down" ;
...
		:LongitudeResolution = "0.625" ;
		:DataResolution = "0.5 x 0.625" ;
		:identifier_product_doi_authority = "http://dx.doi.org" ;
		:identifier_product_doi = "TBD" ;
data:

 lev = 229, 228, 227, 226, 225, 223, 222, 221, 220, 219, 218, 217, 216, 215,
    212, 208, 202, 193, 190, 186, 182, 181, 178, 177, 176, 174, 173, 172,
    171, 170, 169, 168, 167, 166, 130, 129, 128, 126, 125, 124, 123, 122,
    121, 119, 117, 116, 115, 114, 113, 110, 109, 107, 106, 105, 104, 103,
    102, 101, 100, 99, 98, 97, 96, 95, 94, 93, 92, 91, 90, 89, 88, 87, 86,
    85, 84, 83, 81, 80, 78, 77, 76, 75, 73, 72, 70, 68, 67, 66, 65, 64, 63,
    62, 61, 60, 59, 57, 56, 55, 53, 52, 50, 49, 47, 46, 45, 44, 31, 29, 21,
    18, 14, 13, 11, 10, 9, 6, 3 ;
}
```

After renaming both the variable and the dimension:

```
$ ncrename --history -v lev,levels -d lev,levels merra2.airs_aqua.mean3d.201810.nc4 foo.nc4
$ ncdump -v levels foo.nc4
netcdf foo {
dimensions:
	lon = 576 ;
	lat = 361 ;
	levels = 117 ;
	time = UNLIMITED ; // (1 currently)
variables:
	double lon(lon) ;
		lon:long_name = "longitude" ;
		lon:units = "degrees_east" ;
...
	double levels(levels) ;
		levels:long_name = "vertical level" ;
		levels:units = "hPa" ;
		levels:positive = "down" ;
// global attributes:
...
		:WorthernmostLatitude = "-180.0" ;
		:EasternmostLatitude = "179.375" ;
		:LatitudeResolution = "0.5" ;
		:LongitudeResolution = "0.625" ;
		:DataResolution = "0.5 x 0.625" ;
		:identifier_product_doi_authority = "http://dx.doi.org" ;
		:identifier_product_doi = "TBD" ;
data:

 levels = _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _ ;
}
```

I mean, it does rename lev to levels, but the data are all missing values. Obviously there's an issue, but not sure what. Now it does look like the "usual workaround" of:

```
$ ncks -h -3 merra2.airs_aqua.mean3d.201810.nc4 tmp.nc
$ ncrename --history -d lev,levels -v lev,levels tmp.nc tmp_rename.nc
$ ncks -h -4 tmp_rename.nc newfile.nc4
```

does work...though (weirdly?) the compression in the original file is lost, so you have to do:

but it does at least seem to preserve what's in the file:

```
$ ncdump -v levels newfile.nc4 | tail -15
		:LongitudeResolution = "0.625" ;
		:DataResolution = "0.5 x 0.625" ;
		:identifier_product_doi_authority = "http://dx.doi.org" ;
		:identifier_product_doi = "TBD" ;
data:

 levels = 229, 228, 227, 226, 225, 223, 222, 221, 220, 219, 218, 217, 216,
    215, 212, 208, 202, 193, 190, 186, 182, 181, 178, 177, 176, 174, 173,
    172, 171, 170, 169, 168, 167, 166, 130, 129, 128, 126, 125, 124, 123,
    122, 121, 119, 117, 116, 115, 114, 113, 110, 109, 107, 106, 105, 104,
    103, 102, 101, 100, 99, 98, 97, 96, 95, 94, 93, 92, 91, 90, 89, 88, 87,
    86, 85, 84, 83, 81, 80, 78, 77, 76, 75, 73, 72, 70, 68, 67, 66, 65, 64,
    63, 62, 61, 60, 59, 57, 56, 55, 53, 52, 50, 49, 47, 46, 45, 44, 31, 29,
    21, 18, 14, 13, 11, 10, 9, 6, 3 ;
}
```
Hi Matt,
@czender I tried that, but:

```
$ ncrename --history -v lev,levels merra2.airs_aqua.mean3d.201810.nc4 yaya.var.nc4
$ ncdump -v levels yaya.var.nc4 | tail -n 15
		:LongitudeResolution = "0.625" ;
		:DataResolution = "0.5 x 0.625" ;
		:identifier_product_doi_authority = "http://dx.doi.org" ;
		:identifier_product_doi = "TBD" ;
data:

 levels = 229, 228, 227, 226, 225, 223, 222, 221, 220, 219, 218, 217, 216,
    215, 212, 208, 202, 193, 190, 186, 182, 181, 178, 177, 176, 174, 173,
    172, 171, 170, 169, 168, 167, 166, 130, 129, 128, 126, 125, 124, 123,
    122, 121, 119, 117, 116, 115, 114, 113, 110, 109, 107, 106, 105, 104,
    103, 102, 101, 100, 99, 98, 97, 96, 95, 94, 93, 92, 91, 90, 89, 88, 87,
    86, 85, 84, 83, 81, 80, 78, 77, 76, 75, 73, 72, 70, 68, 67, 66, 65, 64,
    63, 62, 61, 60, 59, 57, 56, 55, 53, 52, 50, 49, 47, 46, 45, 44, 31, 29,
    21, 18, 14, 13, 11, 10, 9, 6, 3 ;
}
$ ncrename --history -d lev,levels yaya.var.nc4 yaya.var_and_dim.nc4
$ ncdump -v levels yaya.var_and_dim.nc4 | tail -n 15
		:WorthernmostLatitude = "-180.0" ;
		:EasternmostLatitude = "179.375" ;
		:LatitudeResolution = "0.5" ;
		:LongitudeResolution = "0.625" ;
		:DataResolution = "0.5 x 0.625" ;
		:identifier_product_doi_authority = "http://dx.doi.org" ;
		:identifier_product_doi = "TBD" ;
data:

 levels = _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _ ;
}
```

That
Ouch. No, you tried what I intended (because it sometimes works). I'd stick with the original "convert to netCDF3 (with -3, -5, or -6) first" workaround :)
I might have run into this issue in netcdf4-python, documented here: Unidata/netcdf4-python#1357. Therefore, I was wondering whether it is likely that this issue will be solved any time soon. I noticed this has been an open issue since 2017 already, so I guess this might not happen ever. Just looking for some expectation management.
Is this a connected issue? The current version of ncrename converts the x dimension to a y on a rename. Edit: this occurs in nco/5.0.6 but not nco/4.7.5. Let me know if I should move this comment to an issue on a different repo.

then gives me, from:

to
Yes, almost certainly this is due to the rename bug in libnetcdf. I suggest you use the standard workaround of converting to netCDF3, renaming, then converting back to netCDF4.
@czender ok, thanks. This issue has been sitting here since 2017. Do you know if there is any chance of this ever being fixed? I don't know anything about the future of netCDF4, but if necessary I can try to migrate away from netCDF to another format for these sorts of files when we need to do a lot of variable renaming.
I do not speak for netCDF developers. Personally, I wouldn't count on this problem, which is quite thorny, being fixed in the short term. I use CDF5 format for most things, and renaming always works; I convert to/from netCDF4 when I need its compression capabilities.
@czender fair enough! In my experience, people are not always aware whether they are performing operations on a nc3 or nc4 file. Would it be possible to make the ncrename tool either abort with an error when attempting to ncrename an nc4 file, or even internally convert to nc3, do the rename, and convert back? The errors from an ncrename of a netCDF4 file are not obvious (such as the above) and can be tricky to debug.
Ed, did you ever complete your changes for this problem?
Perhaps it is time to bite the bullet and get rid of dimension scales when writing new files.
From the forthcoming NCO 4.3.3 documentation:
Feedback welcome!
@czender thanks, that is looking great! Does the warning appear only if the input file is netCDF4? Is it possible for me to flag that I want warnings to be errors? Otherwise I can just check
Yes
No
Great, thanks all. Well, for what it's worth, you would get plenty of votes for a solution to this issue from the Met Office: at the moment ncrename issues are the main reason users have for avoiding netCDF4 entirely.
Another quick note on this: I would also be in favour of entirely removing the rename function for netCDF4 files from the library if it doesn't work. This kind of silent bug can really be a headache for people!