better detection of corrupted HDF5 file headers #405
Conversation
Ncdump is aborting when reading a damaged HDF5 file. The problem was that the file magic-number test code in both nc4file.c and dfile.c does not test the full HDF5 magic header. It only looks for "XHDF", where X is any character. The test should be for the full eight bytes of "\211HDF\r\n\032\n". Also, to avoid inconsistency, I have removed the check code from nc4file.c and made it use NC_check_file_type in dfile.c. Also added a test case.
Additionally, the NC_check_file_type code does not do the forward search required by HDF5 files. It currently only looks at file position 0 instead of 512, 1024, 2048, ... Also, it turns out that the HDF4 magic number is assumed to always be at the beginning of the file (unlike HDF5). The change is localized to libdispatch/dfile.c. See https://support.hdfgroup.org/release4/doc/DSpec_html/DS.pdf
HatTip: cwardgar. In any case, clean up the NC_check_file_type code to correctly search for the HDF5 magic number.
Misc. other changes:
1. Clean up the NAN definition code in ncgen.
2. Add -N flag to ncgen to force the netcdf dataset name in the output.
3. Reduce the size of ncdump/test_corrupt_magic.nc.
4. Remove calls to nc_finalize in various test programs in nc_test4.
5. nc_test/run_diskless2 was not using test_common.
Further header changes: …
This tests that hdf5 files with an offset are properly recognized.
Hold this pull request until 4.5 is released.
Some conflicts; @DennisHeimbigner will resolve the conflicts and merge into the 4.5.0 release branch instead of …
Working through the pull requests on the branch: http://cdash.unidata.ucar.edu/testDetails.php?test=66314&build=3580
The travis.ci tests aren't catching this and I'm not sure why. More work on my end to narrow down what's going on. I've observed it on OSX and Linux so far.
Ok, somehow on my system, the file …
Ok, changing the shell to …
Bash is the right choice. Probably should check other .sh files to see if they …
Perhaps that test case should not try to construct the l64 file but rather …
Modified the offset test case so that it used a pre-constructed binary file rather than using echo to create it on the fly. This should ensure that there are no extraneous newlines or such.
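A hedged sketch of how an offset test input can be built from a pre-constructed file without echo: prepend a block of zero bytes to a known-good file with dd and cat. The file names here are placeholders (the stand-in payload is created inline so the sketch is self-contained), not the actual names used by tst_hdf5_offset.sh.

```shell
# Stand-in for a pre-constructed known-good netCDF-4 file (7 bytes).
printf 'PAYLOAD' > ref_good.nc

# Prepend exactly 512 zero bytes; dd avoids any stray newlines that
# echo might introduce.
dd if=/dev/zero of=prefix.bin bs=512 count=1 2>/dev/null
cat prefix.bin ref_good.nc > offset512.nc

wc -c < offset512.nc   # 512-byte prefix + 7-byte payload = 519 bytes
```

Repeating this with 1024, 2048, ... byte prefixes exercises each superblock offset the forward search must check.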
@DennisHeimbigner Did you still want to take a look at these conflicts?
Yes.
Ok, conflicts resolved; not sure why so many spurious modifications, but oh well.
This needs to be reopened and fixed, as it is causing issues with parallel tests, particularly when pnetcdf is enabled. I've reverted the merge locally and will push to github shortly. See #445 for more info.
Ok. Reverting this merge doesn't negate the merge. I think I see how to get back to where we're at so this branch (…
Ok. I'm going to issue a new pull request since this one is closed, but I won't merge until we get these parallel failures sorted out.
re pull request #405
re pull request #446
Notes:
1. This branch is a cleanup of the magic.dmh branch.
2. magic.dmh was originally merged, but caused problems with parallel IO. It was re-issued as pull request #446.
3. This branch and pull request replace any previous pull requests and the magic.dmh branch.

Given an otherwise valid netCDF file that has a corrupted header, the netcdf library currently crashes. Instead, it should return NC_ENOTNC. Additionally, the NC_check_file_type code does not do the forward search required by HDF5 files. It currently only looks at file position 0 instead of 512, 1024, 2048, ... Also, it turns out that the HDF4 magic number is assumed to always be at the beginning of the file (unlike HDF5). The change is localized to libdispatch/dfile.c. See https://support.hdfgroup.org/release4/doc/DSpec_html/DS.pdf
Also, it turns out that the code in NC_check_file_type is duplicated (mostly) in the function libsrc4/nc4file.c#nc_check_for_hdf.

This branch does the following:
1. Make NC_check_file_type return NC_ENOTNC instead of crashing.
2. Remove nc_check_for_hdf and centralize all file format checking in NC_check_file_type.
3. Add a proper forward search for HDF5 files (but not HDF4 files) to look for the magic number at offsets of 0, 512, 1024, ...
4. Add test tst_hdf5_offset.sh. This tests that hdf5 files with an offset are properly recognized. It does so by prefixing a legal file with some number of zero bytes: 512, 1024, etc.
5. Off-topic: Cleaned up handling of NAN and INFINITE in ncgen.
6. Off-topic: Added -N flag to ncdump to force a specific output dataset name.
re e-support (UBS-599337)