Masking not working properly with _Unsigned #794

Closed
dopplershift opened this issue Apr 18, 2018 · 14 comments

@dopplershift
Member

I have a Level 2 QPE product from GOES-16 that caused some support issues. The relevant CDL is:

netcdf satellite/goes16/GOES16/Products/RainRateQPE/FullDisk/current/OR_ABI-L2-RRQPEF-M3_G16_s20181072300427_e20181072311194_c20181072311280.nc {
  dimensions:
    y = 5424;
    x = 5424;
    number_of_time_bounds = 2;
    band = 1;
    number_of_image_bounds = 2;
    number_of_sunglint_angle_bounds = 2;
    number_of_LZA_bounds = 2;
    number_of_SZA_bounds = 2;
    number_of_lat_bounds = 2;
    number_of_rainfall_rate_bounds = 2;
  variables:
    short RRQPE(y=5424, x=5424);
      :_FillValue = -1S; // short
      :long_name = "ABI L2+ Rainfall Rate - Quantitative Prediction Estimate";
      :standard_name = "rainfall_rate";
      :_Unsigned = "true";
      :valid_range = 0S, -6S; // short
      :scale_factor = 0.00152602f; // float
      :add_offset = 0.0f; // float
      :units = "mm h-1";
      :resolution = "y: 0.000056 rad x: 0.000056 rad";
      :coordinates = "latitude retrieval_local_zenith_angle quantitative_local_zenith_angle solar_zenith_angle t y x";
      :grid_mapping = "goes_imager_projection";
      :cell_methods = "latitude: point (good quality pixel produced) retrieval_local_zenith_angle: point (good or degraded quality pixel produced) quantitative_local_zenith_angle: sum (good quality pixel produced) solar_zenith_angle: sum (good quality pixel produced) t: point area: point";
      :ancillary_variables = "DQF";

Note the values in valid_range; the values themselves are appropriate for a signed data type, but they only make sense as a range if you convert signed (-6) to unsigned (65530). The values in valid_range are not incorrect though, as the standards specify that the values need to be the same type as the variable.
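As a quick check of the signed-to-unsigned reading described above, here is a minimal sketch using only Python's standard library. The helper name `as_unsigned_short` is hypothetical and is not part of netCDF4-python; it only reinterprets the same 16 bits.

```python
import struct

def as_unsigned_short(value):
    """Reinterpret the 16-bit pattern of a signed short as an unsigned short."""
    # Pack as signed 16-bit, then unpack the same two bytes as unsigned 16-bit.
    return struct.unpack("<H", struct.pack("<h", value))[0]

# valid_range exactly as stored in the file (signed shorts)
valid_range_signed = (0, -6)
valid_range_unsigned = tuple(as_unsigned_short(v) for v in valid_range_signed)
print(valid_range_unsigned)  # (0, 65530)
```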

The current out-of-the-box behavior is that netCDF4-python returns an entirely masked variable. The workaround is to disable masking.
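A minimal sketch of that workaround, assuming the `Variable.set_auto_mask` method of netCDF4-python is available in the installed version. The function name `read_without_masking` and its default variable name are illustrative only. With masking off, fill values come back as ordinary numbers, so the caller has to deal with them.

```python
def read_without_masking(path, varname="RRQPE"):
    """Read a variable with netCDF4-python's automatic masking turned off."""
    # Imported lazily so this sketch can be loaded without netCDF4 installed.
    from netCDF4 import Dataset

    with Dataset(path) as nc:
        var = nc[varname]
        # Skip the valid_range/_FillValue masking step for this variable.
        var.set_auto_mask(False)
        return var[:]
```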

The correct behavior IMO is to have valid_range and friends be handled like the data values for unsigned purposes.

I've included the sample file.

@jswhit
Collaborator

jswhit commented Apr 19, 2018

I'm traveling so I won't be able to look at this till next week. Have you tried the latest master?

@jswhit
Collaborator

jswhit commented Apr 19, 2018

The valid range is assumed to be of the same type as the netcdf variable (signed short integer) and the conversion to unsigned short is considered to be part of the scale/offset operation (a numpy view is created after the mask is created).

@dopplershift
Member Author

In this case, valid_range is the same type as the variable. The problem is that valid_range is given as (0, -6). These are the same (and correct) bit patterns regardless of signed/unsigned interpretation. For the original signed data, masking values < 0 and > -6 produces useless results, whereas the same operation on the unsigned data, masking values < 0 and > 65530, produces the desired results.
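To make the difference concrete, the following sketch applies the same valid_range test to a handful of made-up raw values under both interpretations. The sample values and the helper `to_unsigned` are assumptions for illustration, not taken from the file or from netCDF4-python.

```python
def to_unsigned(v):
    """Interpret a signed 16-bit value as unsigned (same bit pattern)."""
    return v & 0xFFFF

# Made-up raw values as stored (signed shorts); -1 is the _FillValue
raw = [0, 100, 32767, -32768, -7, -6, -1]
vmin, vmax = 0, -6  # valid_range as stored

# Signed interpretation: nothing can satisfy 0 <= v <= -6, so all is masked
valid_signed = [v for v in raw if vmin <= v <= vmax]

# Unsigned interpretation: the range becomes 0..65530 and only the fill value fails
valid_unsigned = [
    to_unsigned(v)
    for v in raw
    if to_unsigned(vmin) <= to_unsigned(v) <= to_unsigned(vmax)
]

print(valid_signed)    # []
print(valid_unsigned)  # [0, 100, 32767, 32768, 65529, 65530]
```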

@jswhit
Collaborator

jswhit commented Apr 24, 2018

Yes, but isn't the valid_range (also missing_value, _FillValue) supposed to apply to the native variable data, which in this case is signed?

We are currently treating the _Unsigned attribute as part of the scaling operation, after the masking is applied.

@dopplershift
Member Author

Hmmm...I just found this in the netCDF User's Guide under Best Practices:

If the variable is unsigned the valid_range values should be widened if needed and stored as unsigned integers.

@lesserwhirls Does netCDF-java handle valid_range? If so, what does it do with _Unsigned combined with valid_range?

@lesserwhirls
Collaborator

@dopplershift - yes, netCDF-java tries to deal with valid_range. I'm not sure of the details, as the code has changed between 4.6.x and 5.0. @cwardgar was in that code recently to deal with _FillValue, so he may have the best understanding at this point.

@cwardgar

Does netCDF-java handle valid_range? If so, what does it do with _Unsigned combined with valid_range?

Yes it does. First, it widens valid_range to the next largest integral type. This allows a bit pattern which previously may have been interpreted as negative (because e.g. we're storing an unsigned short in a short) to be properly interpreted as a non-negative number.

Then, it applies scale and offset. The result will be a double. For the dataset you provided, NJ calculates valid_min == 0 and valid_max == 100.00009070616215. That seems correct, yeah?
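The two steps can be reproduced with a rough sketch in plain Python. This is not netCDF-java's code; the function name `widen_and_scale` is hypothetical, and the decimal literal for scale_factor is used instead of the float32 value stored in the file, so the trailing digits differ slightly from the number quoted above.

```python
SCALE_FACTOR = 0.00152602  # from the CDL (stored as float32 in the file)
ADD_OFFSET = 0.0

def widen_and_scale(raw):
    """Widen a signed short to its unsigned value, then apply scale and offset."""
    widened = raw & 0xFFFF  # -6 becomes 65530
    return widened * SCALE_FACTOR + ADD_OFFSET

valid_min = widen_and_scale(0)
valid_max = widen_and_scale(-6)
print(valid_min, valid_max)  # valid_max is approximately 100.00009
```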

@jswhit
Collaborator

jswhit commented Apr 26, 2018

Does netCDF-java do the same with _FillValue and missing_value? (cast to the larger integral type)

@cwardgar

@jswhit Yes, missing_value is widened first. _FillValue is not! That's likely a bug. Thanks for pointing that out.

And to be clear, valid_* and missing_value are widened before scale/offset are applied, not merely cast. For example:

        short s = -6;
        System.out.println((int) s);     // Cast: -6
        System.out.println(s & 0xffff);  // Widen: 65530

The problem that I see with _FillValue is that it is being cast (to double) before scale/offset right now, not widened.
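A small sketch of why that matters, using the _FillValue of -1 from the CDL. It assumes the fill value is compared against data that has already been interpreted as unsigned; the sample data values are made up and this is not netCDF-java's actual logic.

```python
fill_raw = -1  # _FillValue as stored (signed short)

cast_fill = float(fill_raw)              # cast: -1.0
widened_fill = float(fill_raw & 0xFFFF)  # widen: 65535.0

# Made-up data, already interpreted as unsigned shorts
data_unsigned = [0, 120, 65535]

# A cast fill value never matches unsigned data, so fill cells slip through
missing_with_cast = [v for v in data_unsigned if v == cast_fill]
missing_with_widen = [v for v in data_unsigned if v == widened_fill]

print(missing_with_cast)   # []
print(missing_with_widen)  # [65535]
```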

@jswhit
Collaborator

jswhit commented Apr 27, 2018

With the changes in pull request #797, the following script

from netCDF4 import Dataset
import matplotlib.pyplot as plt
# Open the sample file and read RRQPE with automatic masking and scaling
nc = Dataset('OR_ABI-L2-RRQPEF-M3_G16_s20181072300427_e20181072311194_c20181072311280.nc')
data = nc['RRQPE'][:]
print(data.dtype, data.min(), data.max())
# Plot the rain rate (mm h-1) on a fixed 0-100 color scale
plt.imshow(data, cmap=plt.cm.jet, vmin=0, vmax=100)
plt.colorbar()
plt.show()

produces

float32 0.0 100.00009

and the attached png file.

Can someone try this with netcdf-java and see if they get the same?

[Image: issue794]

@lesserwhirls
Collaborator

That's what I get using toolsUI.

@jswhit
Collaborator

jswhit commented May 1, 2018

pull request #797 merged

jswhit closed this as completed May 1, 2018

@ghost

ghost commented Jul 18, 2018

I tried printing the valid_range of the dataset and still got [0, -6]. Is it supposed to be [0, 100] or [0.0, 100.0]?

from netCDF4 import Dataset

nc = Dataset('OR_ABI-L2-RRQPEF-M3_G16_s20181072300427_e20181072311194_c20181072311280.nc')
data = nc['RRQPE'][:]
print(data.dtype, data.min(), data.max())
# The attribute is returned exactly as stored in the file
print(nc['RRQPE'].getncattr('valid_range'))

float32 0.0 100.00009

[ 0, -6]

@dopplershift
Member Author

The fix does not make the library change the valid_range attribute; it only fixes the automatic masking so that it interprets the range and the data as unsigned. IMO, changing attributes is outside the scope here.
