Masking not working properly with _Unsigned #794

dopplershift · 2018-04-18T23:17:14Z

I have a Level 2 QPE product from GOES-16 that caused some support issues. The relevant CDL is:

netcdf satellite/goes16/GOES16/Products/RainRateQPE/FullDisk/current/OR_ABI-L2-RRQPEF-M3_G16_s20181072300427_e20181072311194_c20181072311280.nc {
  dimensions:
    y = 5424;
    x = 5424;
    number_of_time_bounds = 2;
    band = 1;
    number_of_image_bounds = 2;
    number_of_sunglint_angle_bounds = 2;
    number_of_LZA_bounds = 2;
    number_of_SZA_bounds = 2;
    number_of_lat_bounds = 2;
    number_of_rainfall_rate_bounds = 2;
  variables:
    short RRQPE(y=5424, x=5424);
      :_FillValue = -1S; // short
      :long_name = "ABI L2+ Rainfall Rate - Quantitative Prediction Estimate";
      :standard_name = "rainfall_rate";
      :_Unsigned = "true";
      :valid_range = 0S, -6S; // short
      :scale_factor = 0.00152602f; // float
      :add_offset = 0.0f; // float
      :units = "mm h-1";
      :resolution = "y: 0.000056 rad x: 0.000056 rad";
      :coordinates = "latitude retrieval_local_zenith_angle quantitative_local_zenith_angle solar_zenith_angle t y x";
      :grid_mapping = "goes_imager_projection";
      :cell_methods = "latitude: point (good quality pixel produced) retrieval_local_zenith_angle: point (good or degraded quality pixel produced) quantitative_local_zenith_angle: sum (good quality pixel produced) solar_zenith_angle: sum (good quality pixel produced) t: point area: point";
      :ancillary_variables = "DQF";

Note the values in valid_range; the values themselves are appropriate for a signed data type, but they only make sense as a range if you convert signed (-6) to unsigned (65530). The values in valid_range are not incorrect though, as the standards specify that the values need to be the same type as the variable.
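To make the reinterpretation concrete, here is a minimal sketch using only Python's standard library. The helper name `as_unsigned16` is made up for illustration and is not part of netCDF4-python.

```python
import struct

def as_unsigned16(value):
    """Reinterpret the bit pattern of a signed 16-bit integer as unsigned."""
    # Pack as a signed short, then read the same two bytes back as unsigned.
    return struct.unpack("<H", struct.pack("<h", value))[0]

# valid_range exactly as stored in the file (signed short).
valid_range_signed = (0, -6)
valid_range_unsigned = tuple(as_unsigned16(v) for v in valid_range_signed)
print(valid_range_unsigned)  # (0, 65530)
```

The stored bits are identical in both cases; only the interpretation changes.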

The current out-of-the-box behavior is that netCDF4-python returns an entirely masked variable. The workaround is to disable masking.

The correct behavior IMO is to have valid_range and friends be handled like the data values for unsigned purposes.

I've included the sample file.

jswhit · 2018-04-19T07:18:18Z

I'm traveling so I won't be able to look at this till next week. Have you tried the latest master?

jswhit · 2018-04-19T08:18:27Z

The valid range is assumed to be of the same type as the netcdf variable (signed short integer) and the conversion to unsigned short is considered to be part of the scale/offset operation (a numpy view is created after the mask is created).

dopplershift · 2018-04-19T21:54:20Z

In this case, valid_range is the same type as the variable. The problem is that valid_range is given as (0, -6). These are the same (and correct) bit patterns regardless of signed/unsigned. For the original signed data, masking values < 0 and > -6 produces useless results, whereas the same operation on the unsigned data, masking values < 0 and > 65530, produces the desired results.
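The difference between the two masking orders can be sketched in plain Python. This is an illustration only, not the netCDF4-python implementation; the helper names `to_unsigned16` and `outside_range` and the sample values are assumptions made up for the example.

```python
def to_unsigned16(value):
    """Widen a signed 16-bit value to its unsigned interpretation."""
    return value & 0xFFFF

def outside_range(value, valid_min, valid_max):
    """True when valid_range says the value should be masked."""
    return value < valid_min or value > valid_max

# Made-up raw signed shorts, as they would sit on disk.
stored = [0, 100, 32767, -32768, -6, -1]

# Signed valid_range (0, -6): the minimum exceeds the maximum,
# so every value falls outside the range and gets masked.
signed_mask = [outside_range(v, 0, -6) for v in stored]

# Convert both the data and valid_range to unsigned before comparing.
unsigned_max = to_unsigned16(-6)  # 65530
unsigned_mask = [outside_range(to_unsigned16(v), 0, unsigned_max) for v in stored]

print(all(signed_mask))  # True
print(unsigned_mask)     # [False, False, False, False, False, True]
```

Only the last sample (-1, which is also the _FillValue bit pattern) is masked in the unsigned case.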

jswhit · 2018-04-24T23:06:32Z

Yes, but isn't the valid_range (also missing_value, _FillValue) supposed to apply to the native variable data, which in this case is signed?

We are currently treating the _Unsigned attribute as part of the scaling operation, after the masking is applied.

dopplershift · 2018-04-26T03:09:19Z

Hmmm...I just found this in the netCDF User's Guide under Best Practices:

If the variable is unsigned the valid_range values should be widened if needed and stored as unsigned integers.

@lesserwhirls Does netCDF-java handle valid_range? If so, what does it do with _Unsigned combined with valid_range?

lesserwhirls · 2018-04-26T14:15:41Z

@dopplershift - yes, netCDF-java tries to deal with valid_range. I'm not sure of the details, as the code has changed between 4.6.x and 5.0. @cwardgar was in that code recently to deal with _FillValue, so he may have the best understanding at this point.

cwardgar · 2018-04-26T18:53:50Z

Does netCDF-java handle valid_range? If so, what does it do with _Unsigned combined with valid_range?

Yes it does. First, it widens valid_range to the next largest integral type. This allows a bit pattern which previously may have been interpreted as negative (because e.g. we're storing an unsigned short in a short) to be properly interpreted as a non-negative number.

Then, it applies scale and offset. The result will be a double. For the dataset you provided, NJ calculates valid_min == 0 and valid_max == 100.00009070616215. That seems correct, yeah?
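The two steps described here (widen, then apply scale and offset) can be approximated in Python as a rough check of those numbers. This is a sketch under assumptions, not netCDF-java's code: the helper names are hypothetical, and the scale factor is rounded to float32 on the assumption that this is how the attribute is stored.

```python
import struct

SCALE_FACTOR = 0.00152602  # scale_factor attribute from the CDL
ADD_OFFSET = 0.0           # add_offset attribute from the CDL

def as_float32(value):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

def widen_then_scale(raw, scale, offset):
    """Widen a signed short to unsigned, then apply scale and offset."""
    return (raw & 0xFFFF) * scale + offset

scale = as_float32(SCALE_FACTOR)
valid_min = widen_then_scale(0, scale, ADD_OFFSET)
valid_max = widen_then_scale(-6, scale, ADD_OFFSET)
print(valid_min, valid_max)  # 0.0 and roughly 100.00009
```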

jswhit · 2018-04-26T19:05:16Z

Does netCDF-java do the same with _FillValue and missing_value? (cast to the larger integral type)

cwardgar · 2018-04-26T19:48:47Z

@jswhit Yes, missing_value is widened first. _FillValue is not! That's likely a bug. Thanks for pointing that out.

And to be clear, valid_* and missing_value are widened before scale/offset are applied, not merely cast. For example:

        short s = -6;
        System.out.println((int) s);     // Cast: -6
        System.out.println(s & 0xffff);  // Widen: 65530

The problem that I see with _FillValue is that it is being cast (to double) before scale/offset right now, not widened.
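A small Python sketch of why that distinction matters for _FillValue. The thread does not say at which stage netCDF-java compares fill values, so this only shows the numeric difference between the two conversions; the variable names are made up for the example.

```python
SCALE_FACTOR = 0.00152602  # scale_factor attribute from the CDL
ADD_OFFSET = 0.0           # add_offset attribute from the CDL
FILL_VALUE = -1            # _FillValue as stored (signed short)

# Cast: keep the signed value, then apply scale and offset.
cast_fill = float(FILL_VALUE) * SCALE_FACTOR + ADD_OFFSET

# Widen: reinterpret as unsigned (65535), then apply scale and offset.
widened_fill = (FILL_VALUE & 0xFFFF) * SCALE_FACTOR + ADD_OFFSET

# A raw data value carrying the fill bit pattern, unpacked as unsigned.
raw = -1
unpacked = (raw & 0xFFFF) * SCALE_FACTOR + ADD_OFFSET

print(unpacked == widened_fill)  # True
print(unpacked == cast_fill)     # False
```

With the cast, the fill value ends up slightly negative and can never equal an unpacked unsigned data value.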

jswhit · 2018-04-27T15:34:31Z

With the changes in pull request #797, the following script

from netCDF4 import Dataset
import matplotlib.pyplot as plt

nc = Dataset('OR_ABI-L2-RRQPEF-M3_G16_s20181072300427_e20181072311194_c20181072311280.nc')
# Read the whole variable; masking and scaling are applied automatically.
data = nc['RRQPE'][:]
print(data.dtype, data.min(), data.max())
plt.imshow(data, cmap=plt.cm.jet, vmin=0, vmax=100)
plt.colorbar()
plt.show()

produces

float32 0.0 100.00009

and the attached png file.

Can someone try this with netcdf-java and see if they get the same?

lesserwhirls · 2018-04-27T22:11:02Z

That's what I get using toolsUI.

jswhit · 2018-05-01T15:50:51Z

pull request #797 merged

ghost · 2018-07-18T17:29:12Z

I tried printing the valid_range of the dataset and still got [0, -6]. Is it supposed to be [0, 100] or [0.0, 100.0]?

from netCDF4 import Dataset

nc = Dataset('OR_ABI-L2-RRQPEF-M3_G16_s20181072300427_e20181072311194_c20181072311280.nc')
data = nc['RRQPE'][:]
print(data.dtype, data.min(), data.max())
# The attribute is returned exactly as stored in the file.
print(nc['RRQPE'].getncattr('valid_range'))

float32 0.0 100.00009

[ 0, -6]

dopplershift · 2018-07-25T00:11:14Z

The fix does not make the library change the valid_range attribute; it only makes the automatic masking use the properly converted values. IMO, changing attributes is outside the scope here.

jswhit added a commit that referenced this issue Apr 27, 2018

widen valid_min/valid_max if _Unsigned is set (issue #794)

435f3c4

jswhit mentioned this issue Apr 27, 2018

convert valid_min/valid_max/_FillValue to unsigned type if _Unsigned set #797

Merged

jswhit closed this as completed May 1, 2018
