-
Notifications
You must be signed in to change notification settings - Fork 264
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Added more descriptive error message if indexing with all False bool … #1197
Conversation
Seems like an empty array should be returned, instead of raising an error. |
My experience has been that 10 times/10, doing this is erroneous, since loading nothing from a file is nonsensical. I would rather it fail fast than silently, since on the rare occasion one does want to load nothing from a file, they can exception handle the error. |
That's OK with me, I only say that because that's why numpy does and it might be what many expect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like elem can still be an iterable here, in that case the individual elements of elem need to be checked to see if they are all False boolean areas.
We are already checking for boolean index arrays (in order to convert them to integer index arrays), so how about this: diff --git a/src/netCDF4/utils.py b/src/netCDF4/utils.py
index c96cc757..dcfeca85 100644
--- a/src/netCDF4/utils.py
+++ b/src/netCDF4/utils.py
@@ -238,6 +238,10 @@ def _StartCountStride(elem, shape, dimensions=None, grp=None, datashape=None,\
unlim = False
# convert boolean index to integer array.
if np.iterable(ea) and ea.dtype.kind =='b':
+ # check that boolean array is not all False.
+ if not ea.any():
+ msg='Boolean index array is all False, at least one element must be True'
+ raise IndexError(msg)
# check that boolean array not too long
if not unlim and shape[i] != len(ea):
msg=""" Also, please add a test and a Changelog entry |
Tests are failing when an all False boolean index array is used on assignment, which should do nothing (but not raise an exception). The fix is to only do the check when |
this should fix it diff --git a/src/netCDF4/utils.py b/src/netCDF4/utils.py
index c96cc757..dcfeca85 100644
--- a/src/netCDF4/utils.py
+++ b/src/netCDF4/utils.py
@@ -238,6 +238,10 @@ def _StartCountStride(elem, shape, dimensions=None, grp=None, datashape=None,\
unlim = False
# convert boolean index to integer array.
if np.iterable(ea) and ea.dtype.kind =='b':
+ # check that boolean array is not all False when reading.
+ if not put and not ea.any():
+ msg='Boolean index array is all False, at least one element must be True'
+ raise IndexError(msg)
# check that boolean array not too long
if not unlim and shape[i] != len(ea):
msg=""" |
I discovered that slicing a 1d variable with an all False boolean index array does return an empty array (consistent with numpy). An exception is raised on when slicing multi-dimensional arrays. PR #1198 ensures that empty arrays are always returned. Given that this is the current behavior for 1d vars, and that's what numpy does, I think this is the preferred solution. |
There should at least be a warning. |
I am not convinced - curious what others think. Seems to me if you slice with an all False boolean array, you are getting back exactly what you asked for (an empty array). |
Plus the fact that this is what numpy does is a pretty big precendent. |
Sorry I'm so late returning to this, although it seems I'm too late. I think we were talking about different problems. I wasn't so much concerned with whether an error should occur or something should be returned. My concern was that enough information was provided to users (both developers and application users) that potential errors aren't obscured and made difficult to diagnose. The boolean array can be determined programmatically, and users may not know exactly what they are asking for or even what's in the .nc file (e.g. when iterating over many files). In this case, a returned empty array would be propagated to far removed parts of user code. A warning would help diagnose indexing errors on that empty array that would eventually occur, and the fundamental behavior would still be consistent with numpy. If users want behavior like this to be erroneous, they can elevate warnings to errors themselves. So it's the best of both worlds. Lastly, I don't think the warning would be a nuisance to anyone, since until now it had been an error. In fact, returning an empty array silently may break exception handling code for people that relied on this being an error in the past. |
If one tries to load data using an all-False bool array, it returns an unhelpful error message that doesn't quite describe the problem.
This pull request catches the case where an invalid all-False bool array is used to load data and adds a more descriptive error message.
Cheers,
Tam