💪 handle NaNs #16
Comments
What I tried in PR #18
Transform floats to integer dtypes & perform comparisons in that data space
How: use the same projection (bitwise transformation & transmutation) as we did for f16, see #1
The projection
⬇️ illustration of how this projection (mapping
Other really nice to have features:
TODOs
Some thoughts
@varon - I just updated this issue (added a description and explained in the comment above what I tried in PR #18). If you can find the time, I would love to hear your feedback on this! :) Some further (optional) questions that would be interesting to discuss:
We want to properly handle NaNs.
What are NaNs?
NaNs are only present in float datatypes and thus do not exist in (unsigned) integer datatypes.
For simplicity I will illustrate things for 16-bit numbers.
Some background ⬇️
Int16 representation
https://steffanynaranjo.medium.com/how-integers-are-stored-in-memory-using-twos-complement-8b4237f7426d
Float16 representation
!! We are dealing with fp16 here, NOT bfloat16
NaNs occur when the exponent is all 1s and the mantissa is not all 0s (the sign does not matter).
What do infinities look like?
+inf occurs when the sign is 0, the exponent is all 1s, and the mantissa is all 0s.
-inf occurs when the sign is 1, the exponent is all 1s, and the mantissa is all 0s.
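The bit-pattern rules above can be checked directly on raw 16-bit values. The following is a minimal sketch, assuming the standard IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits); the constant and function names are made up for illustration and are not taken from this project.

```rust
// Masks for the fp16 layout: s eeeee mmmmmmmmmm (1 + 5 + 10 bits).
// Names are illustrative only, not part of any crate API.
const EXP_MASK: u16 = 0x7C00; // the 5 exponent bits
const MAN_MASK: u16 = 0x03FF; // the 10 mantissa bits

/// NaN: exponent all 1s and mantissa not all 0s; the sign bit is ignored.
pub fn is_nan_f16_bits(bits: u16) -> bool {
    (bits & EXP_MASK) == EXP_MASK && (bits & MAN_MASK) != 0
}

/// Infinity: exponent all 1s and mantissa all 0s; the sign bit selects +inf or -inf.
pub fn is_inf_f16_bits(bits: u16) -> bool {
    (bits & EXP_MASK) == EXP_MASK && (bits & MAN_MASK) == 0
}
```

For instance, `0x7C00` and `0xFC00` are +inf and -inf, while `0x7E00` and `0xFE00` are both NaNs because only the sign bit differs.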
How does the current implementation cope with NaNs?
Necessary background info: every comparison (gt, lt, eq) with NaNs results in false.
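To make the consequence of this rule concrete, here is a small scalar sketch (not the SIMD code from this project, and the function names are hypothetical). It shows that a "compare, then update" loop never picks up a NaN it meets later, but also never replaces a NaN it starts with.

```rust
/// Returns the results of (NaN > x, NaN < x, NaN == x); all three are false.
pub fn nan_comparisons(x: f32) -> (bool, bool, bool) {
    let nan = f32::NAN;
    (nan > x, nan < x, nan == x)
}

/// Naive scalar argmax using the "compare, then update" pattern.
/// Illustrative only; assumes `data` is non-empty.
pub fn naive_argmax(data: &[f32]) -> usize {
    let mut best = 0;
    for (i, &v) in data.iter().enumerate().skip(1) {
        // This comparison is false whenever `v` or `data[best]` is NaN,
        // so a NaN never replaces the current best, and a NaN best is never replaced.
        if v > data[best] {
            best = i;
        }
    }
    best
}
```

With `[1.0, NaN, 3.0, 2.0]` the NaN is skipped and the result is index 2; with `[NaN, 1.0, 3.0]` the initial NaN persists and the result is index 0.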
⬇️
=> The current implementation deals with NaNs in the data as follows:
[…] false and thus no NaNs will be added in the accumulating SIMD vector) -> np.nanargmax / np.nanargmin behavior
[…] false and thus the NaNs will never be updated).
TODOs
[…] NaN values in the first SIMD vector -> if we can create/ensure a "NaN-free" initial SIMD vec, then no NaNs will be added in the inner SIMD loop => we have an implementation of np.nanargmin / np.nanargmax.
How should we handle NaNs?
Ideally we support, just like numpy, two variants of the argmin / argmax algorithm:
Ignore NaNs -> corresponds to np.nanargmin / np.nanargmax
Return the index of a NaN if one is present -> corresponds to np.argmin / np.argmax
We can serve both functionalities by adding a new function to the ArgMinMax trait and letting it default, for non-float datatypes, to the current implementation.
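One way this could look is sketched below. It is a scalar illustration only: the trait name `ArgMinMaxSketch` and the method name `argminmax_return_nan` are hypothetical and not the crate's actual API, and which of the two variants ends up being the "new" function is an assumption. Integer slices implement only the required method and inherit the default; float slices override it.

```rust
/// Hypothetical sketch of a trait with a defaulted NaN-aware method.
pub trait ArgMinMaxSketch {
    /// Returns (index of min, index of max). For floats this variant ignores NaNs.
    fn argminmax(&self) -> (usize, usize);

    /// Variant that reports a NaN if present. Types that cannot hold NaNs
    /// simply fall back to `argminmax`.
    fn argminmax_return_nan(&self) -> (usize, usize) {
        self.argminmax()
    }
}

impl ArgMinMaxSketch for [i32] {
    // Integers have no NaNs, so only the required method is implemented.
    // Assumes a non-empty slice.
    fn argminmax(&self) -> (usize, usize) {
        let (mut imin, mut imax) = (0, 0);
        for (i, &v) in self.iter().enumerate().skip(1) {
            if v < self[imin] { imin = i; }
            if v > self[imax] { imax = i; }
        }
        (imin, imax)
    }
}

impl ArgMinMaxSketch for [f32] {
    // Ignore NaNs by starting from the first non-NaN element: afterwards every
    // comparison with a NaN is false, so no NaN can become the running min/max.
    // Assumes a non-empty slice; an all-NaN slice yields (0, 0).
    fn argminmax(&self) -> (usize, usize) {
        let start = self.iter().position(|v| !v.is_nan()).unwrap_or(0);
        let (mut imin, mut imax) = (start, start);
        for (i, &v) in self.iter().enumerate().skip(start + 1) {
            if v < self[imin] { imin = i; }
            if v > self[imax] { imax = i; }
        }
        (imin, imax)
    }

    // Return the index of the first NaN for both min and max, if there is one.
    fn argminmax_return_nan(&self) -> (usize, usize) {
        match self.iter().position(|v| v.is_nan()) {
            Some(i) => (i, i),
            None => self.argminmax(),
        }
    }
}
```

Starting from a NaN-free element mirrors the "NaN-free initial SIMD vec" idea from the TODOs in scalar form; a SIMD implementation would need an equivalent initialization step.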