🚧 POC - support NaNs for SSE & AVX2 f32 #18
CodSpeed Performance Report: Merging #18
@varon I just quickly wrote out what I pursued in this PR (it is more a POC than a proper PR). P.S.: I'll improve the phrasing tomorrow - just wanted to quickly push & share the code 🙃
When I run the benchmarks on my local machine I notice only a 3-4% regression 🤔
Is this superseded by #21?
Yes! This was some sort of a proof of concept, showcasing the utility of |
This PR aims at handling NaNs (#16).
For the scalar implementation I check whether the scalar value is not equal to itself. This check is only true if the scalar value is NaN, since in Rust `f32::NAN != f32::NAN` holds while every other `f32` value compares equal to itself.
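A minimal sketch of that self-inequality check, for illustration only. The helper name `first_nan_index` is hypothetical and not taken from this PR; it scans a slice and reports the position of the first NaN, if any.

```rust
/// Hypothetical helper (not part of this PR): returns the index of the
/// first NaN in `data`, using the "value is not equal to itself" check.
fn first_nan_index(data: &[f32]) -> Option<usize> {
    // `v != v` is true only for NaN: IEEE 754 comparisons involving NaN
    // are unordered, so a NaN never compares equal, not even to itself.
    data.iter().position(|&v| v != v)
}

/// The same check on a single value; equivalent to `v.is_nan()`.
fn is_nan_by_self_compare(v: f32) -> bool {
    v != v
}
```

For example, `first_nan_index(&[1.0, -3.5, f32::NAN, 2.0])` yields `Some(2)`, while infinities are not flagged because `f32::INFINITY == f32::INFINITY`.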
For the SIMD implementation I used a transformation similar to the one used in #1. This transformation projects the NaNs to integer values that are either higher or lower than the "real" floating point values. The transformation leverages the two's complement representation (for a visualization of the half-precision bit layout, see https://observablehq.com/@rreusser/half-precision-floating-point-visualized).
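To make the projection concrete, here is a scalar sketch of one common order-preserving float-to-integer mapping, written for `f32`. It is an assumption that this matches the transformation in the PR; the actual code works on SIMD registers and may differ in detail. The names `f32_to_ordinal_i32` and `is_projected_nan` are hypothetical.

```rust
/// Hypothetical helper (an assumption, not the PR's code): maps an f32 to
/// an i32 whose signed ordering matches the float ordering of non-NaN values.
fn f32_to_ordinal_i32(v: f32) -> i32 {
    // Reinterpret the IEEE 754 bit pattern as a two's complement integer.
    let bits = v.to_bits() as i32;
    // The arithmetic shift yields all ones for negative floats, zero otherwise.
    // Masking off the top bit keeps the sign and flips the remaining 31 bits
    // of negative inputs, which reverses their (otherwise inverted) order.
    bits ^ ((bits >> 31) & 0x7FFF_FFFF)
}

/// Hypothetical check: a projected value outside the range spanned by
/// -inf and +inf can only come from a NaN.
fn is_projected_nan(ordinal: i32) -> bool {
    ordinal > f32_to_ordinal_i32(f32::INFINITY)
        || ordinal < f32_to_ordinal_i32(f32::NEG_INFINITY)
}
```

Under this mapping a NaN with sign bit 0 lands above the projection of `+inf`, and a NaN with sign bit 1 lands below the projection of `-inf`, so a single comparison direction catches only one of the two groups.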
Some remarks:
- `+inf` & `-inf` will get projected as well => this is indeed the case - see plot ⬇️
- For a NaN, the exponent bits should all be set (`11111` in half precision) and the fraction should be non-zero. Thus the sign bit may be 1 or 0, resulting in half of the NaNs getting projected above and the other half below the "real" floating point values. Thus, only 1 of the 2 checks (either > or <) will fire and thereby detect the `NaN`. This might be a problem when we want to implement `argmin` and `argmax` as separate functions.

Paths I looked into but did not seem worthwhile: