ARM performance library comparisons #4744
Dear OpenBLAS team,
Just curious how OpenBLAS on ARM compares to Arm's official performance library, the Arm Performance Libraries (https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Libraries), on ARM CPUs. That library targets ARM CPUs only.
Thanks,
Jianshu

Comments
I don't have data for that right now; it's best if you do the comparison on the hardware and functions you want to use. I expect performance will be pretty similar for GEMM, but OpenBLAS does not yet have SVE kernels for every function where they would make sense (e.g. the complex dot product has fairly poor performance).
I will probably run some tests on M1 chips, since both libraries can be easily installed on my Mac. What other functions do you suggest I test besides GEMM? Thanks, Jianshu
I'd think DOT and AXPY, as a number of other BLAS functions can/will be implemented in terms of them. Maybe TRMM for completeness (it will not necessarily be close to GEMM). Of course the M1 is a bit special in that it does not provide SVE support; on the other hand, if you (can) include Apple's own Accelerate library in your comparison, you gain access to the (officially) undocumented matrix math coprocessor (AMX).
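For reference, here is a minimal timing sketch of how such a comparison could be run against any CBLAS implementation (OpenBLAS, Arm Performance Libraries, or Accelerate). The matrix size, vector length, repetition count, and link flags mentioned in the comments are illustrative assumptions, not recommendations:

```c
/* Minimal timing sketch: the same CBLAS calls can be linked against
 * OpenBLAS, Arm Performance Libraries, or Accelerate and timed side by side.
 * Example (assumed) link flags: -lopenblas, -larmpl, or -framework Accelerate
 * (with Accelerate, include <Accelerate/Accelerate.h> instead of <cblas.h>). */
#include <cblas.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + 1e-9 * ts.tv_nsec;
}

int main(void) {
    const int n = 2048;        /* matrix dimension (arbitrary choice) */
    const int len = n * n;     /* vector length for DOT/AXPY          */
    const int reps = 10;       /* repetitions per measurement         */

    double *a = malloc(sizeof(double) * len);
    double *b = malloc(sizeof(double) * len);
    double *c = malloc(sizeof(double) * len);
    for (int i = 0; i < len; i++) { a[i] = 1.0; b[i] = 2.0; c[i] = 0.0; }

    /* DGEMM: C = 1.0 * A * B + 0.0 * C, roughly 2*n^3 flops per call */
    double t0 = now_sec();
    for (int r = 0; r < reps; r++)
        cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, 1.0, a, n, b, n, 0.0, c, n);
    double t1 = now_sec();
    printf("dgemm %d: %.2f GFLOP/s\n", n,
           2.0 * n * n * (double)n * reps / (t1 - t0) / 1e9);

    /* DDOT: sum of elementwise products, roughly 2*len flops per call */
    t0 = now_sec();
    double s = 0.0;
    for (int r = 0; r < reps; r++)
        s += cblas_ddot(len, a, 1, b, 1);
    t1 = now_sec();
    printf("ddot  %d: %.2f GFLOP/s (checksum %g)\n", len,
           2.0 * len * (double)reps / (t1 - t0) / 1e9, s);

    /* DAXPY: b = 0.5 * a + b, roughly 2*len flops per call */
    t0 = now_sec();
    for (int r = 0; r < reps; r++)
        cblas_daxpy(len, 0.5, a, 1, b, 1);
    t1 = now_sec();
    printf("daxpy %d: %.2f GFLOP/s\n", len,
           2.0 * len * (double)reps / (t1 - t0) / 1e9);

    free(a); free(b); free(c);
    return 0;
}
```

Building the same source file once per library (and keeping the thread count pinned consistently across runs) keeps the comparison apples-to-apples.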
To be followed up at some point in the future in OpenMathLib/BLAS-Benchmarks#8.