Add SBGEMM for arm neoversev1 #5108

taoye9 · 2025-02-05T14:50:29Z

This PR adds an optimised bf16 GEMM (SBGEMM) kernel for Arm Neoverse V1 machines (256-bit SVE).

Signed-off-by: Ye Tao <ye.tao@arm.com>
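
For readers unfamiliar with the operation, here is a minimal reference sketch of what a bf16 GEMM computes: inputs A and B are stored as bfloat16, while products are accumulated and written out at higher precision. This is not the kernel from this PR and not the OpenBLAS API. The function names (f32_to_bf16_bits, bf16_bits_to_f32, sbgemm_ref), the row-major layout, and the use of Python floats for accumulation are all assumptions made for illustration.

```python
import struct

def f32_to_bf16_bits(x):
    """Convert a float to a bfloat16 bit pattern (hypothetical helper).

    Uses round-to-nearest-even on the 16 discarded low bits of the
    float32 encoding. NaN/Inf edge cases are not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + bias) >> 16) & 0xFFFF

def bf16_bits_to_f32(b):
    """Expand a bfloat16 bit pattern to a float by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]

def sbgemm_ref(m, n, k, alpha, a, b, beta, c):
    """Reference C = alpha * A @ B + beta * C (hypothetical, row-major).

    a: m*k bf16 bit patterns, b: k*n bf16 bit patterns, c: m*n floats.
    Accumulation uses Python floats (double precision), which only
    approximates the fp32 accumulation of a real kernel.
    """
    out = []
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += bf16_bits_to_f32(a[i * k + p]) * bf16_bits_to_f32(b[p * n + j])
            out.append(alpha * acc + beta * c[i * n + j])
    return out

# Usage: a 2x2 product with values that are exactly representable in bf16.
A = [f32_to_bf16_bits(v) for v in (1.0, 2.0, 3.0, 4.0)]
B = [f32_to_bf16_bits(v) for v in (5.0, 6.0, 7.0, 8.0)]
C = sbgemm_ref(2, 2, 2, 1.0, A, B, 0.0, [0.0] * 4)
```

An optimised kernel computes the same result but processes several bf16 elements per vector instruction; the scalar loop above is only meant as a correctness reference.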

taoye9 · 2025-02-07T10:28:16Z

@martin-frbg Hi Martin, it seems my patch fails on an unrelated mips64 CI pipeline: after printing TEST 109/111 kernel_regress:skx_avx [OK], the job exceeds the maximum execution time. Could you help me understand whether this is expected, or whether my changes accidentally cause a performance issue?

It seems some previous PRs also failed after this kernel_regress:skx_avx test.

martin-frbg · 2025-02-07T19:30:36Z

Thanks - no problem with your PR, just me lacking the time and energy to confirm and merge it.

aditew01 · 2025-02-11T11:15:27Z

@martin-frbg Is there a release schedule for bumping the version to 0.3.30?

martin-frbg · 2025-02-11T12:35:43Z

@aditew01 0.3.30 is planned for the end of the month (please see Milestones).

- Updates changelog
- Removes pytorch/pytorch#139387 - Add prepacking for linear weights. Performance gains better realised by ideep reorder caching.
- Updates OpenBLAS build to use recent commit from develop which includes: OpenMathLib/OpenBLAS#5108

aditew01 · 2025-02-11T14:00:37Z

@martin-frbg That sounds great! Thanks for the clarification. I missed the milestone update, but I'll keep a lookout. :)

aditew01 and others added 2 commits February 3, 2025 12:49

* checkpoint sbgemm for SVE-256

4379a6f

optimized sbgemm kernel for neoverse-v1 (sve-256)

c748e6a

Signed-off-by: Ye Tao <ye.tao@arm.com>

taoye9 marked this pull request as draft February 5, 2025 14:53

taoye9 marked this pull request as ready for review February 6, 2025 10:48

martin-frbg added this to the 0.3.30 milestone Feb 7, 2025

martin-frbg merged commit 1b85b6a into OpenMathLib:develop Feb 7, 2025
85 of 86 checks passed

nSircombe mentioned this pull request Feb 11, 2025

Updates for 25.02 release ARM-software/Tool-Solutions#290

Merged

taoye9 mentioned this pull request Feb 13, 2025

Fix Numeric Error in SBGEMM Kernel for NEOVERSEV2 when DYNAMIC_ARCH=1 #5129

Draft
