Add SBGEMM for arm neoversev1 #5108

Merged on Feb 7, 2025 (2 commits)

Conversation

taoye9
Contributor

@taoye9 taoye9 commented Feb 5, 2025

This PR adds an optimised bf16 GEMM (SBGEMM) kernel for Arm Neoverse V1 machines (256-bit SVE).
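For readers unfamiliar with the operation, here is a minimal reference sketch of what a bf16 GEMM computes: bf16 inputs A and B, with a float result C = alpha * A * B + beta * C. It is not the kernel from this PR. The names `float_to_bf16`, `bf16_to_float`, and `sbgemm_ref` are hypothetical, the row-major layout is an assumption, the conversion truncates rather than rounds, and Python accumulates in double precision rather than fp32.

```python
import struct


def float_to_bf16(x: float) -> int:
    """Return the bf16 bit pattern of x by truncating its float32 encoding.

    Assumption: plain truncation (keep the upper 16 bits); real libraries
    may round to nearest instead.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits >> 16


def bf16_to_float(h: int) -> float:
    """Widen a 16-bit bf16 pattern to a float by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", (h & 0xFFFF) << 16))[0]


def sbgemm_ref(m, n, k, alpha, a, b, beta, c):
    """Hypothetical row-major reference: returns alpha * A @ B + beta * C.

    a: m*k bf16 bit patterns, b: k*n bf16 bit patterns, c: m*n floats.
    """
    out = [0.0] * (m * n)
    for i in range(m):
        for j in range(n):
            acc = 0.0  # accumulate in wider precision than the bf16 inputs
            for p in range(k):
                acc += bf16_to_float(a[i * k + p]) * bf16_to_float(b[p * n + j])
            out[i * n + j] = alpha * acc + beta * c[i * n + j]
    return out


# Usage: multiply a 2x2 matrix by the identity.
a = [float_to_bf16(v) for v in (1.0, 2.0, 3.0, 4.0)]
b = [float_to_bf16(v) for v in (1.0, 0.0, 0.0, 1.0)]
result = sbgemm_ref(2, 2, 2, 1.0, a, b, 0.0, [0.0] * 4)
```

An optimised kernel would compute the same result using vector instructions and blocking; this triple loop only pins down the arithmetic.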

@taoye9 taoye9 marked this pull request as draft February 5, 2025 14:53
@taoye9 taoye9 marked this pull request as ready for review February 6, 2025 10:48
@taoye9
Contributor Author

taoye9 commented Feb 7, 2025

@martin-frbg Hi Martin, it seems my patch fails on an unrelated mips64 CI pipeline: the job exceeds the maximum execution time after printing TEST 109/111 kernel_regress:skx_avx [OK]. Could you help me understand whether this is expected, or whether my changes accidentally caused a performance issue?

It seems some previous PRs also failed after this kernel_regress:skx_avx test.

@martin-frbg martin-frbg added this to the 0.3.30 milestone Feb 7, 2025
@martin-frbg
Collaborator

Thanks - no problem with your PR, just me lacking the time and energy to confirm and merge it

@martin-frbg martin-frbg merged commit 1b85b6a into OpenMathLib:develop Feb 7, 2025
85 of 86 checks passed
@aditew01
Contributor

@martin-frbg Is there a release schedule for bumping the version to 0.3.30?

@martin-frbg
Collaborator

@aditew01 0.3.30 is planned for end of the month (please see Milestones)

nSircombe added a commit to nSircombe/Tool-Solutions that referenced this pull request Feb 11, 2025
- Updates changelog
- Removes pytorch/pytorch#139387 - Add prepacking
  for linear weights. Performance gains better realised by ideep reorder
  caching.
- Updates OpenBLAS build to use recent commit from develop which
  includes: OpenMathLib/OpenBLAS#5108
@aditew01
Contributor

@martin-frbg That sounds great! Thanks for the clarification. I missed the milestone update, but I'll keep a lookout. :)

nSircombe added a commit to nSircombe/Tool-Solutions that referenced this pull request Feb 12, 2025
- Updates changelog
- Removes pytorch/pytorch#139387 - Add prepacking
  for linear weights. Performance gains better realised by ideep reorder
  caching.
- Updates OpenBLAS build to use recent commit from develop which
  includes: OpenMathLib/OpenBLAS#5108
3 participants