Add Neon NTT #203

Open: mkannwischer wants to merge 5 commits into main from neon-ntt

@mkannwischer (Contributor) commented May 3, 2025

I have not run it through SLOTHY yet. I'll do that in a follow-up PR.

@github-actions bot left a comment:

Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)

| Benchmark suite | Current: 8fda85e | Previous: 3443561 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 490685 cycles | 490045 cycles | 1.00 |
| ML-DSA-44 sign | 1854022 cycles | 1854795 cycles | 1.00 |
| ML-DSA-44 verify | 563360 cycles | 563551 cycles | 1.00 |
| ML-DSA-65 keypair | 829537 cycles | 829847 cycles | 1.00 |
| ML-DSA-65 sign | 2970020 cycles | 2971899 cycles | 1.00 |
| ML-DSA-65 verify | 880555 cycles | 880975 cycles | 1.00 |
| ML-DSA-87 keypair | 1345789 cycles | 1343166 cycles | 1.00 |
| ML-DSA-87 sign | 3774144 cycles | 3772895 cycles | 1.00 |
| ML-DSA-87 verify | 1433636 cycles | 1433664 cycles | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions bot left a comment:

Mac Mini (M1, 2020) benchmarks (no-opt)

| Benchmark suite | Current: 8fda85e | Previous: 3443561 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 132460 cycles | 132396 cycles | 1.00 |
| ML-DSA-44 sign | 411876 cycles | 411635 cycles | 1.00 |
| ML-DSA-44 verify | 142608 cycles | 142566 cycles | 1.00 |
| ML-DSA-65 keypair | 231387 cycles | 231322 cycles | 1.00 |
| ML-DSA-65 sign | 679524 cycles | 679473 cycles | 1.00 |
| ML-DSA-65 verify | 231551 cycles | 231676 cycles | 1.00 |
| ML-DSA-87 keypair | 381567 cycles | 381535 cycles | 1.00 |
| ML-DSA-87 sign | 877936 cycles | 878007 cycles | 1.00 |
| ML-DSA-87 verify | 391407 cycles | 391350 cycles | 1.00 |

@github-actions bot left a comment:

Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)

| Benchmark suite | Current: 8fda85e | Previous: 3443561 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 224150 cycles | 224243 cycles | 1.00 |
| ML-DSA-44 sign | 654841 cycles | 654969 cycles | 1.00 |
| ML-DSA-44 verify | 240715 cycles | 240707 cycles | 1.00 |
| ML-DSA-65 keypair | 400314 cycles | 400195 cycles | 1.00 |
| ML-DSA-65 sign | 1074694 cycles | 1073293 cycles | 1.00 |
| ML-DSA-65 verify | 394290 cycles | 394266 cycles | 1.00 |
| ML-DSA-87 keypair | 647216 cycles | 646992 cycles | 1.00 |
| ML-DSA-87 sign | 1394731 cycles | 1395196 cycles | 1.00 |
| ML-DSA-87 verify | 661660 cycles | 661462 cycles | 1.00 |

@github-actions bot left a comment:

SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)

| Benchmark suite | Current: 8fda85e | Previous: 3443561 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 1026839 cycles | 1025279 cycles | 1.00 |
| ML-DSA-44 sign | 3900331 cycles | 3899447 cycles | 1.00 |
| ML-DSA-44 verify | 1164942 cycles | 1164811 cycles | 1.00 |
| ML-DSA-65 keypair | 1727119 cycles | 1729203 cycles | 1.00 |
| ML-DSA-65 sign | 6377417 cycles | 6384540 cycles | 1.00 |
| ML-DSA-65 verify | 1864556 cycles | 1863531 cycles | 1.00 |
| ML-DSA-87 keypair | 2848223 cycles | 2852215 cycles | 1.00 |
| ML-DSA-87 sign | 7952701 cycles | 7949346 cycles | 1.00 |
| ML-DSA-87 verify | 3034841 cycles | 3034846 cycles | 1.00 |

@github-actions bot left a comment:

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)

| Benchmark suite | Current: 8fda85e | Previous: 3443561 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 310005 cycles | 309866 cycles | 1.00 |
| ML-DSA-44 sign | 976974 cycles | 1091537 cycles | 0.90 |
| ML-DSA-44 verify | 334029 cycles | 334798 cycles | 1.00 |
| ML-DSA-65 keypair | 566219 cycles | 568405 cycles | 1.00 |
| ML-DSA-65 sign | 1584266 cycles | 1636741 cycles | 0.97 |
| ML-DSA-65 verify | 543612 cycles | 542674 cycles | 1.00 |
| ML-DSA-87 keypair | 875101 cycles | 875559 cycles | 1.00 |
| ML-DSA-87 sign | 2057254 cycles | 2044313 cycles | 1.01 |
| ML-DSA-87 verify | 902044 cycles | 901713 cycles | 1.00 |

@github-actions bot left a comment:

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: 6141304 | Previous: 3443561 | Ratio |
|---|---|---|---|
| ML-DSA-44 sign | 1130374 cycles | 1091537 cycles | 1.04 |
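
The alert condition can be reproduced from the numbers in the table. The following is a minimal sketch, assuming the ratio is simply current cycles divided by previous cycles and that an alert fires when it exceeds the threshold. The function names are hypothetical and are not taken from github-action-benchmark.

```python
# Hypothetical helpers; they model the alert rule described in the comment above,
# not the actual implementation inside github-action-benchmark.

def benchmark_ratio(current_cycles: int, previous_cycles: int) -> float:
    """Ratio of current to previous cycle counts (lower is better)."""
    return current_cycles / previous_cycles


def is_regression(current_cycles: int, previous_cycles: int,
                  threshold: float = 1.03) -> bool:
    """Assumed rule: flag a regression when the ratio exceeds the threshold."""
    return benchmark_ratio(current_cycles, previous_cycles) > threshold


# ML-DSA-44 sign on the Cortex-A72, commit 6141304 vs. 3443561 (from the alert).
alert_ratio = benchmark_ratio(1130374, 1091537)
print(f"{alert_ratio:.2f}", is_regression(1130374, 1091537))

# Same benchmark at commit 8fda85e (from the no-opt table): no alert expected.
later_ratio = benchmark_ratio(976974, 1091537)
print(f"{later_ratio:.2f}", is_regression(976974, 1091537))
```

Under this assumed rule, the swing from 1.04 at 6141304 to 0.90 at 8fda85e against the same baseline is larger than any other row's variation on this platform.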

@github-actions bot left a comment:

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)

| Benchmark suite | Current: 8fda85e | Previous: 535dc05 | Ratio |
|---|---|---|---|
| ML-DSA-44 keypair | 295530 cycles | 304742 cycles | 0.97 |
| ML-DSA-44 sign | 968878 cycles | 991066 cycles | 0.98 |
| ML-DSA-44 verify | 299510 cycles | 329368 cycles | 0.91 |
| ML-DSA-65 keypair | 548558 cycles | 561239 cycles | 0.98 |
| ML-DSA-65 sign | 1428807 cycles | 1544501 cycles | 0.93 |
| ML-DSA-65 verify | 499443 cycles | 537517 cycles | 0.93 |
| ML-DSA-87 keypair | 850185 cycles | 866585 cycles | 0.98 |
| ML-DSA-87 sign | 1910316 cycles | 1995595 cycles | 0.96 |
| ML-DSA-87 verify | 845528 cycles | 894378 cycles | 0.95 |

@mkannwischer force-pushed the neon-ntt branch 3 times, most recently from 7bac122 to db4a9e0 (May 3, 2025 06:37)
@mkannwischer changed the title from "[DRAFT]: Add Neon NTT" to "Add Neon NTT" (May 3, 2025)
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
@mkannwischer marked this pull request as ready for review (May 3, 2025 07:06)
@mkannwischer requested a review from a team as a code owner (May 3, 2025 07:06)
@mkannwischer (Contributor Author) commented:

@hanno-becker - could you please take a look at this one?
I kept the changes to the assembly (from SLOTHY) as separate commits so that it's easier to review. Before we merge, I'd like to squash all commits changing the asm.

@@ -68,7 +68,7 @@ jobs:
     steps:
       - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
       - uses: ./.github/actions/bench
-        if: ${{ matrix.target.only_no_opt == 'false' }}
+        if: ${{ !matrix.target.only_no_opt }}
Review comment (Contributor):

Are you sure this is going to work? I always get confused about what is a string and what is a boolean when working with GitHub Actions.

Reply (Contributor Author):

It wasn't working before - this step was never run - that's why I changed it.
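
One plausible explanation for why the old condition never fired is GitHub's loose-equality rule for expressions: when the two operands of `==` have different types, both are coerced to numbers, and a non-numeric string such as `'false'` becomes NaN. The sketch below models that rule in Python. It assumes `only_no_opt` reaches the expression as a boolean or as null when unset, which the thread does not confirm. The helper names are hypothetical, and the string-to-number parsing is simplified compared with GitHub's JSON-number rule.

```python
import math


def to_number(value):
    """Simplified model of the numeric coercion used when operand types differ."""
    if value is None:
        return 0.0
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        if value == "":
            return 0.0
        try:
            return float(value)  # simplification of "any legal JSON number"
        except ValueError:
            return math.nan
    return math.nan  # arrays and objects


def loose_equals(a, b):
    """Model of `==`: same types compare directly, otherwise compare as numbers."""
    if type(a) is type(b):
        if isinstance(a, str):
            return a.lower() == b.lower()  # string comparison ignores case
        return a == b
    return to_number(a) == to_number(b)  # NaN never equals anything


def truthy(value):
    """Model of truthiness: false, 0, empty string and null are falsy."""
    return not (value is None or value is False or value == 0 or value == "")


# Old condition: matrix.target.only_no_opt == 'false'
print(loose_equals(False, "false"))    # boolean false vs. string
print(loose_equals(None, "false"))     # key missing from the matrix entry
print(loose_equals("false", "false"))  # only a string value would match

# New condition: !matrix.target.only_no_opt
print(not truthy(False), not truthy(None), not truthy(True))
```

In this model the old comparison is true only when the value is literally the string `'false'`, while the negation is true for both a boolean `false` and a missing key. The negation would still skip the step if the value were the non-empty string `'false'`, so it relies on the matrix using real booleans.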

Successfully merging this pull request may close these issues.

Neon backend: Add NTT