uint256: optimize Mod, DivMod, AddMod #173

Merged 1 commit on May 27, 2024

Conversation

AaronChen0
Contributor

@AaronChen0 AaronChen0 commented May 24, 2024

Lt is branchless and can be inlined, whereas Cmp has two if branches and cannot be inlined. So Lt is the better choice.

To check whether a simple function can be inlined in Go, run:

go build -gcflags='-m -m' |& grep Lt

I learned this trick a few days ago.
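For readers who want to see the difference between the two comparison styles, here is a minimal sketch. It is not the library's source: the type `u256` and the functions `lt` and `cmp` are hypothetical stand-ins, written on the assumption that a 256-bit value is stored as four 64-bit limbs, least significant first.

```go
package main

import (
	"fmt"
	"math/bits"
)

// u256 is a hypothetical 256-bit unsigned integer:
// four 64-bit limbs, least significant limb first.
type u256 [4]uint64

// lt reports whether x < y. It subtracts limb by limb and keeps only the
// final borrow, so the body contains no if statements.
func lt(x, y *u256) bool {
	_, borrow := bits.Sub64(x[0], y[0], 0)
	_, borrow = bits.Sub64(x[1], y[1], borrow)
	_, borrow = bits.Sub64(x[2], y[2], borrow)
	_, borrow = bits.Sub64(x[3], y[3], borrow)
	return borrow != 0
}

// cmp returns -1, 0, or +1 for x < y, x == y, x > y.
// It performs the same subtraction but needs two if branches
// to turn the result into a three-way answer.
func cmp(x, y *u256) int {
	d0, borrow := bits.Sub64(x[0], y[0], 0)
	d1, borrow := bits.Sub64(x[1], y[1], borrow)
	d2, borrow := bits.Sub64(x[2], y[2], borrow)
	d3, borrow := bits.Sub64(x[3], y[3], borrow)
	if borrow == 1 {
		return -1
	}
	if d0|d1|d2|d3 == 0 {
		return 0
	}
	return 1
}

func main() {
	a := &u256{1, 0, 0, 0} // the value 1
	b := &u256{0, 1, 0, 0} // the value 2^64
	fmt.Println(lt(a, b), cmp(a, b))
}
```

Saving this to a file and running the `go build -gcflags='-m -m'` command on it should print the compiler's inlining decision for each function. The exact cost figures, and whether this simplified `cmp` exceeds the inlining budget, depend on the Go version.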

Benchmark

goos: linux
goarch: amd64
pkg: github.com/holiman/uint256
cpu: AMD Ryzen 7 7735H with Radeon Graphics         
                         │     old     │                 new                 │
                         │   sec/op    │   sec/op     vs base                │
Mod/small/uint256-16       4.041n ± 2%   2.865n ± 1%  -29.10% (p=0.000 n=10)
Mod/mod64/uint256-16       22.68n ± 1%   21.39n ± 2%   -5.69% (p=0.000 n=10)
Mod/mod128/uint256-16      39.82n ± 1%   38.23n ± 1%   -4.01% (p=0.000 n=10)
Mod/mod192/uint256-16      36.37n ± 1%   34.69n ± 2%   -4.62% (p=0.000 n=10)
Mod/mod256/uint256-16      28.33n ± 1%   27.08n ± 1%   -4.39% (p=0.000 n=10)
DivMod/small/uint256-16    4.445n ± 1%   3.121n ± 3%  -29.80% (p=0.000 n=10)
DivMod/mod64/uint256-16    22.95n ± 1%   22.12n ± 2%   -3.64% (p=0.000 n=10)
DivMod/mod128/uint256-16   40.21n ± 2%   39.35n ± 0%   -2.16% (p=0.001 n=10)
DivMod/mod192/uint256-16   36.58n ± 2%   35.75n ± 1%   -2.26% (p=0.000 n=10)
DivMod/mod256/uint256-16   28.75n ± 1%   27.68n ± 2%   -3.70% (p=0.000 n=10)
AddMod/small/uint256-16    6.158n ± 2%   4.748n ± 0%  -22.90% (p=0.000 n=10)
AddMod/mod64/uint256-16    9.024n ± 2%   7.413n ± 1%  -17.86% (p=0.000 n=10)
AddMod/mod128/uint256-16   18.00n ± 2%   15.86n ± 1%  -11.89% (p=0.000 n=10)
AddMod/mod192/uint256-16   19.78n ± 2%   17.61n ± 2%  -10.97% (p=0.000 n=10)
AddMod/mod256/uint256-16   6.416n ± 2%   6.422n ± 1%        ~ (p=0.928 n=10)
geomean                    16.63n        14.84n       -10.76%
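The "vs base" column is the relative change from the old time to the new time. A small sketch of that arithmetic follows, using the rounded values printed in the table. The helper `pctChange` is a hypothetical name, and the benchmark tool works from unrounded measurements, so the last digit can differ from a value recomputed this way.

```go
package main

import "fmt"

// pctChange returns the relative change from oldNs to newNs in percent.
// A negative result means the new code is faster.
func pctChange(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

func main() {
	// Rounded sec/op values copied from the table above.
	fmt.Printf("Mod/small:    %.2f%%\n", pctChange(4.041, 2.865))
	fmt.Printf("DivMod/small: %.2f%%\n", pctChange(4.445, 3.121))
	fmt.Printf("AddMod/small: %.2f%%\n", pctChange(6.158, 4.748))
}
```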

@AaronChen0 AaronChen0 changed the title uint256: optimize mod, DivMod uint256: optimize Mod, DivMod May 24, 2024
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (70cbe2b) to head (1d336e5).

Additional details and impacted files
@@            Coverage Diff            @@
##            master      #173   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            5         5           
  Lines         1632      1628    -4     
=========================================
- Hits          1632      1628    -4     

@AaronChen0 AaronChen0 changed the title uint256: optimize Mod, DivMod uint256: optimize Mod, DivMod, AddMod May 25, 2024
Owner

@holiman holiman left a comment

LGTM

@holiman holiman merged commit 8dfcfde into holiman:master May 27, 2024
6 checks passed
@holiman
Owner

holiman commented May 27, 2024

Well done!

@AaronChen0 AaronChen0 deleted the mod branch May 27, 2024 08:18