Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Slowdown after switch to crypto-bigint #490

Open
fjarri opened this issue Mar 5, 2025 · 5 comments
Open

Slowdown after switch to crypto-bigint #490

fjarri opened this issue Mar 5, 2025 · 5 comments

Comments

@fjarri
Copy link
Contributor

fjarri commented Mar 5, 2025

Benchmarks slowed down several times after #394. What could be the reason?

Performance improvements for boxed uints (RustCrypto/crypto-bigint#777) and the corresponding changes in crypto-primes (entropyxyz/crypto-primes#78) don't really change the results much according to my tests.

sign test time is split ~50/50 between Montgomery exponentiation (with all the time spent in almost_montgomery_mul(), the lowest level function) and BoxedUint::inv_mod(); decrypt is almost exclusively exponentiation. Both of these are the calls in RSA itself; crypto-primes calls take negligible time.

So it seems that either these two functions are somehow much slower than num-bigint (possible, yet unlikely, and straightforward to test), or somehow #394 changed the algorithm to apply them more times than necessary.

@fjarri
Copy link
Contributor Author

fjarri commented Mar 5, 2025

So one difference that I see is that modpow() (and Montgomery multiplication in general) in num-bigint-dig was not, in fact, constant-time. But I would not expect just that account for such a huge difference.

@tarcieri
Copy link
Member

tarcieri commented Mar 5, 2025

Yeah, that's one difference: num_bigint::BigUint has a normalize function which automatically strips leading zeros which is called all over the place.

However, that could also be a clue to the slowdown, namely there may be places where fixed-but-dynamic precision BoxedUints are using a larger size/precision than necessary for a given key size, causing the number of iterations looping through limbs to be much larger than num-bigint (i.e. because they're larger than necessary).

That's my best guess anyway. I'm not sure to sleuth out exactly where that might be happening other than tediously going through line-by-line and comparing the implementations or otherwise looking for unnecessary leading zeros (via e.g. dbg! inspection)

@dignifiedquire
Copy link
Member

The difference comes from the fact that we now do a lot of operation over numbers twice as large as they need to be, at least that is what I remember when I last checked.

if each prime has x bits, we end up doing operations on numbers of the size of 2x bits in a bunch of places, when often times something much smaller would suffice.

@fjarri
Copy link
Contributor Author

fjarri commented Mar 6, 2025

Yep, I just reached that conclusion as well. There are several instances of exponentiation modulo p or q in rsa_decrypt() with the moduli kept in a BoxedUint the size of p * q. That still leaves the inversion, but it's probably the same problem there.

@fjarri
Copy link
Contributor Author

fjarri commented Mar 6, 2025

Btw if anyone wants to deal with this issue, feel free, I don't think I will have enough free time in the coming weeks.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants