added alt_bn syscalls #27961
Conversation
@dmakarov do you know why this PR is failing test-stable-bpf due to the build dependency cap?
It looks like mem, sanity, and simulation (at least) recompile solana-program redundantly. Probably some of the crates that they depend on trigger it.
Thanks! It looks like all three (mem, sanity, and simulation) import solana-program-runtime, solana-program-test, and solana-sdk as dev dependencies; maybe these trigger the rebuild? Dependencies within the ark_* crates look consistent at first glance, but they import some crates with different features than solana-program does:
num-traits:
By the way, itertools is imported twice: once in [dependencies] and once in [target.'cfg(not(target_os = "solana"))'.dependencies]. I will try whether importing the ark_* crates with default-features = false helps, compiling them one by one and adjusting the imports in the ark_* crates.
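A sketch of what importing the ark_* crates with default features disabled might look like. This is a hypothetical Cargo.toml fragment; the version numbers and feature names are illustrative assumptions, not the exact ones used in the PR:

```toml
# Hypothetical fragment: pull the ark-* crates in with default-features = false
# so their transitive feature set stays aligned with what solana-program
# already compiles, avoiding fingerprint mismatches and redundant rebuilds.
[dependencies]
ark-bn254 = { version = "0.3.0", default-features = false, features = ["curve"] }
ark-ec = { version = "0.3.0", default-features = false }
ark-ff = { version = "0.3.0", default-features = false }
ark-serialize = { version = "0.3.0", default-features = false }
```

Because Cargo unifies features across the whole workspace, a single dependent enabling a default feature can change the fingerprint for every crate that shares the dependency.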
@dmakarov, @jackcmay, I got the test to run by adapting the dependency versions that caused the recompiles. This led to forking the ark-* libraries, so the PR currently imports them from a GitHub repo. Does this import work for you? Otherwise it might be cleaner to just bump the allowed number of recompiles back up to 10; I saw the number was decreased from 10 to 3 in a commit ten days ago. I'm now looking into the timeout of the downstream projects; if you have any ideas about it, please let me know.
Let me experiment with the dependencies a little; maybe I'll be able to keep the rebuilds at the minimum without having to modify the ark-* crates. I'll let you know about my results. The timeout does happen sometimes. I restarted the job; let's see if it passes.
Sounds good. That would be the best way of course, thanks!
@dmakarov did you get a chance to experiment yet? |
Yes, still working on it. These changes reduce the number of dependencies with different fingerprints, but not all of them.
It's a bit of a nuisance to find which features cause the fingerprint differences, though. I still have these differences when mem is built after 128bit has been built.
I need to modify
Great, thanks! See:
There's some This is the tree of
And this is the tree of
We probably don't want to include dependencies on GitHub repositories in the solana-program Cargo.toml, so for now I'm willing to relax the CI check that counts the number of solana-program rebuilds. Could you please rebase your PR onto the most recent master, remove the latest commit that changes the ark-* dependencies to GitHub repositories, and add a commit that changes the threshold here: Line 122 in 981c9d0
Ok, thank you! I will revert the git imports, rebase, and adapt the CI test accordingly.
(force-pushed from 0657c1e to e719986)
The changes are implemented; all tests passed before I squashed the commits and pushed again. I will now mark the PR as ready for review.
(force-pushed from 9468682 to 84f9e86)
@Lichtso @alessandrod, cc @jackcmay, following up on this. This PR targets master as discussed and was previously at #27659.
validator/solana-test-validator
Outdated
@@ -2,4 +2,4 @@
 here="$(dirname "$0")"
 set -x
-exec cargo run --manifest-path="$here"/Cargo.toml --bin solana-test-validator -- "$@"
+exec cargo run --release --manifest-path="$here"/Cargo.toml --bin solana-test-validator -- "$@"
You should take this out and only test it locally in release mode.
Thanks! Ok, will remove this. Just tested it again, and the local test-validator built from source works fine without release mode.
Before I push it, is the PR ready to be merged apart from this?
done
(force-pushed from 84f9e86 to cd35394)
We have three different syscalls with the same signature; can we maybe combine them into one with an enum to select the operation? There are still two unused parameter slots available to do so.
Thanks for the feedback, makes sense. I merged the three syscalls into one using const u64s (ADD, MUL, PAIRING) and a match, similar to the 25519 syscalls. Constants are easier to use than an enum, since the declare syscall macro expects u64 in the empty slots; otherwise I would need to convert every time. ADD and MUL are compatible with the constants implemented in the curve_syscall traits. I didn't import those, because that would incur a circular dependency. What do you think?
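The constant-based dispatch described above can be sketched roughly as follows. This is a hypothetical, simplified version: the constant values and function bodies are placeholders, and the real syscall calls alt_bn128_addition/multiplication/pairing and charges the compute budget first.

```rust
// Hypothetical sketch of dispatching one syscall over three group operations
// via u64 constants, since the declare_syscall! slots are typed u64.
// Constant values and result lengths here are illustrative placeholders.
const ALT_BN128_ADD: u64 = 0;
const ALT_BN128_MUL: u64 = 1;
const ALT_BN128_PAIRING: u64 = 2;

fn alt_bn128_group_op(group_op: u64, _input: &[u8]) -> Result<Vec<u8>, String> {
    match group_op {
        ALT_BN128_ADD => Ok(vec![0u8; 64]),     // would call alt_bn128_addition
        ALT_BN128_MUL => Ok(vec![0u8; 64]),     // would call alt_bn128_multiplication
        ALT_BN128_PAIRING => Ok(vec![0u8; 32]), // would call alt_bn128_pairing
        _ => Err(format!("invalid group op {group_op}")),
    }
}
```

Using plain u64 constants keeps the selector directly usable in the syscall's untyped u64 slots, at the cost of losing the exhaustiveness checking an enum would give.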
Yes, great.
What I meant was merging the code:
let budget = invoke_context.get_compute_budget();
let (cost, output_len, calculation) = match group_op {
ALT_BN128_ADD => (
budget.alt_bn128_addition_cost,
ALT_BN128_ADDITION_OUTPUT_LEN,
alt_bn128_addition,
),
ALT_BN128_MUL => (
budget.alt_bn128_multiplication_cost,
ALT_BN128_MULTIPLICATION_OUTPUT_LEN,
alt_bn128_multiplication,
),
ALT_BN128_PAIRING => {
let ele_len = input_size.saturating_div(ALT_BN128_PAIRING_ELEMENT_LEN as u64);
let cost = budget
.alt_bn128_pairing_one_pair_cost_first
.saturating_add(
budget
.alt_bn128_pairing_one_pair_cost_other
.saturating_mul(ele_len.saturating_sub(1)),
)
.saturating_add(budget.sha256_base_cost)
.saturating_add(input_size)
.saturating_add(ALT_BN128_PAIRING_OUTPUT_LEN as u64);
(cost, ALT_BN128_PAIRING_OUTPUT_LEN, alt_bn128_pairing)
},
};
invoke_context.get_compute_meter().consume(cost)?;
let input = translate_slice::<u8>(
memory_mapping,
input_addr,
input_size,
invoke_context.get_check_aligned(),
invoke_context.get_check_size(),
)?;
let call_result = translate_slice_mut::<u8>(
memory_mapping,
result_addr,
output_len as u64,
invoke_context.get_check_aligned(),
invoke_context.get_check_size(),
)?;
let result_point = match calculation(input) {
Ok(result_point) => result_point,
Err(e) => {
return Ok(e.into());
}
};
if result_point.len() != output_len {
return Ok(AltBn128Error::SliceOutOfBounds.into());
}
call_result.copy_from_slice(&result_point);
Ok(SUCCESS)
Done, essentially used your example; I just boxed the passed function to get around the incompatible return types in the match arms:
type DynAltBnFunction = Box<dyn for<'r> Fn(&'r [u8]) -> Result<Vec<u8>, AltBn128Error>>;
What do you think?
I don't think
Weird, apparently it works if you only return the function, but not a triple.
It stays the same adding … Just adding dyn results in an "unknown size at compile time" error, which is solved with Box.
Then just split it in two.
Makes sense. Done, I split it into two and moved the second match to the code that uses the calculation.
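For context on the discussion above, here is a minimal, hypothetical reproduction of the type issue (names are illustrative, not the PR's): each Rust function item has its own unique zero-sized type, and match-arm unification only coerces a function item to a fn pointer when it is the arm's top-level value, not when it sits inside a tuple. An explicit cast to a fn-pointer type alias (or splitting into two match expressions, as done in the PR) avoids boxing entirely:

```rust
// Two distinct function items with the same signature.
fn add_op(input: &[u8]) -> Result<Vec<u8>, String> {
    Ok(input.to_vec())
}
fn mul_op(input: &[u8]) -> Result<Vec<u8>, String> {
    Ok(input.iter().rev().copied().collect())
}

// A plain fn-pointer type: no Box or dyn needed since no captures are involved.
type OpFn = fn(&[u8]) -> Result<Vec<u8>, String>;

fn pick(op: u64) -> (u64, OpFn) {
    match op {
        // Without the `as OpFn` casts, returning these tuples fails to compile:
        // the arms have incompatible function-item types inside the tuple.
        0 => (10, add_op as OpFn),
        _ => (20, mul_op as OpFn),
    }
}
```

Box<dyn Fn> also works but adds an allocation; the fn-pointer cast or the two-match split avoids it.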
I checked the ComputeBudget calculations and got performance that is about 10 times lower. Can you check your bench results?
I just ran the benches myself and had the same issue. Change the branch to alt_bn128_precompiles_arkworks and you can replicate the results. I changed the library to arkworks in July and did not update the default branch in the bench repo.
@Lichtso the 2 match expressions are implemented.
sdk/program/src/alt_bn128.rs
Outdated
}

fn convert_edianness_64(bytes: &[u8]) -> Vec<u8> {
    let mut vec = Vec::new();
Replace by .flatten().collect::<Vec<u8>>()?
Done, see 8bd12f4.
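A hedged sketch of what the suggested flatten/collect rewrite of the endianness helper might look like (the 64-byte chunk size is assumed from the function name; this is not the exact PR code):

```rust
// Hypothetical version of the endianness helper: reverse the bytes of each
// 64-byte chunk, then collect the reversed chunks into a single Vec<u8>
// instead of pushing into a manually grown vector.
fn convert_endianness_64(bytes: &[u8]) -> Vec<u8> {
    bytes
        .chunks(64)
        .flat_map(|chunk| chunk.iter().rev().copied())
        .collect::<Vec<u8>>()
}
```

The iterator chain avoids the intermediate mutable Vec and expresses the per-chunk byte reversal in one pass.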
Looking good; just needs a rebase onto master and we can restart the CI.
(force-pushed from ff5e4ab to af2ad66)
Perfect, it is rebased. Feel free to restart the CI. |
* added alt_bn128_syscalls
* increased regression build redundancy to > 10
Compatibility
This is an updated version of a previous pull request (#26629).
Credit for the previous pull request is due to @ai-trepalin.
Problem
It is very expensive to compute elliptic curve operations over the bn256 (bn254/bn128) curve on Solana. The following curve operations enable, for example, efficient verification of zero-knowledge proofs within one transaction.
Additionally, contracts written in Solidity cannot work on Solana if they call the following precompiled contracts:
bn256Add — Performs addition on the elliptic curve.
bn256ScalarMult — Performs scalar multiplication on the elliptic curve.
bn256Pairing — Performs elliptic curve pairing checks, enabling zkSNARK verification within the block gas limit.
https://github.com/ethereum/EIPs/blob/master/EIPS/eip-196.md
https://github.com/ethereum/EIPs/blob/master/EIPS/eip-197.md
The Neon EVM requires the implementation of these system calls in Solana for those contracts.
Light Protocol also requires these syscalls for more efficient zero-knowledge proof verification.
Solution
For the precompiled contracts to be usable on Solana, it is proposed to implement syscalls inside the Solana core, that is, an implementation similar to the erc-recover implementation.
Summary of Changes
This pull request adds an implementation of the bn256 (alt_bn128) precompiles.
About compute budget costs
Add and ScalarMult are implemented in constant time.
Execution time for the pairing implementation follows a linear equation:
pairingCU = firstPairingExecutionTime + (numberOfPairings - 1) * otherPairingsExecutionTime
This equation computes the compute units charged (implementation).
The equation follows from the arkworks product-of-pairings implementation, which executes a Miller loop for every pairing but only one final exponentiation.
In Ethereum gas costs for the pairing operation are calculated similarly.
Possible compute unit costs based on 33 ns execution time per CU:
Addition: 334 CU
Multiplication: 3,840 CU
Pairing: 36,364 CU + (numberOfPairings - 1) * 12,121 CU
Compute units for one G1 multiplication: 126.71 * 1000 / 33 = 3,840 CU
Compute units for the first pairing: 1.2ms * 1_000_000 / 33 (ns per CU) = 36,364 CU
Compute units for one additional pairing: 0.4ms * 1_000_000 / 33 (ns per CU) = 12,121 CU
(Both 1.2ms and 0.4ms have been selected as upper bounds with buffer based on the bench results below.)
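The linear cost equation above, with the CU values quoted in this description plugged in, can be sketched as follows. Note this is a simplified illustration: the actual implementation reads the per-pairing costs from the ComputeBudget and also charges for input size and output length, as shown in the syscall code earlier in this thread.

```rust
// Sketch of the pairing compute-unit formula using the values quoted above:
// 36,364 CU for the first pairing, 12,121 CU for each additional pairing.
// The on-chain code uses saturating arithmetic throughout, mirrored here.
fn pairing_cost(number_of_pairings: u64) -> u64 {
    const FIRST_PAIRING_CU: u64 = 36_364;
    const OTHER_PAIRING_CU: u64 = 12_121;
    FIRST_PAIRING_CU
        .saturating_add(OTHER_PAIRING_CU.saturating_mul(number_of_pairings.saturating_sub(1)))
}
```

For example, a 2-pairing check would be charged 36,364 + 12,121 = 48,485 CU under this formula.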
Bench Results:
Benchmarks were conducted on an aws c6a.2xlarge instance, see detailed bench results and bench implementation.
Summary:
G1 addition time: [4.1247 us 4.1553 us 4.1903 us]
Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) low severe 2 (2.00%) high mild 3 (3.00%) high severe
G1 mul time: [126.64 us 126.68 us 126.71 us]
Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low severe 3 (3.00%) low mild 3 (3.00%) high mild 1 (1.00%) high severe
2 pairings time: [1.3630 ms 1.3718 ms 1.3839 ms]
Found 6 outliers among 100 measurements (6.00%) 2 (2.00%) high mild 4 (4.00%) high severe
3 pairings time: [1.6807 ms 1.6819 ms 1.6835 ms]
Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) high mild 5 (5.00%) high severe
4 pairings time: [1.9977 ms 1.9980 ms 1.9983 ms]
Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild
5 pairings time: [2.3224 ms 2.3227 ms 2.3230 ms]
Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild
Fixes #