Summary

A new way to distribute epoch rewards across multiple blocks is proposed to address the current performance problems associated with epoch reward distribution in the first block of a new epoch.

Motivation

The distribution of epoch rewards at the start block of an epoch becomes a significant bottleneck due to the rising number of stake accounts and voting nodes on the network.

To address this bottleneck, we propose a new approach for distributing the epoch rewards over multiple blocks.

New Terminology

rewards calculation: calculate the epoch rewards for all active stake accounts
rewards distribution: distribute the epoch rewards for the active stake accounts

Alternatives Considered

We have discussed the following alternative approaches.

Simply set a threshold on stake balance, and limit the epoch rewards to the accounts with stake balance above the threshold. This will effectively reduce the number of stake rewards to be distributed, and reduce reward distribution time at the epoch boundary. However, this will impact stake accounts with small balance. To receive rewards, the small stake accounts will be forced to join stake pools, which some may be hesitant to do.
Handle reward distributions through transactions with specialized native reward programs. While this approach is optimal, it requires significant modifications and doesn't align with the current reward code.
A completely asynchronous epoch rewards calculation and distribution, in which both reward computation and rewards distribution are asynchronous. This is the most general approach. However, it is also the most complex approach. The reward computation states would have to be maintained across multiple blocks. The transition between the reward calculation and reward distribution would need to be synchronized across the cluster. And cluster restart during reward computation would have to be handled specially.
Another approach is similar to the current proposal with additional per-block reward reserve sysvar accounts. Those sysvar accounts are introduced to track and verify the rewards distributed per block. The per-block reward reserve sysvar accounts add additional checks and safety for reward distribution. However, they also add additional cost to block processing, especially for the first block in the epoch. The first block is already computationally heavy - it is responsible for processing all the reward computation. The additional cost of those sysvars puts more burden onto that block and hurts its timing.

Detailed Design

The major bottleneck for epoch reward distribution is distributing rewards to stake accounts. At the time of writing, there are approximately 550K active stake accounts and 1.5K vote accounts on Solana Mainnet Beta. Given the relatively small number of vote accounts, it makes sense to keep the vote rewards distribution mechanism unchanged. Vote rewards can still be distributed efficiently at the first block of the epoch boundary. This reduces the impact of the change on vote accounts and also simplifies the overall changes. It also lets us focus on solving the primary bottleneck - stake rewards. Only stake rewards are going to be distributed over multiple blocks.

In the new stake rewards distribution approach, we will separate the computation of rewards from the actual distribution of rewards at the epoch boundary by dividing the process into two distinct phases:

rewards calculation phase - during which the epoch rewards for all active stake accounts are computed and distribution chunks are scheduled.
rewards distribution phase - during which the calculated epoch rewards for the active stake accounts are distributed.

To help maintain the total capital balance and track/verify the reward distribution during rewarding phases, a new sysvar account, EpochRewards, is proposed. The EpochRewards sysvar holds the balance of the rewards that are pending for distribution.

Rewards Calculation

The reward calculation phase computes all the rewards that need to be distributed for the active stake accounts, and partitions the rewards into a number of chunks for distribution in the next phase.

Currently, on Solana Mainnet Beta with ~550K active stake accounts, epoch reward calculation takes around 10 seconds on average. This makes it impossible to perform rewards computation synchronously within one block.

However, there are quite a few promising optimizations that can cut down the reward computation time. An experiment for reward calculation optimization (solana-labs/solana#31193) showed that we can cut the reward calculation time plus vote reward distribution time to around 1s. This makes synchronous reward computation and asynchronous reward distribution a feasible approach. We also believe that there is still room for further optimization to cut down the above timing.

Therefore, the following design is based on the above optimization. The reward calculation will be performed at the first block of the epoch. Once the full rewards are calculated, the rewards will be partitioned into distribution chunks stored in the bank, which will then be distributed during the reward distribution phase.

To ensure that each block distributes a subset of the rewards in a deterministic manner for the current epoch, while also randomizing the distribution across different epochs, the partitioning of all rewards will be done as follows.

To minimize the impact on block processing time during the reward distribution phase, a target of 4,096 stake rewards will be distributed per block. The total number of blocks M needed to distribute rewards is given by the following formula, which rounds up to the nearest integer without using floating point arithmetic:

M = ((4096 - 1)+num_stake_accounts)/4096

To safeguard against the number of stake accounts growing dramatically and overflowing the number of blocks in an epoch, the number of distribution blocks is capped at 10% of the number of blocks in an epoch (currently 432,000 per epoch, giving a cap of 43,200).
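The block count calculation above can be sketched as follows. This is a minimal illustration, not the validator's implementation: the function name `num_distribution_blocks` and the constant names are hypothetical, and no lower bound is applied because the text does not specify one.

```rust
/// Target number of stake rewards distributed per block (from the proposal).
const STAKE_REWARDS_PER_BLOCK: u64 = 4096;
/// The proposal caps distribution at 10% of the blocks in an epoch.
const MAX_EPOCH_FRACTION_DENOMINATOR: u64 = 10;

/// Hypothetical helper: number of blocks M needed to distribute all stake
/// rewards, capped at 10% of the blocks in an epoch.
fn num_distribution_blocks(num_stake_accounts: u64, blocks_per_epoch: u64) -> u64 {
    // Integer ceiling division: ((4096 - 1) + n) / 4096, no floating point.
    // saturating_add guards against overflow for absurdly large inputs.
    let uncapped =
        num_stake_accounts.saturating_add(STAKE_REWARDS_PER_BLOCK - 1) / STAKE_REWARDS_PER_BLOCK;
    // Cap so that distribution can never run past a fraction of the epoch.
    let cap = blocks_per_epoch / MAX_EPOCH_FRACTION_DENOMINATOR;
    uncapped.min(cap)
}
```

With roughly 550K stake accounts and 432,000 blocks per epoch, this sketch yields 135 distribution blocks, far below the cap of 43,200.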

The SipHash 1-3 pseudo-random function is used to hash stake account addresses efficiently and uniformly across the blocks in the reward distribution phase. The hashing function for an epoch is created by seeding a new SipHasher with the parent block's bank hash. This hashing function can then be used to hash each active stake account's address into a u64 hash value. The reward distribution block index I can then be computed by applying the following formula to the hash:

I = (M * stake_address_hash) / 2^64
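The index formula can be sketched as below. The sketch takes the u64 hash as an input rather than computing SipHash 1-3 itself, since the standard library does not expose a seedable SipHash 1-3 hasher; the function names `partition_index` and `partition_by_hash` are hypothetical, and the bucketing helper assumes M is at least 1 whenever there are items to place.

```rust
/// Hypothetical helper: map a u64 stake address hash to a block index in [0, M).
fn partition_index(num_partitions: u64, stake_address_hash: u64) -> u64 {
    // Widen to u128 so M * hash cannot overflow, then divide by 2^64 via a shift.
    ((num_partitions as u128 * stake_address_hash as u128) >> 64) as u64
}

/// Hypothetical helper: group items into M partitions, one per distribution
/// block, given each item's precomputed address hash.
/// Assumes num_partitions >= 1 whenever `items` is non-empty.
fn partition_by_hash<T>(items: Vec<(u64, T)>, num_partitions: u64) -> Vec<Vec<T>> {
    let mut partitions: Vec<Vec<T>> = (0..num_partitions).map(|_| Vec::new()).collect();
    for (hash, item) in items {
        let index = partition_index(num_partitions, hash) as usize;
        partitions[index].push(item);
    }
    partitions
}
```

Because the hash is uniform over the u64 range, scaling it by M and discarding the low 64 bits spreads accounts evenly over the M blocks without the bias a modulo operation can introduce.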

Reward Distribution Snapshot State

An additional field epoch_rewards_status will be added to the serialized bank structure so that the full list of stake rewards calculated in the first epoch block can be recovered from snapshots created during reward distribution. This data is too large to be stored on-chain.

The layout of the epoch_rewards_status field, of type EpochRewardStatus, is as follows:

enum EpochRewardStatus {
    /// this bank is in the reward phase.
    Active(StartBlockHeightAndRewards),
    /// this bank is outside of the rewarding phase.
    #[default]
    Inactive,
}

struct StartBlockHeightAndRewards {
    /// the block height of the slot at which rewards distribution began
    pub(crate) start_block_height: u64,
    /// calculated epoch rewards pending distribution, outer Vec is by partition
    /// (one partition per block)
    pub(crate) stake_rewards_by_partition: Arc<Vec<Vec<StakeReward>>>,
}

struct StakeReward {
    pub stake_pubkey: Pubkey,
    pub stake_reward_info: RewardInfo,
    pub stake_account: AccountSharedData,
}

struct RewardInfo {
    pub reward_type: RewardType,
    /// Reward amount
    pub lamports: i64,
    /// Account balance in lamports after `lamports` was applied
    pub post_balance: u64,
    /// Vote account commission when the reward was credited, only present for
    /// voting and staking rewards
    pub commission: Option<u8>,
}

enum RewardType {
    Fee,
    Rent,
    Staking,
    Voting,
}

`EpochRewards` Sysvar Account

The EpochRewards sysvar account internally records the total rewards and the amount of distributed rewards for the current epoch. The account balance reflects the amount of pending rewards still to be distributed.

The layout of EpochRewards sysvar is shown in the following pseudo code.

struct EpochRewards {
   // total rewards for the current epoch, in lamports
   total_rewards: u64,

   // distributed rewards for the current epoch, in lamports
   distributed_rewards: u64,

   // distribution of all staking rewards for the current
   // epoch will be completed before this block height
   distribution_complete_block_height: u64,
}

The EpochRewards sysvar is created at the start of the first block of the epoch (before any transactions are processed), as both the total epoch rewards and vote account rewards become available at this time. The distributed_rewards field is updated after the reward distribution of each block in the reward distribution phase.

Once all rewards have been distributed, the balance of the EpochRewards account MUST be reduced to 0 at the beginning of the last rewards distribution block (before processing transactions); otherwise something has gone wrong. Any extra lamports in the EpochRewards account will be burned after the reward distribution phase, and the sysvar account will be deleted.

There are two possible reasons why the sysvar balance could be non-zero after reward distribution:

The sysvar can only be made read-only after the activation of a feature in accordance with SIMD 105. Before it's read-only, it will be possible for anyone to send lamports to this account.
During reward distribution, it's possible that the sysvar balance is reduced below its rent-exempt minimum balance. Similar to other sysvars, whenever this sysvar is updated, its balance will be topped up to the rent-exempt minimum.

Because the lifetime of the EpochRewards sysvar coincides with the reward distribution interval, users can explicitly query the existence of this sysvar to determine whether a block is in the reward interval. Therefore, no new RPC method for the reward interval is needed. Similar to other sysvars, a new syscall named sol_get_epoch_rewards_sysvar will be created to allow programs to fetch this sysvar in the SVM.

Reward Distribution

The reward distribution phase follows the reward calculation phase; in this proposal it starts after the first block in the epoch. It lasts for M blocks. Each of the M blocks is responsible for distributing the rewards of one partition from the EpochRewards sysvar account.
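The mapping from a block to the partition it distributes can be sketched as follows. This is an assumption-laden illustration: the function name `partition_for_block` is hypothetical, and it assumes that partition i is distributed in the block at height start_block_height + i, where start_block_height is the height at which distribution began.

```rust
/// Hypothetical helper: which partition (if any) a block at `block_height`
/// distributes, assuming partition i is paid out at start_block_height + i.
fn partition_for_block(
    block_height: u64,
    start_block_height: u64,
    num_partitions: u64,
) -> Option<u64> {
    // Blocks before the start of distribution distribute nothing.
    let offset = block_height.checked_sub(start_block_height)?;
    // Only the M blocks starting at start_block_height carry a partition.
    (offset < num_partitions).then_some(offset)
}
```

Under this assumption, a block returning `None` is outside the reward distribution phase.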

Before each reward distribution, the EpochRewards account's balance is checked to make sure there is enough balance to distribute the reward. After a reward is distributed, the balance in EpochRewards account is decreased by the amount of the reward that was distributed.
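The balance bookkeeping described above can be sketched as below. This is a simplified model, not the runtime's code: the sysvar balance is passed as a plain `u64`, the check is done once per partition rather than per reward, rent-exempt top-ups are ignored, and the `DistributionError` type, the `distribute_partition` function, and the check against total_rewards are assumptions added for illustration.

```rust
/// Mirrors the EpochRewards sysvar layout from the proposal.
#[derive(Debug, PartialEq)]
struct EpochRewards {
    total_rewards: u64,
    distributed_rewards: u64,
    distribution_complete_block_height: u64,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum DistributionError {
    Overflow,
    InsufficientBalance { needed: u64, available: u64 },
    ExceedsTotalRewards,
}

/// Hypothetical helper: distribute one partition's rewards (in lamports),
/// debiting the sysvar balance and updating distributed_rewards.
fn distribute_partition(
    sysvar: &mut EpochRewards,
    sysvar_balance: &mut u64,
    partition_rewards: &[u64],
) -> Result<u64, DistributionError> {
    // Sum the partition with overflow checks.
    let mut amount: u64 = 0;
    for reward in partition_rewards {
        amount = amount.checked_add(*reward).ok_or(DistributionError::Overflow)?;
    }
    // Check the balance before distributing anything.
    if amount > *sysvar_balance {
        return Err(DistributionError::InsufficientBalance {
            needed: amount,
            available: *sysvar_balance,
        });
    }
    // Assumed invariant: never distribute more than the epoch's total rewards.
    let distributed = sysvar
        .distributed_rewards
        .checked_add(amount)
        .ok_or(DistributionError::Overflow)?;
    if distributed > sysvar.total_rewards {
        return Err(DistributionError::ExceedsTotalRewards);
    }
    // Commit: decrease the pending balance and record the distribution.
    *sysvar_balance -= amount;
    sysvar.distributed_rewards = distributed;
    Ok(amount)
}
```

State is only mutated after every check has passed, so a failed distribution leaves both the balance and the sysvar fields untouched.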

Restrict Stake Account Access

To avoid programs interfering with reward distribution, the runtime rejects transactions that attempt to write-lock a stake program owned account during the epoch reward distribution period, regardless of whether that stake account is active or not. This is because reward distribution completely overwrites stake account state, so any changes would be lost.

Any transaction that write-locks such an account during this period will fail with the following new error code and be dropped without deducting fees:

TransactionError::ProgramExecutionTemporarilyRestricted {
  account_index: u8,
}

That means all updates to stake accounts have to wait until the rewards distribution finishes.
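The write-lock restriction can be sketched as a simple pre-execution check. This is an illustration under stated assumptions: `Pubkey` is modelled as a raw byte array, `LockedAccount` and `check_stake_account_locks` are hypothetical names, the error enum is a local stand-in carrying only the new variant, and the stake program id is passed in as a parameter rather than hard-coded.

```rust
/// Stand-in for an account address in this sketch.
type Pubkey = [u8; 32];

/// Hypothetical view of one account referenced by a transaction.
struct LockedAccount {
    owner: Pubkey,
    is_writable: bool,
}

/// Local stand-in containing only the new error variant from the proposal.
#[derive(Debug, PartialEq)]
enum TransactionError {
    ProgramExecutionTemporarilyRestricted { account_index: u8 },
}

/// Hypothetical helper: reject a transaction that write-locks any account
/// owned by the stake program while reward distribution is in progress.
fn check_stake_account_locks(
    accounts: &[LockedAccount],
    stake_program_id: &Pubkey,
    in_reward_distribution: bool,
) -> Result<(), TransactionError> {
    // Outside the reward distribution period nothing is restricted.
    if !in_reward_distribution {
        return Ok(());
    }
    for (index, account) in accounts.iter().enumerate() {
        // The check applies whether or not the stake account is active.
        if account.is_writable && &account.owner == stake_program_id {
            return Err(TransactionError::ProgramExecutionTemporarilyRestricted {
                // Assumes the index fits in a u8; clamp defensively otherwise.
                account_index: u8::try_from(index).unwrap_or(u8::MAX),
            });
        }
    }
    Ok(())
}
```

Read-only references to stake accounts pass this particular check; the separate restriction on loading stake accounts from on-chain programs is not modelled here.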

Because different stake accounts receive their rewards in different blocks, on-chain programs that depend on the rewards of stake accounts during the reward period may observe a partial epoch reward state. To prevent this from happening, loading stake accounts from on-chain programs during the reward period will be disabled. However, reading stake accounts through RPC will still be available.

Impact

There are two main impacts of the changes to stake accounts during the epoch rewarding phase.

The first impact is that stake accounts will see their rewards being credited a few blocks later in the epoch than before.

The second impact is that users will not be able to update their stakes during the epoch reward phases, and will have to wait until the end of the epoch reward period to make any changes. This includes lamport transfers to their stake accounts so things like Jito MEV tip distribution will also need to wait until the end of the epoch reward period to distribute tip earnings.

Nonetheless, the overall amount of time that the user must wait before receiving and updating their stake rewards should be roughly equivalent to what they are now experiencing on the current mainnet beta, since the prolonged first block time at the epoch boundary effectively obstructs the user's access to those stake accounts during that time.

Another advantage with the new approach is that all non-staking transactions will continue to be processed, while those transactions are blocked on mainnet beta today.

Security Considerations

While the proposed new approach does impact and modify various components of the validators, it does not alter the economics of the reward system.

Reward distribution relies on completely restricting any lamport balance changes for stake accounts until distribution is completed.

Reward distribution state should be recoverable from snapshots produced during the reward distribution period to avoid consensus failure.

Backwards Compatibility

This is a breaking change. The new epoch calculation and distribution approach will not be compatible with the old approach.

Snapshot format changes due to the new serialized bank field will be made backwards compatible.
