# Validator set size capping strategies

_Special thanks to Anders Elowsson and Justin Drake for review_

One weakness of the current Ethereum beacon chain protocol is the great uncertainty in the future computational load that a node will need to handle. The computational load of the beacon chain is roughly proportional to the number of active validators: attestation processing time, epoch transition processing time and memory requirements all scale linearly with the active validator count. This creates an unfortunate situation: validators need computers powerful enough to handle the load of a theoretical maximum of ~4 million active validators "just in case", but we don't actually get the decentralization benefit of having 4 million active validators in practice (we have ~400,000 validator spots active [today](https://beaconcha.in/)). This issue becomes even more acute in the context of [single-slot finality](https://notes.ethereum.org/@vbuterin/single_slot_finality#Idea-1-single-slot-finality-through-super-committees), where the beacon chain would need to process an attestation from each validator during each slot.

This gap is compounded by an unrelated issue: stakers with more than the minimum deposit size are forced to split their stake between different validators, needlessly increasing load. If we assume staker wealth is distributed by [something close to Zipf's law](https://en.wikipedia.org/wiki/Zipf%27s_law), $N$ stakers require $N \cdot \log(N)$ validator spots. A quick analysis of [actual data](https://twitter.com/vitalikbuterin/status/1333738057162362880) suggests that near the launch of the beacon chain we had ~8 validator spots per staker. Extrapolating to the current total deposit size (just under $2^{19}$ validators instead of just under $2^{15}$) implies that we're up to ~10 spots per staker. If we take staking pools into account, we would probably get ~15 spots per staker or even higher.
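The Zipf extrapolation above can be reproduced with a short back-of-envelope script (the helper name and the solver here are illustrative, not from any spec): if the smallest staker holds roughly one 32 ETH spot and the rank-$i$ staker holds $\sim 1/i$ of the largest stake, then $N$ stakers fill about $N \ln N$ spots, which we can invert to estimate spots per staker.

```python
import math

def zipf_spots_per_staker(total_spots):
    """Estimate validator spots per staker under Zipf's law.

    Assumes the smallest staker holds ~one 32 ETH spot and the rank-i
    staker holds ~1/i of the largest stake, so N stakers fill about
    N * ln(N) spots. Solves N * ln(N) = total_spots for N by bisection.
    """
    lo, hi = 2.0, float(total_spots)
    for _ in range(100):
        mid = (lo + hi) / 2
        if mid * math.log(mid) < total_spots:
            lo = mid
        else:
            hi = mid
    n_stakers = (lo + hi) / 2
    return total_spots / n_stakers

print(round(zipf_spots_per_staker(2**15), 1))  # near beacon chain launch: ~8
print(round(zipf_spots_per_staker(2**19), 1))  # at current deposit size: ~10.8
```

The outputs line up with the ~8 and ~10 spots-per-staker figures quoted above; staking pools would push the real number higher still.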
This document describes strategies for mitigating both of these issues.

## Strategy 1: super-committees

Randomly choose a subset of the eligible-to-be-active validator set, perhaps around $2^{17}$ validators (~4 million ETH), to actually be active at any given time (mathematical analysis [suggests](https://notes.ethereum.org/@vbuterin/single_slot_finality#Idea-2-try-really-hard-to-make-very-high-attester-counts-work) that $2^{17}$ validators is near the limit of what can be done without greatly increasing node load or slot times). Adjust this set every time a block is finalized, to make sure all eligible validators get a fair chance to validate a roughly equal portion of the time in the long run.

This approach decreases the _cost of directly 51% attacking the chain_ to 1.3 - 2 million ETH. But the argument is that such a cost is already high enough to deter attackers, and that we would still have a greater security buffer _in practice_, because an attacker would still need to control at least 33-50% of the entire validator set to launch an attack; the smaller number is merely what the attacker _would have to pay_.

Practical problems with this approach include:

* Increased protocol complexity
* Increased validator reward variance, leading to potentially increased pooling incentives
* Incentive compatibility risks around MEV: if MEV is high during some period of time, the current super-committee may intentionally refuse to finalize to keep itself in power for as long as the high MEV is flowing
* A slightly weaker, and more complicated-to-explain, security model

## Strategy 2: cap the active validator set size

We could try to cap the active validator set size, eg. to $2^{20}$ validators (~33.5 million ETH). This reduces the gap between the actual validator set size and the theoretical maximum validator set size, making the chain's expected future load more predictable.
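Strategy 1's rotation rule above can be sketched in a few lines (everything here is illustrative: in the real protocol the seed would come from beacon chain randomness such as RANDAO, not Python's `random`, and validator selection would be stake-weighted against the actual registry). The closing arithmetic also reproduces the 1.3 - 2 million ETH attack-cost figure as 33-50% of the committee's ~4 million ETH.

```python
import random

ACTIVE_SET_SIZE = 2**17  # ~4 million ETH at 32 ETH per validator

def rotate_supercommittee(eligible_validators, finality_seed):
    """Re-draw the active super-committee as a uniform random subset of
    the eligible set. Called once per finalized block, so in the long
    run every eligible validator is active an equal share of the time."""
    rng = random.Random(finality_seed)
    k = min(ACTIVE_SET_SIZE, len(eligible_validators))
    return set(rng.sample(sorted(eligible_validators), k))

# Attack-cost arithmetic from the text: 33-50% of the committee's stake.
committee_eth = ACTIVE_SET_SIZE * 32           # 4,194,304 ETH
print(committee_eth * 0.33 / 1e6)              # ~1.4M ETH (33% threshold)
print(committee_eth * 0.5 / 1e6)               # ~2.1M ETH (50% threshold)
```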
The main question that a validator set cap proposal must answer is: what if more than $2^{20}$ validators _want_ to stake? Some validators' desire to stake must be satisfied, and others must be locked out. How to make this choice?

### Order-based capping

One family of approaches is order-based:

* **Oldest validators stay (OVS)**: if the validator set is full, no one else can join
* **Newest validators stay (NVS)**: if the validator set is full, the oldest validators get kicked out

Each of these has significant problems. OVS risks turning into an entrenched "dynasty" of early stakers, who cannot leave or else they lose their position. It would also lead to either an MEV auction to join every time a validator leaves, or a very long queue to join the validator set. All of these effects would likely create a significant pressure toward liquid staking pools. NVS risks creating a permanent MEV auction that would pollute the chain, as validators that get kicked out would want to immediately rejoin, and would fight with genuine new joiners.

### Economic capping

A different strategy is to cap the total deposit size (which in turn implies a cap on the active validator set size) through incentive curves instead of any hard limits. To do this, we change the validator reward formula to something like:

$$R = k * (\frac{1}{\sqrt{D}} - \frac{0.5}{\sqrt{2^{25} - D}})$$

Where $R$ is the reward for well-performing validators (badly-performing validators get some lower reward), and $D$ is the total ETH balance of currently active validators. This curve looks roughly as follows:

<center>

![](https://storage.googleapis.com/ethereum-hackmd/upload_e83ec1b2df84a51f11ba5b653073c5d8.png)

</center>

On the left side of the curve, validator rewards function as they do today. But as the total deposited ETH grows to many millions, the reward function starts decaying faster, and at ~25 million ETH it drops below zero.
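A quick sanity-check of this curve (with $k$ treated as an arbitrary scaling constant; a live protocol would use a tuned formula): for this particular functional form, the reward crosses zero where $1/\sqrt{D} = 0.5/\sqrt{2^{25} - D}$, i.e. at $D = 2^{25}/1.25 \approx 26.8$ million ETH, in the same ballpark as the ~25 million figure above.

```python
import math

CAP = 2**25  # ~33.5 million ETH: the asymptote of the curve

def reward(d, k=1.0):
    """Per-validator reward for well-performing validators as a function
    of total deposited ETH d, following the proposed incentive curve."""
    return k * (1 / math.sqrt(d) - 0.5 / math.sqrt(CAP - d))

assert reward(1_000_000) > 0    # left side: behaves like today's curve
assert reward(30_000_000) < 0   # past the zero crossing: negative rewards
print(CAP / 1.25 / 1e6)         # zero crossing, in millions of ETH: ~26.8
```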
Validators might be willing to continue to stake despite zero or negative rewards in the exceptional case where priority fees and MEV are high enough to entice them to validate and compensate for their losses. But the reward curve has an asymptote of negative infinity at $2^{25}$ (~33.5 million) ETH, so the validator set size can't grow beyond that point no matter how high these extrinsic rewards are.

The strength of this approach is that it completely avoids strange queueing dynamics: no matter where the equilibrium is, it's an equilibrium; the validator set size is what it is because, under the current terms, no more validators _want_ to join. The main weakness is [discouragement attack](https://hackingresear.ch/discouragement-attacks/) dynamics near the right side of the curve: an attacker could enter and quickly drive out other validators. But this is a smaller issue than the issues with the other schemes, because it could only happen during exceptional high-MEV situations, and because such attacks would be very expensive, requiring millions of ETH.

## Strategy 3: variable minimum validator balance

Instead of capping the total _amount of ETH deposited_, we could cap the total _number of validator spots_ (eg. to $2^{17}$). If the number of active validators is at the cap, then when a new validator joins, the validator with the lowest balance gets kicked out.

This approach has historically been a non-starter, because the current Ethereum staking design depends heavily on random committee sampling, which breaks down under variable validator balances. However, the [single-slot finality](https://notes.ethereum.org/@vbuterin/single_slot_finality) proposal removes the need to sample by slot, and [Danksharding](https://polynya.medium.com/danksharding-36dc0c8067fe) removes the need to sample by shard. Sync committees and similar miscellaneous small committees would remain, but those are easier to pick even with wildly variable validator balances.
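The join-time rule of Strategy 3 (when the set is full, the lowest-balance validator is evicted) maps naturally onto a min-heap. The class below is an illustrative sketch of that rule only, not consensus-spec code:

```python
import heapq

class CappedValidatorSet:
    """Sketch of Strategy 3's eviction rule: the number of spots is
    capped, and when the set is full, a new join evicts the active
    validator with the lowest balance."""

    def __init__(self, max_spots):
        self.max_spots = max_spots
        self.heap = []  # min-heap of (balance, validator_id)

    def join(self, validator_id, balance):
        """Process a join. Returns the evicted validator id, or None."""
        if len(self.heap) < self.max_spots:
            heapq.heappush(self.heap, (balance, validator_id))
            return None
        if balance <= self.heap[0][0]:
            return validator_id  # too small to displace anyone
        _, evicted_id = heapq.heapreplace(self.heap, (balance, validator_id))
        return evicted_id

    def min_balance(self):
        """The current de-facto minimum deposit size."""
        return self.heap[0][0] if self.heap else 0
```

Note how `min_balance()` is exactly the "variable minimum deposit size" discussed below: it floats up as higher-balance validators join a full set.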
Thus, in the future there will be no more need for committee sampling of the type that depends on fixed balances today. After a move to single-slot finality, a move away from fixed-size 32 ETH validators would be completely feasible. In such a design, the validator spot count would effectively define a variable minimum deposit size, which would fluctuate up and down based on demand. If Zipf's law holds, this would imply that $2^{17}$ validators are enough to stake 33.5 million ETH with a de-facto minimum balance of only ~12-20 ETH. This would all be achieved without greatly changing the economics of staking, and its single-parameter nature (a $2^{17}$ validator set cap, instead of a $2^{25}$ ETH deposit cap plus a $32$ ETH deposit size) makes it very simple.

### More detailed considerations

* This approach also **increases fairness for small stakers**, because anyone would be able to stake all of their ETH. In the status quo, the requirement to stake in multiples of 32 means that small stakers whose available balance is far from (or worse, slightly below) a multiple of 32 ETH need to leave a substantial portion of their ETH unstaked. Even stakers who start off staking exactly 32 ETH quickly fall below 100% stake utilization due to compound interest. Variable balances mean that any balance above the minimum is welcome, and everyone gets compound interest automatically.
* A **transition** to this mechanism could be made by **rolling it out in two stages**: first, increase the maximum deposit size and add a protocol feature allowing validators to consolidate; then, in a second stage, add the cap.
* Other protocol constraints (eg. ease of sampling) may require us to **keep a deposit size cap but make it much higher** (eg. 2048 ETH).
Fortunately, this would actually not affect the numbers much: the overhead from a few very large stakers still having to split into multiple 2048-ETH spots is tiny compared to the gains from all the medium-sized stakers not having to split into many 32-ETH spots.
* One weakness of this approach is that if validator deposits turn out to be much more evenly distributed than expected, the **minimum deposit size could in the extreme case end up higher than 32 ETH**.
* Another weakness is that the **full gains may not be realized given certain kinds of bad behavior**. A staking pool could intentionally harm solo stakers by splitting its ETH into many small chunks to push up the minimum deposit size. We could **mitigate this by adding a fee per validator spot** (note that a fee of this kind exists already in the form of the gas fees needed to deposit), though there is a tradeoff: doing this too aggressively increases inequality and adds stake pooling incentives of its own.
* If staking pools do consolidate, this pushes the minimum validator size down, creating some **counter-pressure toward more decentralization**.

## Conclusions

All three approaches unfortunately violate existing validator expectations _in some way_:

* Super-committees would decrease average validator rewards, increase variance, and change the security model.
* Active validator set caps with a queue-based model would violate the expectation that you have "the right to stake" if you meet the terms (you have 32 ETH).
* Active validator set caps with economic capping would reduce validator rewards, and in exceptional cases lead to situations where the unluckiest honest validators face _negative_ rewards.
* A variable minimum validator balance would carry some risk of increasing the required deposit size to higher than 32 ETH.

However, my present opinion is that the latter approach (variable minimum validator balance) has the greatest benefits and the lowest risks. It reduces network load more thoroughly than the other approaches.
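The ~12-20 ETH de-facto minimum quoted for Strategy 3 can be sanity-checked with a crude Zipf model (illustrative only: it assumes the rank-$i$ staker's balance is proportional to $1/i$ and that exactly $2^{17}$ stakers fill every spot, which lands just above the top of that range):

```python
def zipf_min_balance(total_eth, n_spots):
    """De-facto minimum balance if n_spots stakers fill every spot and
    the rank-i staker's balance is proportional to 1/i (Zipf's law):
    total = b_min * n_spots * H(n_spots), so solve for b_min."""
    harmonic = sum(1 / i for i in range(1, n_spots + 1))
    return total_eth / (n_spots * harmonic)

print(round(zipf_min_balance(33_500_000, 2**17), 1))  # ~21 ETH, vs 32 today
```

Under this rough model the minimum stays well below 32 ETH, which is the crux of the comparison below.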
The possibility that the minimum deposit size will exceed 32 ETH is there, but it is low: models based on what we know today about staker wealth distribution suggest that the number of individuals staking is well under $2^{17}$. Additionally, even if it does happen, it seems like a less bad consequence than the risks of the other approaches.

More work is required both to evaluate these approaches and to determine more conclusively which approach, or which combination of approaches (possibly including completely new ideas), is best. Unexpected engineering breakthroughs could also make some tradeoffs more palatable; for example, if we determine that processing $2^{20}$ attestations per slot is feasible, the risk of required deposit sizes exceeding 32 ETH disappears completely. I welcome further discussion both on this topic and on the closely related issue of [single-slot finality](https://notes.ethereum.org/@vbuterin/single_slot_finality).