Mitigation of ex ante reorgs

# Mitigation of ex ante reorgs

As discussed [here](https://notes.ethereum.org/YvW57fUcTKqTRmyzAWb4Jw) and more formally [here](https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Fethrig%2FjiaI9X0AtR.pdf?alt=media&token=a2d910c6-07c7-4c15-875d-855ef887f310), the [current specifications](https://github.com/ethereum/consensus-specs) of Proof-of-Stake Ethereum allow reorgs to happen relatively easily. A mitigation must be found before *The Merge*. There are several open and closed PRs ([#1](https://github.com/ethereum/consensus-specs/pull/2353), [#2](https://github.com/ethereum/consensus-specs/pull/2197), [#3](https://github.com/ethereum/consensus-specs/pull/2101), ...) floating around, but nothing definite yet. This note keeps track of the different ideas for mitigating these ex ante reorgs, and of why some of them have already been discarded.

## Idea 1 - (block, slot)-voting

#### What is (block, slot)-voting?

The core idea of (block, slot)-voting is to have honest committee members attest to an empty slot if they do not see a timely proposal. Currently, an attester who sees no timely proposal attests to the latest block as the current head of the chain. It is [specified here](https://github.com/ethereum/consensus-specs/pull/2197) and described by Vitalik [here](https://notes.ethereum.org/@vbuterin/HF1_proposal).

#### Why would it help mitigate reorgs?

This would prevent reorgs from working (or at least from being practically feasible), because the adversary's blocks would not inherit the weight of honest attestations. Those attestations would instead accrue to the empty slot, which competes with the adversary's privately kept block. Hence, the adversary would compete with all honest committee members.

#### What is the problem with (block, slot)-voting?

This comes with a fundamental problem though: (block, slot)-voting introduces a `SECONDS_PER_SLOT / 3` seconds latency constraint on 'liveness' (slight abuse of the term here). Put differently, if honest validators do not hear a proposal in time (4 seconds into a 12-second slot), they will vote for an empty slot. This implies that if, for whatever reason, latency exceeds 4 seconds, even an honest but slightly late block would not make it into the canonical chain. So while the chain is technically still making progress (voting on empty slots and finalizing them), it is practically of no use to users, because no transactions are included in the canonical chain.

#### IMPORTANT: (block, slot)-voting is not helpful with proposer boosting active

> Stanford group: We checked out the (block, slot) proposal. Maybe someone can clarify where the following mix of sandwich and balancing attack fails: `Wp=0.8`, `beta=0.05`. Slot 0 genesis, slots 1, 2, 3, 4 are Adversarial, Adversarial, Honest, Adversarial.
>
> During adversarial slots, the adversary shares a delayed proposal with half of the network before the vote and with half of the network after the vote, so half vote for the proposal and half vote for the empty slot (proposal weights don't help), building two forks of length 2 with equal votes. The honest block will be proposed on either and will get all votes (because of proposal weights). Once the proposal weight is gone, the chain with the honest proposal has `0.475*W*2 + 0.95*W` votes, the other chain has `0.475*W*2` votes. So far no adversary votes have been cast.
>
> In the final slot, the adversary casts the withheld `0.15*W` votes for the weaker chain and proposes on it, leading to `1.90*W` vs `1.90*W` and hence completing the reorg of the honest block. If the above goes through, the reorg resilience is the same as under proposer weights alone (a 5% adversary can do, at least in expectation, one reorg a day at `Wp=0.8`).

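
The vote tally in the quote above can be checked with a few lines of arithmetic. The following is only a sketch that restates the quoted numbers, with every weight expressed as a fraction of one committee's weight `W`. The function name `mixed_attack_weights` and its parameters are made up for illustration and are not part of any spec.

```python
from fractions import Fraction


def mixed_attack_weights(beta, proposer_boost, adversarial_slots=2):
    """Tally both forks of the quoted attack, in units of committee weight W.

    Assumptions (taken from the quoted scenario, not from any spec):
    - `adversarial_slots` adversarial slots come first; in each of them the
      honest committee is split exactly 50/50 between the two forks;
    - one honest slot follows, whose honest votes all land on the honest block;
    - in the final slot the adversary releases every withheld vote and gets
      the proposer boost for its own block on the weaker fork.
    """
    honest = 1 - beta
    half_split = honest / 2

    # Fork that carries the honest block: split votes plus one full honest slot.
    honest_fork = adversarial_slots * half_split + honest

    # Weaker fork before the final slot: only the split votes.
    weaker_fork = adversarial_slots * half_split

    # Adversarial votes withheld during the earlier slots (adversarial + honest one).
    withheld = beta * (adversarial_slots + 1)

    weaker_fork_final = weaker_fork + withheld + proposer_boost
    return honest_fork, weaker_fork_final


honest_fork, attacker_fork = mixed_attack_weights(
    beta=Fraction(5, 100), proposer_boost=Fraction(80, 100)
)
print(float(honest_fork), float(attacker_fork))
```

With `beta = 0.05` and `Wp = 0.8` both forks come out at `1.90*W`, which is the tie described in the quote. Exact fractions are used so that the comparison does not depend on floating-point rounding.
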

![](https://storage.googleapis.com/ethereum-hackmd/upload_9547a62bdba42990b6a53250bf0ac9a5.jpeg)

It is worth noting, though, that the description is somewhat idealized in the sense that it assumes "perfect" p2p timing (to split honest validators exactly in half). Arguably another $3\sqrt{honest\_committee\_size}$ adversarial votes are required to re-balance any discrepancies in the 50-50 splits. Similar to the ex ante reorg paper... Assuming 8000 validators per slot (256k in total), that only requires an additional ~0.1% of total stake for the adversary.

### Ideas to make (block, slot)-voting still viable

#### Backoff scheme using missing blocks as trigger

Danny suggested the following:

> You could maybe do a backoff as blocks are missing. That is, in the normal case validators attest at slot N for optimal inclusion in N+1. But if a proposal is missing, could back off when earliest inclusion of attestation is. E.g. attest to slot N in slot N+1 for earliest inclusion on N+2. And can get more and more backed off. It's a complex consensus state transition change. But something like that would have to complement anything that looked like (block, slot)-voting.

However:

> I was thinking about how you might apply an exponential back-off to block-slot, but you quickly get the same issues that exist in current fork choice when you have sequential block proposals. E.g., you own proposals N and N+1. You purposefully make N late. Now N+1 has 2 slots until considered empty slot rather than one. Now N+1 can be used in similar re-org attacks against N+2.
>
> That's what led me to consider if instead of a back-off, it was binary (block-slot on or off) and it was based on a higher sample set than just late blocks, thus attestations.

#### Increase slot time upon observing empty slots

Idea: If there is an empty slot, increase slot duration every successive round until there is a block again.
This way we ensure “liveness” (abuse of the term here, but you get the idea) while keeping the benefits of (block, slot)-voting. Tendermint does something [similar-ish](https://docs.tendermint.com/master/spec/consensus/consensus.html#state-machine-overview)(?):

> Some of these problems are resolved by moving onto the next round & proposer. Others are resolved by increasing certain round timeout parameters over each successive round.

Anyways. Thoughts/problems that come to mind:

- Local views can diverge: some validators might hear a block and attest to it, while others won't hear it and vote for an empty slot (and thus subsequently expect longer slot times). Is that a problem?
- You can't differentiate between a proposer that's just offline and bad network conditions. So if a proposer is offline, the chain will make progress more slowly even if that's not necessary…
- Immediate worry: you spam the network to have a longer slot by the time you're proposing, to have more MEV opportunities (no idea how expensive spamming p2p is).

Overall, this probably could work, but requires changes that are far more complex than we are willing to make at this point.

#### Backoff if locally observed latency >4s

> Danny: Just a shower thought. Mainly trying to think outside the box to find new designs that don't require consensus rearchitecting.
>
> Use block-slot as your fork choice unless for some period of time (e.g. a few slots) you see 50%+ messages arriving later than 4s of where they should be. Then switch to current LMD GHOST until you see good latencies again.
>
> Idea is that it's good to enshrine an empty slot via fork choice if a single proposer or a couple is late (and might even be an attempted attack if high latency but isolated) but the network is generally working well. Attacker of less than 1/3 (I assume...)
> can't make a majority of the network see 50%+ messages as late, so can't put the network into standard GHOST mode purposefully to profit off of reorgs and balance attacks.
>
> If the network actually does enter into high latency mode due to actual high latencies across both blocks and attestations, it sacrifices anti-reorg properties for chain growth until latencies become normal again.
>
> The ability for an attacker to potentially split the network between the modes is certainly a concern, but we can analyze what type of attacker could do that and what additional attacks open up because of it.

#### Backoff scheme using on-chain attestations

> Danny et al.:
>
> - normal mode: block-slot fork choice
> - observe 1 slot latencies: block-slot, but a vote at slot N applies to slot `N - N%2` in that chain
> - next backoff: same, but `N - N%4`
> - etc.
>
> Consensus, voting, chain mechanisms, etc. remain stable. It's that your vote applies weight to an earlier depth if a backoff.
>
> Current thought is to use on-chain signals for latency, e.g. how long attestations take to get on-chain over ~8 slots.

## Idea 2 - Proposer weight boosting

https://github.com/ethereum/consensus-specs/pull/2353

> Aditya: A new "proposer score boost" has been introduced, which boosts the LMD score of blocks that are received in a timely manner. The boost increases the score of the block by `committee_weight // 4` for that slot. If a block for slot X is received in the first `SECONDS_PER_SLOT // ATTESTATION_OFFSET_QUOTIENT` (= 4) seconds of the slot, then the block receives the boost for the duration of slot X. After slot X is over, the boost is removed and the usual LMD score calculations from only attestations are done.

Basically, a proposer that behaves timely gets more power to tip the chain in one direction, which prevents balancing attacks. It also helps to mitigate ex ante reorgs. However, the adversary can still do sandwiching attacks:

> Our analysis says the proposer weight Wp should be ~80% of committee weight W for strongest security.
> If that's the case, I can still do a reorg à la Section 3 (slight variation with "sandwiching") with 7% stake: Propose a hidden block in n+1, vote for it with my 7% adversarial from that round. Honest block n+2 is uncle to n+1; 93% honest vote for it (because of proposer boost), adversary votes for n+1. Block n+3 is adversarial, builds on n+1. Now the chain of n+1 has 2*7% + 80% = 94% > 93%, so honest guys switch over and vote for n+3, and n+2 is forked out.

So it helps, but is not enough to mitigate them entirely. There is a tradeoff between optimizing the weight boost parameter for mitigating balancing attacks and for mitigating ex ante reorgs: a higher boost is better for preventing balancing attacks, but worse for preventing reorgs.

## Idea 3 - Changing when fork-choice rule considers attestations

#### Ignoring everything received in the same slot

What if the fork-choice rule temporarily ignored all blocks and attestations that a validator received in the current slot (including attestations from the past)?

This makes the problem even worse! The adversary now has a well-defined cutoff time to target (the beginning of the slot). Before, the adversary had to listen for the honest block release and could only then release the private block/attestation(s). Now the adversary can target the cutoff time (beginning of slot) to split the committee 50/50.

The only case where this rule change would help is that, with naive broadcasting, the honest proposer of the next slot could hear the adversary's block (in which case the honest proposer would simply build on top of it). This should happen roughly half the time. Still, the attack is made easier and works in naive scenarios half the time.

**UPDATE**

#### Randomizing attestation times

Validators use local entropy to determine when they are supposed to attest in a slot. The idea is to make attestation times unpredictable for an adversary, such that targeting committee members and splitting them into different views becomes infeasible.
Spec'ed out [here](https://github.com/ethereum/consensus-specs/pull/2101).

Again, [this does not work](https://github.com/ethereum/consensus-specs/pull/2101#issuecomment-709290392). Vitalik:

> Adversary splits the network 50/50 by broadcasting a set of attestations at exactly the time window when the `slot_timing_entropy` is right **in the middle of its probability distribution**

## Idea 4 - Allow reorgs but on hopefully empty blocks

> Joachim: Another approach would be to allow for reorgs of the chain, but set things up such that there is nothing to gain from it. We have seen in the sandwich attack that a block gets susceptible to reorgs if blocks preceding it are adversarial and withheld or get few votes (because that means there might be hidden blocks/votes which are later used in the reorg). One idea then could be that a proposal may only carry tx content if it can at the same time show that the preceding, say, 3 blocks have received at least, say, 80% of committee votes. Otherwise the proposal has to remain empty. As a result, only blocks that are proposed under somewhat predictable and stable conditions may contain txs. Other blocks remain empty. They may subsequently be reorged, but we don't care, because nobody can gain from that (because they do not contain high MEV/fees that can be "stolen"). So the incentive problem is gone.

> Caspar: One thing that makes me wary is the fact that 20% of stake doesn't even need to be adversarial for the chain to not grow (usefully, i.e. blocks with txs). It could just drop offline for whatever reason. Admittedly this is unlikely, but still. My gut feeling does not like the fact that this enshrines a hard threshold for “liveness” (again read “useful progress”). The fact that the chain continues to finalize means there is no inactivity leak mode either to quickly regain control.
> But as you said already, lowering the threshold makes it both less probable and ensures useful progress over a long enough time frame, since if more than 1/3 of stake is offline the chain can't finalize, in which case after 4 epochs the inactivity leak mode kicks in.

> Caspar: So here is how I understand your idea. We are using the usual sandwich attack as a baseline scenario: Propose a hidden block in n+1, vote for it with my 7% adversarial from that round. Honest block n+2 is uncle to n+1; 93% honest vote for it (because of proposer boost), adversary votes for n+1. Block n+3 is adversarial, builds on n+1. Now the chain of n+1 has 2*7% + 80% = 94% > 93%, so honest guys switch over and vote for n+3, and n+2 is forked out.
>
> **What changes?** We're using a metric that measures attestations voting for blocks proposed in the same slot. In the diagram I called it participation rate p.
>
> For blocks 1, 2 and 3 everything runs smoothly and p=1 respectively. In slot 4, however, the adversary's block/attestations are private and the honest validators attest to a block from an older slot (block 3) -> participation rate appears to be zero, p=0.
>
> So in slot 5, the average participation rate over the last 3 blocks drops to 0.67, which is below the defined threshold of 0.8. Hence, block 5 must be kept empty. Participation rate in slot 5 appears to be p=0.93.
>
> So in slot 6, the average participation rate over the last 3 blocks drops even further to around 0.64. Again, block 6 must be kept empty.
>
> As a result, we observe block 4 is still full, but blocks 5 and 6 are now empty. While this makes it worse for the adversary, the adversary can still extract MEV! How? Block 4 can essentially be built in slot 6 when block 6 is released. In other words, the attack buys time. Instead of having 12s listening time, the adversary now has 36s listening time. More listening time, more MEV to extract.
> It's not as clear cut anymore, because you lose out on txs that you could include in block 6, but you can still put the high-value txs into block 4 instead. Yes, it's better, but is it good enough to justify increased complexity and liveness issues?
>
> I don't think counting just any attestations is sufficient; you actually need to count the attestations that are voting on a same-slot block… In other words, if latency is bad and voting on same-slot blocks is not feasible, then there is no progress. So if latency > `SECONDS_PER_SLOT`, there is no useful progress, right (arguably even if latency > `SECONDS_PER_SLOT / 3`)?

![](https://storage.googleapis.com/ethereum-hackmd/upload_9aa0173033132f07fa96a2cf23eaaadd.jpg)
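
The participation-rate walkthrough above can be reproduced with a small sketch. It assumes the example parameters from the discussion (a window of the 3 preceding slots and an 80% threshold) and uses the per-slot rates named there. The names `window_average` and `may_carry_txs` are hypothetical helpers for illustration, not part of any spec or PR.

```python
def window_average(participation, slot, window=3):
    """Average same-slot participation rate over the `window` slots before `slot`.

    `participation` maps a slot number to the fraction of that slot's committee
    that voted for the block proposed in that same slot.
    """
    recent = [participation[s] for s in range(slot - window, slot)]
    return sum(recent) / window


def may_carry_txs(participation, slot, window=3, threshold=0.8):
    """Illustrative rule: a block may carry txs only if the preceding
    `window` slots averaged at least `threshold` participation."""
    return window_average(participation, slot, window) >= threshold


# Rates from the walkthrough: slots 1-3 run smoothly, the adversary's slot-4
# block is withheld (p appears to be 0), and slot 5 gets the 93% honest votes.
participation = {1: 1.0, 2: 1.0, 3: 1.0, 4: 0.0, 5: 0.93}

for slot in (4, 5, 6):
    avg = window_average(participation, slot)
    status = "may carry txs" if may_carry_txs(participation, slot) else "must be empty"
    print(f"block {slot}: average p = {avg:.2f} -> {status}")
```

Under these assumptions block 4 is still allowed to carry transactions, while blocks 5 and 6 fall below the threshold (averages of about 0.67 and 0.64), which matches the observation that the attack leaves block 4 full and only delays when it has to be built.
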
