# Preventing Eth2 Validator Failure
<center>
Aditya Asgaonkar, Carl Beekhuizen
Ethereum Research
</center>
---
[toc]
## Modes of Failure
- **Beacon Node**:
    - Liveness:
        - Suggest attestation/block proposal on non-canonical chain
        - Go offline
- **Validator Client**:
    - Safety:
        - Key Safety: Validator key compromise
        - Slashing Safety: Producing slashable attestations/blocks
    - Liveness:
        - Go offline
## Preventing Failure
Different types of failures are prevented in different ways:
- Liveness: Prevent through redundancy in the associated components
- Safety:
    - Key Safety: Secret-share the validator key across separate instances
    - Slashing Safety: Simple enough for Phase 0: no-slash logic at the VC instances suffices. [Things get complicated for Phase 1+](https://github.com/ethereum/eth2.0-specs/issues/1969) and possibly warrant fundamental changes in the BN/VC architecture. That is out of scope for this document.
Preventing both types simultaneously requires a Byzantine Agreement protocol (or stronger) for the redundant/secret-shared instances to agree on what attestations/blocks to produce. It's very important to identify the requirements of tolerating failures of various types before building a suitable protocol.
## Proposals for SSV protocols
### Type 1
**Objective:**
Protect against VC failures: key safety, slashing safety, and liveness
**Protocol:**
The setup is a single BN instance and multiple secret-shared VCs (SSVCs):
1. BN sends `msg` to all SSVCs
2. SSVC signs if `msg` passes the no-slashing check
3. BN receives signature shares from SSVCs, recovers threshold signature, and gossips on p2p.
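The no-slashing check in step 2 can be sketched as follows for attestations. This is a minimal illustration, not a client implementation: the `Attestation` and `NoSlashDB` names are hypothetical, `data_root` stands in for the hash of the attestation data, and the check covers only the two Phase 0 attestation slashing conditions (double vote and surround vote). Block proposals would need an analogous per-slot check.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attestation:
    source_epoch: int
    target_epoch: int
    data_root: str  # stand-in for the hash of the attestation data


@dataclass
class NoSlashDB:
    """Hypothetical per-SSVC record of every attestation already signed."""
    signed: list = field(default_factory=list)

    def is_safe(self, att: Attestation) -> bool:
        for prev in self.signed:
            # Double vote: same target epoch, different attestation data.
            if prev.target_epoch == att.target_epoch and prev.data_root != att.data_root:
                return False
            # Surround vote: the earlier attestation surrounds the new one...
            if prev.source_epoch < att.source_epoch and att.target_epoch < prev.target_epoch:
                return False
            # ...or the new one surrounds the earlier attestation.
            if att.source_epoch < prev.source_epoch and prev.target_epoch < att.target_epoch:
                return False
        return True

    def sign_if_safe(self, att: Attestation) -> bool:
        """Record the attestation and return True only if signing is not slashable."""
        if not self.is_safe(att):
            return False
        self.signed.append(att)
        return True
```

Each SSVC would consult its own DB before releasing a signature share, so a slashable `msg` from the BN is rejected independently at every instance.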
**Notes:**
- Suitable for running SSVCs on low-powered devices
- BN failure is catastrophic for liveness
- Key safety is "free" as it comes from the threshold key mechanism
- Byzantine VCs have no effect on safety, so only crash faults need be considered. ∴ an arbitrary threshold can be set with the desired safety-liveness tradeoff.
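
One way to read the threshold tradeoff, under the assumption (not stated above) of $n$ SSVCs with a $t$-of-$n$ threshold key, where $c$ is the number of crashed SSVCs:

$$
\text{signing stays live} \iff c \le n - t, \qquad \text{a slashable signature needs} \ge t \text{ corrupted no-slash DBs}
$$

For example, with $n = 4$, choosing $t = 3$ tolerates 1 crash and requires 3 corrupted DBs, while $t = 2$ tolerates 2 crashes but requires only 2.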
### Type 2
**Objective:**
Protect against all types of BN and VC failures
**Protocol:**
The setup is a number of BNs and a number of VCs which all run the following protocol:
1. Agreement among BNs:
    - The BNs run a BFT reliable broadcast with leader change, similar to [this protocol specification](https://notes.ethereum.org/@adiasg/ssv-rbb#Protocol-Specification)
2. BNs suggest `msg` to SSVCs:
    - All BNs broadcast the agreed-upon `msg` to all SSVCs
3. SSVC signature:
    - If a "good" `msg` is received from $2\cdot f_{BN} + 1$ BNs, then the SSVC signs `msg` and broadcasts its signature share to all BNs
    - If not, then the SSVC sends `LEADER_CHANGE` to all BNs
4. Signature aggregation/leader change in BNs:
    - If $2\cdot f_{SSVC} + 1$ secret-share signed `msg`s are received, then aggregate the signature and gossip on p2p
    - If $2\cdot f_{SSVC} + 1$ `LEADER_CHANGE` messages are received, then change leader in the local view and restart the protocol
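The quorum decisions in steps 3 and 4 can be sketched as below. This is a rough illustration under several assumptions: messages are modelled as plain strings, `is_good` is a caller-supplied predicate standing in for the "good" check (including the no-slash check), leader rotation is round-robin, and timeouts, signatures, and networking are omitted. The function names are hypothetical.

```python
from collections import Counter

LEADER_CHANGE = "LEADER_CHANGE"


def quorum(f: int) -> int:
    """Quorum size 2f + 1 for f tolerated faulty nodes."""
    return 2 * f + 1


def ssvc_step(received: dict, f_bn: int, is_good) -> str:
    """Step 3. `received` maps BN id -> msg suggested by that BN.

    Returns the msg to sign, or LEADER_CHANGE if no good msg reached quorum.
    """
    counts = Counter(msg for msg in received.values() if is_good(msg))
    for msg, n in counts.items():
        if n >= quorum(f_bn):
            return msg
    return LEADER_CHANGE


def bn_step(responses: dict, f_ssvc: int, leader: int, num_bns: int) -> tuple:
    """Step 4. `responses` maps SSVC id -> signed msg or LEADER_CHANGE.

    Returns ("aggregate", msg), ("leader_change", new_leader), or ("wait", leader).
    """
    counts = Counter(responses.values())
    for value, n in counts.items():
        if n >= quorum(f_ssvc):
            if value == LEADER_CHANGE:
                # Assumed round-robin rotation; the source does not fix a rule.
                return ("leader_change", (leader + 1) % num_bns)
            return ("aggregate", value)
    return ("wait", leader)
```

With $f_{BN} = f_{SSVC} = 1$, both quorums are 3, so an SSVC signs only after three BNs suggest the same good `msg`, and a BN aggregates only after three matching signature shares arrive.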
**Notes:**
- The number of BNs and the number of VCs depend on the requirements, and need not be the same.
- Running BN & VC instances on different machines can reduce risk from hardware failure.
- BNs having each other as p2p peers aligns the chain-views of correct nodes up to the bounds of network latency.
## Random thoughts
- Swiss cheese security model
- Run one BN/VC from each client.
- Adding 2 network-level no-slash devices provides the same Slashing Safety as 4 SSVCs
    - In both cases, 3 no-slash DBs need to be corrupted to produce slashable messages
    - This does, however, come with a liveness compromise
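
The comparison in the last point can be tallied with a small sketch. The configuration is an assumption, not something stated above: the in-series setup is taken to be the VC's own no-slash DB plus the 2 network-level devices, and the 4 SSVCs are taken to use a 3-of-4 threshold. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setup:
    name: str
    dbs_to_corrupt: int     # no-slash DBs an attacker must corrupt
    crashes_tolerated: int  # components that may go offline without halting signing


def in_series(num_filters: int) -> Setup:
    # Every filter must approve a message, so all must be corrupted,
    # and a crash of any single filter halts signing.
    return Setup("in-series no-slash filters", num_filters, 0)


def threshold(n: int, t: int) -> Setup:
    # Any t shares produce a signature: t DBs to corrupt, n - t crashes tolerated.
    return Setup(f"{t}-of-{n} SSVCs", t, n - t)


series = in_series(1 + 2)  # assumed: the VC's own DB plus 2 network-level devices
shared = threshold(4, 3)   # assumed: 4 SSVCs with a 3-of-4 threshold

for setup in (series, shared):
    print(setup)
```

Under these assumptions both setups require 3 corrupted DBs, but the in-series setup tolerates no crashes while the threshold setup tolerates one, which is the liveness compromise noted above.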