
# Attack Nets [Work in Progress]

[toc]

## Goals

* Incentivize/observe a range of attacks in a production setting (clients and consensus)
* Find and patch security holes prior to mainnet launch
* Provide a clean and stable environment for those that actually want to attack (vs. just try out software)
* Minimal overhead for client teams

## Structure

For the Attack Nets, we propose that client teams and EF run the majority of nodes/validators to provide a stable testground for attackers. This is in contrast to _the multi-client testnet_, which has the goal of being controlled largely by the community.

The net is launched with 100% client team/EF control and then publicized to the community. Attackers are bound in the number of validators they can use in their attack to prevent sybil-like disruptions of the net.

We will monitor the testnet and nodes to the best of our abilities, but providing proof of attack is the burden of the attacker.

## Discussion

In today's world, we have the benefit of a rich user base that will be engaged in our primary testnets regardless of incentivization. That is, users will show up to test out client software, become familiar with their responsibilities, build additional tools, provide feedback to clients, and more without additional incentives.

Instead, we move the incentivized component out of the primary testnets and into an attack-focused zone. By providing a stable base to attack, Attack Nets have a single and pure purpose and do not become muddied with the goals of _the multi-client testnet_ (i.e. seek stability, provide a mainnet-like experience for testing software, observe and document the range of normal behaviours). The (what we expect to be) small subset of users that want to try out attacks can focus their energies on the Attack Nets, while most users will just stay on the main multi-client testnet.
Keeping _the multi-client testnet_ up and healthy, with community members making up a large portion of the nodes/validators, will likely already be a time-consuming task, so keeping Attack Nets simple and stable (except when purposefully attacked) will reduce the amount of client resources applied to the effort.

### Parameters

Need to think more about the actual values here and about the distribution of how many validators per node.

* `CLIENTS`: the number of clients participating in the attack net. The clients chosen will be those exhibiting stability on existing testnets. When a new client is ready, we'll likely restart the attack net.
* `INITIAL_VALIDATORS`: the number of validators initially seeded to the network. These are split evenly across participating clients.
* `NODES`: the total number of stable nodes on the network run by client teams and/or EF. These should be split roughly evenly by `CLIENTS`.
* `MAX_VALIDATORS_PER_ATTACK`: the maximum number of validators an attacker can use in an attack. The ratio of `MAX_VALIDATORS_PER_ATTACK` to `INITIAL_VALIDATORS` will be kept low to (1) keep attack nets generally stable wrt finality and (2) remove trivial sybil-like attacks from scope. We considered setting this value to 0 (not allowing user deposits) to optimize for stability, but not being able to craft any valid consensus messages greatly reduces the scope of attacks.

Note: All clients _must_ tag their client and version in graffiti.

## Types of attacks

Sample, not exhaustive.

TODO: discuss how to structure magnitude of payouts for successful attacks

* Finality disruption
    * Disrupt finality for at least `FINALITY_DISRUPTION` consecutive epochs.
    * This is the most basic signal that the network is not functioning as intended. "Finality disruption" can easily be coupled with one of the other attacks below.
* Network split
    * Cause some subset of nodes to split from the network for `NETWORK_SPLIT` consecutive epochs.
    * Bonus points for splitting nodes consisting of more than 1/3 of the total validator weight
    * Cause one client type to permanently split from the network due to an irreparable fork in consensus
* Crash client
    * Cause one client type to crash by interacting with its public interface. Show it can be repeated on multiple nodes.
    * Bonus points for corrupting the client into an irrecoverable state, i.e. a mere restart does not get it back online.
* Stall client
    * Cause a client to malfunction for a period of time, get out of sync, or otherwise limit its participation. Show it can be repeated on multiple nodes.
* Resource stress
    * Cause lower-end resourced nodes to fall over or other nodes to slow down by increasing CPU, memory, etc. usage. A weaker form of completely stalling.
* Proposal disruption
    * Cause `PROPOSAL_DISRUPTION` block proposals within a single epoch to either be entirely missed or orphaned.
* Forced slashing
    * By any means, cause one or more validators to be proposer or attester slashed.
* Bad Eth1 Data
    * Cause one client to disagree with the rest on its eth1data votes consistently throughout a voting period.
* Network traffic induction
    * Force a client into a state where it terrorizes other nodes with bad network traffic. E.g. false status information to trigger bad sync requests, discv5 bugs, etc.
* Security bugs
    * Leaking any key information from a client. Network key, signing keys, temporary network keys, anything.
    * Making the client sign something not as specified, e.g. sign a badly formatted attestation.
    * Making the client erroneously verify a malicious message, i.e. a fake type of signature or proof.

## Conditions

* Adding sybils only counts if it disproportionately causes network or resource stress to the attacked client. I.e. the cost-of-attack is within, or close to, the negative effect of the attack.
* Repeat attacks only as much as necessary to show and document how they work. Keep the testnet effective for other attacks once a bug is proven and documented.
* Limit environmental damage. I.e. attack the test network, not any node of a hosting provider at random.
* Follow common sense when it comes to privacy or bugs in upstream libraries. Disclose with care.
* When a non-attack network is mixed in, communicate it. The intention is to keep regular non-participating users free from collateral damage. Forcing e.g. the discv5 DHT to mix can have side-effects.
* Use common sense.

## Starters

These are tasks that are not too far out of reach, but help you get familiar ahead of a real attack.

* Spam a client with sync requests. Modify an existing client, or use tools such as Rumor or Prrkl.
* Instrument a node or tool to maintain multiple libp2p identities, without managing more Eth2 state.
* Craft and publish bad messages. E.g. use the pyspec package to craft a funny gossip or RPC message, then publish it through a client or networking tool.
* Amplify old gossip messages. Buffer the last N seconds of messages, then publish everything over again. Test the "already seen" and time-validation of clients.

## Bonus points / extras

* Document non-critical but interesting differences between client requests, responses, or other externally exposed behavior.
* Collect logs and metrics of tooling and/or clients used to run the attacks.
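
The "amplify old gossip messages" starter boils down to a time-windowed buffer plus a replay step. Below is a minimal Python sketch of that idea. It assumes a hypothetical `publish(topic, payload)` callback that you would wire to your own tooling (for example a script driving a networking tool, or a modified client). The class name, the callback, and the topic strings are illustrative only and are not part of any client or tool API.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class GossipReplayBuffer:
    """Keeps messages seen in the last `window_seconds` so they can be re-published.

    Hypothetical helper for illustration; not taken from any existing tool.
    """

    def __init__(
        self,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the buffer can be tested without sleeping
        self._messages: Deque[Tuple[float, str, bytes]] = deque()

    def _evict(self, now: float) -> None:
        # Drop messages that are older than the window (oldest are on the left).
        while self._messages and now - self._messages[0][0] > self.window_seconds:
            self._messages.popleft()

    def record(self, topic: str, payload: bytes) -> None:
        """Store a message observed on the network."""
        now = self.clock()
        self._messages.append((now, topic, payload))
        self._evict(now)

    def replay(self, publish: Callable[[str, bytes], None]) -> int:
        """Re-publish every buffered message once and return how many were sent."""
        self._evict(self.clock())
        snapshot = list(self._messages)  # copy, in case publish() triggers record()
        for _, topic, payload in snapshot:
            publish(topic, payload)
        return len(snapshot)


if __name__ == "__main__":
    buf = GossipReplayBuffer(window_seconds=10.0)
    buf.record("beacon_block", b"\x00\x01")  # placeholder bytes, not a valid message
    resent = buf.replay(lambda topic, payload: print("re-publish", topic, len(payload)))
    print("messages replayed:", resent)
```

A well-behaved client should drop the replayed messages through its "already seen" cache or its time-based validation, so the interesting observations are the cases where a message is accepted or forwarded again. Per the conditions above, keep the replay volume to what is needed to document the behavior.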