# nft devnets
## nft-devnet-0
### Intent:
To perform long range non-finality sync tests. We trigger non-finality by shutting down nodes, then start them back up after ~1d and observe both the time to sync back to head and the time for the network to self-heal. We additionally have some nodes/validators run by core devs on actual home internet.
### Conclusion
- We need to spend some time understanding the results from a p2p perspective, especially relating to the topic of fetching batches from peers efficiently
### General setup:
- 1M validators, mainnet distribution, keep some keys for devs to run at home
- fill it with blobs @ target and txs with eth_calls
Client split:
```
cl_split = {
'prysm': 0.35,
'lighthouse': 0.30,
'teku': 0.25,
'lodestar': 0.01,
'nimbus': 0.08,
'grandine': 0.01
}
el_split = {
'geth': 0.43,
'nethermind': 0.36,
'ethereumjs': 0.00,
'reth': 0.02,
'besu': 0.16,
'erigon': 0.03,
'nimbusel': 0.00,
}
```
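The splits above are expected to sum to 1.0. A minimal sketch of turning these weights into per-client node counts; the 530-node total is a hypothetical figure for illustration, not necessarily the devnet's actual size:

```python
# Validate the CL split from above and allocate nodes from it.
cl_split = {
    'prysm': 0.35, 'lighthouse': 0.30, 'teku': 0.25,
    'lodestar': 0.01, 'nimbus': 0.08, 'grandine': 0.01,
}

# The weights should describe the whole network.
assert abs(sum(cl_split.values()) - 1.0) < 1e-9

def node_counts(split, total):
    """Largest-remainder rounding so the counts sum exactly to `total`."""
    raw = {c: total * w for c, w in split.items()}
    counts = {c: int(v) for c, v in raw.items()}
    leftover = total - sum(counts.values())
    # Hand leftover nodes to the clients with the largest fractional parts.
    for c in sorted(raw, key=lambda c: raw[c] - counts[c], reverse=True)[:leftover]:
        counts[c] += 1
    return counts

counts = node_counts(cl_split, 530)  # 530 nodes is an assumed example size
assert sum(counts.values()) == 530
```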
### Experiment #1:
Network params:
- Cloud VMs; most nodes are 8c/32GB with 20mbps down & 10mbps up, geographically distributed
- ~5% of nodes have gigabit connections
### Timeline #1:
- We started the network with 5% of the nodes having gigabit connections and 95% having 20/10mbit
- During the initial phase, we observed ~95% participation and no forking. This phase lasted a few hours and saw no blobs or transactions being sent
- On the 22nd at around 7PM CEST, we enabled the transaction and blob spammers
- Shortly after, the nodes were unable to keep up with DA verification in time for their duties
- We saw non-finality occur as a result, with participation dropping to ~40%
- The root cause was deemed to be rate limiting and the network's inability to fetch blobs for DA in time, with most nodes running on 20mbps/10mbps connections
- Around 00:01AM on the 23rd, we started removing the network bandwidth constraints to check whether all nodes would recover
- Over the next few hours, participation did indeed climb slowly upwards. However, during non-finality prysm was mostly on a fork; the cause might be a peering issue
- With the bandwidth limits removed, nodes did eventually finalize. Some outliers, such as a few teku and prysm nodes, continued to stay on the fork until restarted; a restart was mostly enough to fix these nodes
### Takeaway #1:
- Mainnet as it is today would likely not survive every node having only a 20mbps/10mbps connection with a huge geographic distribution
- According to some studies (e.g. [Probelabs](https://probelab.io/ethereum/discv5/2024-46/#cloud-hosting-rate)), mainnet is made up of ~56% datacenter nodes (presumably with gigabit internet) and ~44% home nodes (with varying network limits)
- Our mental model of the Ethereum p2p layer probably needs some reworking; the assumption of a flat network of equally low-bandwidth nodes probably isn't true for our current DA setup
- The bottleneck for the client at 20mbps/10mbps seems to be fetching data to validate DA within the slot time limits (perhaps due to peer rate limits or other limitations). At gigabit links, the DA data-fetching bottleneck seems to be solved, but the node still takes ~20mins to sync to head (downloading the data at theoretical line speed would take ~20s). The bottleneck for long range sync doesn't seem to be raw bandwidth (most nodes were peaking at 100mbps of traffic, so they had ~10x bandwidth headroom), but instead seems related to batch verification/peering/rate limiting/peer download limits; we need to narrow this down in future tests
- We need to apply bandwidth constraints on more feature devnets as well, probably as a default on our stack based on mainnet measurements
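As a rough sanity check on the ~20s line-speed figure above, the back-of-the-envelope arithmetic works out as follows. The per-slot payload size is an assumption (block plus blobs at target, ~350 KB), not a measurement:

```python
# Back-of-the-envelope check: a day of non-finality slots downloaded at
# gigabit line speed vs the observed ~20min sync time.
SLOTS = 7200                      # ~1 day at 12s slots
BYTES_PER_SLOT = 350 * 1024       # assumed block + blobs-at-target payload
LINK_BPS = 1_000_000_000          # gigabit link

total_bits = SLOTS * BYTES_PER_SLOT * 8
line_speed_seconds = total_bits / LINK_BPS   # ~21s, in line with the ~20s estimate
observed_seconds = 20 * 60                   # observed ~20min sync

# The observed sync is ~60x slower than raw line speed, which is why
# bandwidth is unlikely to be the bottleneck at gigabit links.
slowdown = observed_seconds / line_speed_seconds
```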
### Next steps:
- We will apply bandwidth limits to ~20% of nodes, set to 100mbps down / 20mbps up. The other 80% of nodes continue to be gigabit.
- We will take down 33% of the network at random, allow non-finality for a duration and then start the nodes back up and observe sync time.
### Experiment #2:
Modifications from experiment #1:
- Modified the network parameters to have 80% of nodes with 1000mbps/1000mbps down/up. 20% of nodes (chosen at random) have 100mbps down & 20mbps up.
- Updated [ethereum-metrics-exporter](https://github.com/ethpandaops/ethereum-metrics-exporter/releases/tag/v0.26.0) with a PR to disable event stream subscriptions by default
- Added `NODE_OPTIONS: "--max-old-space-size=16384"` to lodestar
- Added `--Xp2p-sync-batch-size=5 --Xp2p-sync-max-pending-batches=50` to Teku
### Timeline #2:
- With the above changes, we had ~1d of finality and regular functioning of the network, i.e. a network with 80% gigabit connections works without the issues we saw with only 5% gigabit nodes
- At epoch 458, ~33% of nodes (176) were taken offline, leading to a participation rate of ~63% and the start of the non-finality period (this was ~5PM CEST).
- At epoch ~677, the nodes were slowly brought back online (not all at once, due to the time it takes to run the ansible playbook).
- At epoch ~688 we started seeing finality again, at roughly ~5:10PM CEST (so ~1d of non-finality).
- The restarted nodes were able to sync through ~1d of non-finality slots back to head within ~30-40mins, as shown in the graph [here](https://grafana.observability.ethpandaops.io/d/MRfYwus7k/nodes?orgId=1&var-consensus_client=prysm&var-execution_client=All&var-network=nft-devnet-0&var-filter=ingress_user%7C%21~%7Csynctest.%2A&viewPanel=34&from=1732549494999&to=1732551889881). Note: the all-client dashboard is extremely noisy, but filtering per client using the filter on top gives a better view.
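The ~1d non-finality figure checks out against the epoch numbers, since an epoch is 32 slots of 12s (6.4 minutes):

```python
# Convert the epoch gap from the timeline above into wall-clock time.
SECONDS_PER_EPOCH = 32 * 12                # 6.4 minutes per epoch

non_finality_epochs = 688 - 458            # outage start to finality returning
hours = non_finality_epochs * SECONDS_PER_EPOCH / 3600   # ~24.5h, i.e. ~1d
```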
### Takeaway #2:
- A network with a healthy set of backbone nodes is able to easily finalize and manage a significant portion of the network going offline/returning
- Even during the syncing period, no node was using more than ~160mbps of traffic. This meant the gigabit nodes still had at least ~5x bandwidth headroom to serve data.
- We need to retry scenarios with bandwidth constraints.
### Experiment #3:
- Rate limit all nodes to have 100/50 mbps connections and no supernodes.
### Timeline #3:
- At ~22:30 CEST the rate limit was applied to all nodes; this corresponds to epoch 736
- The network with rate limits functioned without issues. We decided to introduce non-finality as the next step.
- At epoch ~746, about 33% of the network was taken offline. This corresponds to ~23:30 CEST.
- At epoch ~847 (~10:30AM CEST), lodestar was updated to use the `nft-devnet-0-231a347` branch and commit
- At epoch ~857 (11:30AM CEST), the nodes were brought back online and started to range sync to head
### Takeaway #3:
- The nodes were all hitting 50mbps of bandwidth usage (both up and down), despite download bandwidth being capped at 100mbps.
- Despite the lack of backbone nodes, a flat network with 100/50mbps connections is able to sustain itself with blob and transaction spamming
### Experiment #4:
- Rate limit all nodes to have 50/25 mbps connections and no supernodes.
### Timeline #4:
- Rate limit was applied at epoch 879 (~13:50 CEST)
- Network was stable and finalizing with seemingly no effect on the participation rates
- A new rate limit was applied as the next test at epoch ~886 (14:30 CEST)
### Takeaway #4:
- The flat network of 50/25mbps is able to sustain itself with blob and transaction spamming
### Experiment #5:
- Rate limit all nodes to have 30/15 mbps connections and no supernodes.
### Timeline #5:
- At epoch ~909 (17:00 CEST) we turned off one of each CL/EL (6% of the network in total) to test sync duration
- At epoch ~929 (19:00 CEST) we turned all the nodes back on and let them sync back to head
### Takeaway #5:
- The flat network of 30/15mbps is able to sustain itself with blob and transaction spamming
- There was a drop in head vote %: pre-rate-limit we saw >92%, while post-rate-limit we see between 83% and 90% head vote. Total vote dropped by 1% from before the new rate limit.
- A few randomly sampled nodes managed to sync up after the 2h of being shut down, they synced up in < 10min and indicated no issues in syncing
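The implied sync rate from that observation: ~2h of missed slots replayed in under 10 minutes means a sustained rate of at least one slot per second:

```python
# Rough sync rate implied by the sampled nodes above.
SLOT_SECONDS = 12

missed_slots = 2 * 3600 // SLOT_SECONDS        # 600 slots missed while down
sync_seconds = 10 * 60                          # upper bound on observed sync time
slots_per_second = missed_slots / sync_seconds  # >= 1 slot/s sustained
```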
### Experiment #6:
- Rate limit all nodes to have 20/10 mbps connections and no supernodes. Applied at epoch ~936 (20:00 CEST)
### Takeaway #6:
- We saw degraded performance on the network once the rate limit was applied
- The network continues to finalize, but it is extremely brittle at these limits
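For quick reference, the caps stepped through in the rate-limit experiments above, in the same dict style as the client split:

```python
# Recap of the bandwidth caps from experiments #3-#6 (down/up in mbps;
# no supernodes in any of these runs), as recorded in this log.
rate_limits_mbps = {
    3: (100, 50),
    4: (50, 25),
    5: (30, 15),
    6: (20, 10),
}

# Each step tightens the cap; the network stayed healthy down to 30/15
# and turned brittle at 20/10, so the tipping point sits between the two.
downs = [down for down, up in rate_limits_mbps.values()]
assert downs == sorted(downs, reverse=True)
```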