# Mainnet 2MB data experiment
**Goal**: test sustained distribution of roughly max-size (~2MB) blocks on mainnet
**Method**: abuse CALLDATA in a series of "spam" transactions to create roughly max-sized blocks under the current 30M gas limit. Ideally these blocks are sequential, testing a build-up in load. Additionally, the CALLDATA should be generally random, and thus incompressible.
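As a rough sizing check (a sketch assuming EIP-2028 calldata pricing of 16 gas per non-zero byte; random bytes are almost all non-zero):

```python
# Back-of-the-envelope: how much random CALLDATA fits in a 30M gas block?
GAS_LIMIT = 30_000_000
GAS_PER_NONZERO_BYTE = 16   # EIP-2028 calldata pricing
TX_BASE_GAS = 21_000        # intrinsic cost per transaction

# Ignoring per-tx overhead, the calldata ceiling per block:
print(GAS_LIMIT // GAS_PER_NONZERO_BYTE)   # 1_875_000 bytes ~= 1.8 MB

# With 128 KB of random calldata per transaction:
tx_gas = TX_BASE_GAS + 128 * 1024 * GAS_PER_NONZERO_BYTE   # ~2.12M gas
print(GAS_LIMIT // tx_gas)                 # ~14 such txs fill a block
```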
### Data collection
* Sentry nodes
    * Initial receiving time for each test block
    * Initial receiving time for each associated individual attestation (mapped to validator), noting any missing attestations (requires subscribing to all subnets)
    * Initial receiving time for each aggregate
    * Import time for blocks
    * Bandwidth usage changes
    * Log `forkChoiceUpdated` message request/response times on the Engine API port (a proxy sketch follows this list)
        * Tells us the time it takes for the CL to issue a call and for the EL to fetch the block and validate it
    * Attestation propagation time
* Chain data
    * Blocks making it into the canonical chain
    * Proposal delays & subsequent late proposals/blocks
    * Attestations making it into the canonical chain
        * correctness (head, source, target)
        * inclusion delay
    * Validator Sync Committee performance
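For the `forkChoiceUpdated` timings, one client-agnostic option is a thin pass-through proxy on the Engine API port: point the CL's execution endpoint at the proxy and let it forward everything (including the JWT `Authorization` header) to the real EL. A minimal sketch, assuming the EL listens on the default port 8551; addresses and ports are placeholders:

```python
import json
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

EL_URL = "http://localhost:8551"  # real Engine API endpoint (placeholder)

class EngineProxy(BaseHTTPRequestHandler):
    """Forwards Engine API calls and logs forkChoiceUpdated round-trip times."""

    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        method = json.loads(body).get("method", "")
        # Forward the request verbatim, JWT Authorization header included.
        req = urllib.request.Request(EL_URL, data=body, headers=dict(self.headers))
        start = time.monotonic()
        with urllib.request.urlopen(req) as resp:
            payload = resp.read()
        elapsed_ms = (time.monotonic() - start) * 1000
        if method.startswith("engine_forkchoiceUpdated"):
            print(f"{time.time():.3f} {method} {elapsed_ms:.1f}ms")
        # JSON-RPC errors travel in the body, so a blanket 200 is fine here.
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

HTTPServer(("127.0.0.1", 8552), EngineProxy).serve_forever()
```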
**Re sentry nodes**:
* A Prysm node can be configured to output the requisite data. We should check whether at least one other client can produce the same data.
* Ideally run (or use other existing nodes) across many regions and many bandwidth constraints (5, 10, 25, 100 Mbps and 1 Gbps)
* If possible, compare full-attnet nodes against nodes subscribed to only 1 to 2 attnets (in case this drastically changes mesh connectedness)
* [List of latency-related metrics for various CL clients](https://hackmd.io/@dapplion/rJWMXd98j)
### Analysis
For a given block size at a non-arbitrary sustained rate (e.g. 5 to 10 slots in a row), how much do our key indicators -- both p2p arrival times and on-chain data -- begin to degrade? This requires pulling *baseline* data as well as data from during the experiment.
#### Chaindata analysis
Chaindata is a valuable proxy for how efficiently/effectively messages are being delivered by their expected times. The questions below should be answerable from the chain data; a sketch of pulling one such indicator follows the list. We have not predefined a particular level of degradation that we would be willing to accept -- ideally none, but this experiment will help inform conversations and debates about block size.
* Does orphan rate increase?
* Does attestation inclusion delay increase?
    * Random validators or consistent?
* Does head correctness decrease?
    * Random validators or consistent?
* Does attestation packing efficiency degrade?
* Sync committee performance degradation (random or consistent)?
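As an illustration, inclusion delay (and missed/orphaned slots) can be pulled from canonical chain data via the standard Beacon API. A minimal sketch, with the beacon node URL and slot range as placeholders:

```python
import requests

BEACON = "http://localhost:5052"  # placeholder beacon node URL

def inclusion_delays(slot: int) -> list[int]:
    """Inclusion delay of every attestation packed into the block at `slot`."""
    resp = requests.get(f"{BEACON}/eth/v2/beacon/blocks/{slot}")
    resp.raise_for_status()  # 404 => no canonical block at this slot
    block = resp.json()["data"]["message"]
    return [int(block["slot"]) - int(att["data"]["slot"])
            for att in block["body"]["attestations"]]

# Compare an experiment window against a same-length baseline window.
for slot in range(6_000_000, 6_000_010):  # placeholder slot range
    try:
        delays = inclusion_delays(slot)
        print(slot, sum(delays) / len(delays))
    except requests.HTTPError:
        print(slot, "missed/orphaned")
```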
#### Sentry node data analysis
Success is generally defined as some degradation being acceptable so long as messages still generally arrive within the expected slot sub-boundaries. The general idea is to pick an EIP-4844 value at or below a value shown to be successful in this mainnet experiment.
Consistent vs random degradation across validator indices helps indicate whether an issue is network-wide or particular to the sender's hardware/setup.
Differences in view between sentry nodes help indicate asymmetries in network delivery across geographies and resource levels (useful for understanding the impact of message delivery on *user* nodes as well as validators). A comparison sketch follows the list below.
* Blocks:
    * What is the difference in first block receive time between sentry nodes? Does it begin to exceed 3s into the slot?
* Attestations:
    * What is the aggregate difference in first attestation receive time? Does it begin to exceed 7s (broadcast time + 3s) into the slot?
    * Are there consistent validators with degraded performance, or is it more random?
* Aggregates:
    * What is the aggregate difference in aggregate receive times? Does it begin to exceed 11s into the slot?
    * Are inclusion problems consistent to particular validators, or random?
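A sketch of the cross-sentry comparison, assuming each sentry exports `(slot, arrival_unix_ts)` rows for first block arrivals to a CSV; the node names and file layout are hypothetical, and the 3s threshold follows the question above:

```python
import pandas as pd

GENESIS = 1_606_824_023  # mainnet beacon chain genesis timestamp
SECONDS_PER_SLOT = 12
SENTRIES = ["sentry-eu-1gbps", "sentry-us-10mbps"]  # hypothetical names

frames = []
for name in SENTRIES:
    df = pd.read_csv(f"{name}.csv")  # columns: slot, arrival_unix_ts
    # Seconds past the start of the slot at which the block first arrived.
    df["into_slot"] = df["arrival_unix_ts"] - (GENESIS + df["slot"] * SECONDS_PER_SLOT)
    frames.append(df.set_index("slot")["into_slot"].rename(name))

merged = pd.concat(frames, axis=1)
merged["spread"] = merged[SENTRIES].max(axis=1) - merged[SENTRIES].min(axis=1)
print(merged.describe())
# Slots where any sentry first saw the block later than 3s into the slot:
print(merged[merged[SENTRIES].max(axis=1) > 3.0])
```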
### Open Questions
#### Data collection
* Where do we send the data? How do we best visualize it across all client types and node operator setups?
    * EF DevOps to spin up a common Prometheus instance that everyone can write metrics to?
#### Transactions
* How best to "fill" blocks? 30M-gas transactions? Multiple "big" transactions? Is there a cap on transaction size for inclusion and/or gossip? How best to submit transactions? Public mempool? mev-boost? Special nonce ordering?
    * For v1 we will want to ramp up block size anyway, so using larger and larger sets of 128 KB transactions is probably sufficient. These can be submitted either via the public mempool or mev-boost (see the sketch below).
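A sketch of one such spam transaction submitted through the public mempool with web3.py (v6-style API assumed; the RPC URL, key, chain id, and fee values are placeholders):

```python
import os
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))  # placeholder RPC
acct = w3.eth.account.from_key("0x" + "11" * 32)       # placeholder key

calldata = os.urandom(128 * 1024)  # random bytes => effectively incompressible
tx = {
    "type": 2,
    "chainId": 1,
    "to": acct.address,            # self-send; only the calldata matters
    "value": 0,
    "data": calldata,
    "nonce": w3.eth.get_transaction_count(acct.address),
    # 16 gas per non-zero byte (EIP-2028); treating all bytes as non-zero
    # slightly overestimates, which is safe for a gas limit.
    "gas": 21_000 + len(calldata) * 16,
    "maxFeePerGas": w3.to_wei(50, "gwei"),
    "maxPriorityFeePerGas": w3.to_wei(2, "gwei"),
}
signed = acct.sign_transaction(tx)
print(w3.eth.send_raw_transaction(signed.rawTransaction).hex())
```

One caveat relevant to the gossip-cap question above: geth's default txpool rejects transactions whose full encoded size exceeds roughly 128 KiB, so the calldata payload likely needs to sit slightly below 128 KB per transaction.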
**Similar Experiment**: Starkware ran a [similar experiment](https://ethereum-magicians.org/t/eip-2028-transaction-data-gas-cost-reduction/3280/35) in the context of EIP-2028 to assess the effect of blocks carrying large amounts of CALLDATA on the network before lowering the CALLDATA gas price. This is an attempt to run a similar experiment with much more data gathering.