# New sharding design with tight beacon and shard block integration Previous data shard construction: Add $n=64$ data shards to the beacon chain. Each slot, $n$ proposers independently propose their shard blobs, which are then confirmed by committees. Once a shard blob is confirmed (this can take several slots), it can be referenced by the execution chain. Here is an alternative proposal: Add a new type of transaction that can contain additional sharded data as calldata. The block is then constructed by a single block builder (which can be different from the proposer using proposer builder separation), and includes normal transactions as well as transactions with sharded calldata. This is efficient because it means that tight integrations between rollups and L1 become possible, and it is expected that this "super-block-builder" strategy will emerge in practice anyway in order to maximize MEV extraction. **Advantages:** * Tighter coupling with the current Eth1 chain -- rollup data is immediately visible to manager contracts * All data is confirmed together with the rollup calls * Do not need to wait until shard blocks are confirmed * Do not need to worry about duplicate shard data, etc. * Easier to reason about for developers * Rollups do not need to implement awkward "shard scanning logic". Data addressed to them can immediately be labeled using L1 transactions * There is no need for a separate PBS for shards * This will reduce the cost of market entry for builders, as they will not have to build up two different systems * It will also mean that MEV from interactions between data and execution can be exploited by the builder risk-free, which means less builder centralization * There is no need for a separate shard transaction payment system * The cost of sharded calldata can be directly charged to the transaction that is using them * We can use all existing Ethereum transaction payment infrastructure as is * Increased bribery resistance as data can be confirmed by larger committees (1/32 of validator set) * No separate shard block confirmation allow better transaction aggregation. We need only one committee per slot (two for PBS) * We can add beacon and execution block immediately to the data availability set * Sets stage for data availability+fraud proof protected light client execution as soon as verkle tries are ready * "Eth1 as a rollup" * Enables even lighter validators+nodes that don't need to run a full Eth1 node * Synchronous calls from L2 to L1 become possible with ZKRollups * No need for applications to commit to shards that they want to use * Total block size can be governed in a very similar way to block gas limit * When the beacon chain becomes fraud provable, superlight clients (data availability only) will be possible. * Current shard blob market is inefficient * We claim 64 independent proposers. However in practice, this is impossible, as this would lead to lots of duplicates being confirmed, which in return will lead to centralization of the market * If one shard requires 10 separate blobs to be confirmed, there is no efficient way to bid for all of them to be included in the same slot (all-or-nothing bid) * Much ligher clients are possible * Clients can do full data availability sampling with only ca. 20 kb/beacon block (1.7 kb/s), whereas ca. 1 MB/beacon block (90 kb/s) was needed in the previous construction. This is because we can immediately provide 2d sampling **Disadvantages** * Increased requirements for builders: * Compute KZG proofs for 32 MB of data in ~1s. This requires a 32-64 core CPU. I think a machine that is capable can be built for <5,000$, proof of concept to be implemented. According to conversations with Dag Arne Osvik, it's likely possible to do this on a GPU much cheaper * Larger minimum bandwidth requirements: Need to be able to send 64 MB of data within builder deadline, and make sure it is distributed in P2P. Probably needs at least 2.5 GBit upstream * Neither of these are crazy requirements for builders and other entry requirements arguably already set a higher bar * More power to the builder as they serve execution + data availability * But with PBS we need other censorship resistance methods anyway. With this system they can be treated as a separate concern. The current suggested model is for the proposer to have the power to force certain transactions to be included (https://notes.ethereum.org/Dh7NaB59TnuUW5545msDJQ) ## Suggested Implementation * New constants ```python MAX_SHARDS = 2**8 # 256 SHARD_SIZE_FIELD_ELEMENTS = 2**12 # 2**12*31 = 126976 bytes MAX_BLOCK_SIZE = SHARD_SIZE_FIELD_ELEMENTS * MAX_SHARDS # 2**20 field elements / 32,505,856 bytes TARGET_BLOCK_SIZE = MAX_BLOCK_SIZE // 2 # EIP1559 for data gas ``` * We introduce a new BlockKZGCommitment that commits to all data using a 2D KZG format. This will commit to the beacon block, the execution payload and the commitments to shard data. ```python class ShardedBeaconBlockCommitment(Container): sharded_commitments: List[KZGCommitment, 2 * MAX_SHARDS] beacon_block_commitments: uint64 # Number of commitments occupied by Beacon block + Execution Payload # Aggregate degree proof for all sharded_commitments degree_proof: KZGCommitment ``` * The `ExecutionPayload` gets new fields for the data gas market ```python class ExecutionPayload(Container): # Execution block header fields parent_hash: Hash32 fee_recipient: ExecutionAddress # 'beneficiary' in the yellow paper state_root: Bytes32 receipt_root: Bytes32 # 'receipts root' in the yellow paper logs_bloom: ByteVector[BYTES_PER_LOGS_BLOOM] random: Bytes32 # 'difficulty' in the yellow paper block_number: uint64 # 'number' in the yellow paper gas_limit: uint64 gas_used: uint64 timestamp: uint64 extra_data: ByteList[MAX_EXTRA_DATA_BYTES] base_fee_per_gas: uint256 # Extra payload fields block_hash: Hash32 # Hash of execution block transactions: List[Transaction, MAX_TRANSACTIONS_PER_PAYLOAD] # NEW fields for data gas market data_gas_limit: uint64 data_gas_used: uint64 data_base_fee_per_gas: uint256 ``` * The `ShardedBeaconBlockCommitment` up to `beacon_block_commitments` is the beacon block (which includes the execution payload), encoded into 31 bytes per field element * The remainder of the data does not have any validity conditions * Add a new transaction type 3, with the following payload specification: ```python @dataclass class TransactionKZGCalldataPayload: chain_id: int = 0 signer_nonce: int = 0 max_priority_fee_per_gas: int = 0 max_fee_per_gas: int = 0 # New data gas max_priority_fee_per_data_gas: int = 0 max_fee_per_data_gas: int = 0 gas_limit: int = 0 destination: int = 0 amount: int = 0 payload: bytes = bytes() # KZG Calldata kzg_calldata_commitments: List[KZGCommitment, MAX_SHARDS] access_list: List[Tuple[int, List[int]]] = field(default_factory=list) signature_y_parity: bool = False signature_r: int = 0 signature_s: int = 0 ``` * For each transaction of type 3, there is a validity condition that requires that all `kzg_calldata_commitments` of the transaction are included in the `sharded_commitments` of the `ShardedBeaconBlockCommitment` \_start}x]_2)$ * Add a new opcode to the EVM, `KZGCALLDATA`, that adds the `kzg_calldata_commitments` to a specified memory location * Verify that the sharded commitments are correctly encoded * For each commitment, there is a degree proof, proving that the degree of the commitment is `< SHARD_SIZE_FIELD_ELEMENTS`, checked via the pairing equation: * $e(\mathrm{degree\_proof}, [1]_2) = e(\mathrm{sharded\_commitment}, [s^{\mathrm{SETUP\_MAX\_DEGREE} - \mathrm{SHARD\_SIZE\_FIELD\_ELEMENTS}}]_2)$ * If there are $2N$ sharded commitments, the first $N$ of them represent data, and the remaining $N$ are a polynomial extension. We need to verify that the $2N$ lie on a polynomial of degree $N-1$. * Standard way: Do an FFT of size $2N$ to verify that the high order $N$ coefficients are $0$. This requires $2N \log 2N$ group multiplications, which is relatively expensive * Cheaper way: use the [barycentric formula](https://dankradfeist.de/ethereum/2021/06/18/pcs-multiproofs.html#evaluating-a-polynomial-in-evaluation-form-on-a-point-outside-the-domain) to evaluate the first $N$ commitments on a random point in the field, and do the same for the second half. If both points are the same then they are on the same polynomial of degree $N-1$. This only requires one MSM of size $2N$, which takes time of ca. $2N / \log 2N$ group multiplications. * Add `BLS12_381` opcodes (necessary for any reasonable sharding implementation) ## Sampling * We have up to 256 sharded commitments for the original data, which are extended to up to 512 using polynomial extension * Each sharded commitments is of degree $2^{12}$ and extended to $2^{13}$ field elements for sampling * Let's assume a sample size of $2^{4} = 16$ field elements (512 bytes). Then we are sampling a square of 512 rows and 512 columns for a total of $512*512 = 2^{18} = 262144$ samples per block. Any 50% of these samples are sufficient to reconstruct the whole block. Additionally, any row/column can be reconstructed from 50% of samples, which allows efficient recovery by small actors like at home validators even in the event where a block is intentionally poorly seeded * Each validator is assigned to 32 random columns, which they also have to custody in their next attestation (meaning that the attestation will custody a total of $32\cdot32\cdot512$ samples). This has the nice side effect that each block gets bribing resistance of 1/32 of the full validator set (bribing less than 1/2 of the validators attesting to one block is useless) ## Block Builder Costs Block builders get two new tasks. One is computing the KZG witnesses for the samples, and the other one is seeding the samples in the P2P network. * Let's start with the second one. Given that we have about 128 MB of samples, and we say want to distribute each sample to at least 5 peers for good dispersion, and all of this should happen in ~1s, the builder needs a total upstream of 640 MB/s or 5 GBit/s. This is certainly unlikely for an at-home internet connection, but easy enough in a data center * If people really want to run a builder at home, a service can distribute the samples after they were created, reducing their own upstream requirement to 32 MB/s or 256 MBit/s which is possible in some regions * However, the requirement that builders run in datacenters does not seem particularly limiting (they will most likely do that anyway) * Computing the KZG proofs for all samples. This is going to take ~200-300 seconds using FK20 according to my estimates (the uncertainty comes from the precise complexity of an MSM algorithm, which is unknown). This can be distributed across many machines. In a datacenter this computing power is easily available. * Dag Arne Osvik estimates that the witness computation can be completed in ~1s on a GPU # Sharded mempool ## Rationale The current mempool that most transactions go through on Ethereum does not scale to data transactions. Every node processes the full mempool, but with sharding the capacity of the chain increases to 1.3 MB/s, which is also the lower bound for the expected mempool bandwidth if all transactions go through it. Clearly it makes no sense to shard data transactions on chain while requiring nodes to process the full mempool. We instead need to do data availability sampling on the mempool as well when data transactions are concerned. ### Is a mempool still required? An alternative to sending transactions to a public mempool is to send them to a block builder, which is effectively already happening now with Flashbots. The advantage is that block builders can promise not to front run transactions. If they do, then users can simply switch to another block builder. Solana, as an example, already implements this design and by default only sends transactions to the next proposer. It is therefore quite likely that the mempool will be much smaller in size and not even be required for the chain at all. However, we want to keep a mempool in the design for two reasons: 1. The new [censorship resistance design](https://notes.ethereum.org/Dh7NaB59TnuUW5545msDJQ) by Francesco requires a public mempool This does not contradict the earlier point that most transactions will be sent directly to a builder first. Basically users will mostly send their transactions to builders, and only if they get censored they will escalate to the mempool; the mempool becomes a "censored pool" 2. An Ethereum user with a full node should have a way of sending transactions without configuring a central provider. I find this argument less convincing as the future will largely consist of rollups on L1, and end users using L2s. So end users should very rarely have to use the L1. ## Construction All transactions are broadcastet via a libp2p pubsub channel. For transaction type 3, the `kzg_calldata_commitments` are included, but not the actual underlying data. We construct a second set of 512 pubsub channels that are only for propagating samples of shard data transactions. When someone wants to send a type 3 transaction to the mempool, they send the 512 samples of each `kzg_calldata_commitment` to these sample channels. * Normal nodes subscribe to ~10 randomly selected channels. This is to ensure that these pubsubs are available to nodes that want to subscribe to them * Block proposers sample a random 30 channels, and can thus determine if a data transaction is available or not, in order to be able to include them into the `crList` * Block builders subscribe to at least 256, probably all 512 channels, so that they can reconstruct any data transaction for inclusion into the block