---
eip: XXXX
title: Sharding-format blob-carrying transactions (simple version)
author: Vitalik Buterin (@vbuterin), Dankrad Feist (@dankrad)
discussions-to: https://example.lol
status: Draft
type: Standards Track
category: Core
created: 2022-02-02
---

## Note

**This version is probably deprecated, and the [full version](https://notes.ethereum.org/@vbuterin/blob_transactions) of this EIP is currently favored.**

## Simple Summary

Introduce a new transaction format for "blob-carrying transactions" which contain a large amount of data that cannot be accessed by EVM execution, but whose commitment can be accessed. The format is intended to be fully compatible with the future Danksharding spec.

## Motivation

Rollups are in the short and medium term, and possibly in the long term, the only trustless scaling solution for Ethereum. Transaction fees on L1 have been very high for months and there is greater urgency in doing anything required to help facilitate an ecosystem-wide move to rollups. Rollups are significantly reducing fees for many Ethereum users: Optimism and Arbitrum frequently provide fees that are ~3-8x lower than the Ethereum base layer itself, and ZK rollups, which have better data compression and can avoid including signatures, have fees ~40-100x lower than the base layer. However, even these fees are too expensive for many users.

The long-term solution to the long-term inadequacy of rollups by themselves has always been data sharding, which would add ~16 MB per block of dedicated data space to the chain that rollups could use. However, data sharding will still take a considerable amount of time to finish implementing and deploying. This EIP provides a stop-gap solution until that point by implementing the _transaction format_ that would be used in sharding, but not actually sharding those transactions. Instead, they would simply be part of the beacon block and would need to be downloaded by all consensus nodes (but can be deleted after only a relatively short delay). There would be a reduced cap on the number of these transactions that can be included, corresponding to a target of ~1 MB per block and a limit of ~2 MB.
## Specification

### Parameters

| Constant | Value |
| - | - |
| `SYSTEM_STATE_ADDRESS` | `0x000.....0100` |
| `TOTAL_BLOB_TXS_STORAGE_SLOT` | `0` |
| `BLOB_TX_TYPE` | `Bytes1(0x05)` |
| `CHUNKS_PER_BLOB` | `4096` |
| `BLS_MODULUS` | `52435875175126190479447740508185965837690552500527637822603658699938581184513` |
| `KZG_SETUP_G2` | `Vector[G2Point, CHUNKS_PER_BLOB]`, contents TBD |
| `KZG_SETUP_LAGRANGE` | `Vector[BLSCommitment, CHUNKS_PER_BLOB]`, contents TBD |
| `KZG_COMMITMENT_VERSION` | `Bytes1(0x01)` |
| `COMMITMENT_PRECOMPILE_ADDRESS` | TBD |
| `COMMITMENT_PRECOMPILE_GAS` | `1800000` |
| `POINT_EVALUATION_PRECOMPILE_ADDRESS` | TBD |
| `POINT_EVALUATION_PRECOMPILE_GAS` | `50000` |
| `LOAD_COMMITMENT` | `0x49` (opcode) |
| `MAX_BLOBS_PER_BLOCK` | `16` |
| `TARGET_BLOBS_PER_BLOCK` | `8` |
| `MAX_BLOBS_PER_TX` | `2` |
| `GASPRICE_UPDATE_FRACTION_PER_BLOB` | `64` |
| `LONG_TERM_MAX_BLOBS` | `1024` |

### Helpers

Converts a blob to its corresponding KZG point:

```python
def blob_to_kzg(blob: Vector[BLSFieldElement, CHUNKS_PER_BLOB]) -> KZGCommitment:
    computed_kzg = bls.Z1
    for value, point_kzg in zip(blob, KZG_SETUP_LAGRANGE):
        assert value < BLS_MODULUS
        computed_kzg = bls.add(
            computed_kzg,
            bls.multiply(point_kzg, value)
        )
    return computed_kzg
```

Converts a KZG point into a versioned commitment:

```python
def kzg_to_commitment(kzg: KZGCommitment) -> Bytes32:
    return KZG_COMMITMENT_VERSION + hash(kzg)[1:]
```

Verifies a KZG evaluation proof:

```python
def verify_kzg_proof(polynomial_kzg: KZGCommitment,
                     x: BLSFieldElement,
                     y: BLSFieldElement,
                     quotient_kzg: KZGCommitment):
    # Verify: P - y = Q * (X - x)
    X_minus_x = bls.add(KZG_SETUP_G2[1], bls.multiply(bls.G2, BLS_MODULUS - x))
    P_minus_y = bls.add(polynomial_kzg, bls.multiply(bls.G1, BLS_MODULUS - y))
    return bls.pairing_check([
        [P_minus_y, bls.neg(bls.G2)],
        [quotient_kzg, X_minus_x]
    ])
```

Approximates `2 ** (numerator / denominator)`, with the simplest possible approximation that is continuous and has a continuous derivative:

```python
def fake_exponential(numerator: int, denominator: int) -> int:
    cofactor = 2 ** (numerator // denominator)
    fractional = numerator % denominator
    return cofactor + (
        fractional * cofactor * 2 +
        (fractional ** 2 * cofactor) // denominator
    ) // (denominator * 3)
```

### New transaction type

We introduce a new [EIP-2718](https://eips.ethereum.org/EIPS/eip-2718) transaction type, with the format being the single byte `BLOB_TX_TYPE` followed by an SSZ encoding of the `SignedBlobTransaction` container comprising the transaction contents:

```python
class SignedBlobTransaction(Container):
    header: BlobTransaction
    blobs: List[Vector[BLSFieldElement, CHUNKS_PER_BLOB], LONG_TERM_MAX_BLOBS]
    signature: ECDSASignature

class BlobTransaction(Container):
    nonce: uint64
    gas: uint64
    gasprice: uint256
    to: Address  # Bytes20
    value: uint256
    data: Bytes
    blob_kzgs: List[KZGCommitment, LONG_TERM_MAX_BLOBS]

class ECDSASignature(Container):
    v: uint8
    r: uint256
    s: uint256
```

We validate a transaction and compute its origin as follows:

```python
def validate_and_get_origin(tx: SignedBlobTransaction):
    assert len(tx.blobs) == len(tx.header.blob_kzgs) <= MAX_BLOBS_PER_TX
    # A practical implementation would accelerate this by checking
    # a single random linear combination of the blobs and kzgs
    for blob, kzg in zip(tx.blobs, tx.header.blob_kzgs):
        assert kzg == blob_to_kzg(blob)
    sig = tx.signature
    return ecrecover(ssz.hash_tree_root(tx.header), sig.v, sig.r, sig.s)
```

We also check that there are at most `MAX_BLOBS_PER_BLOCK` total blob commitments in a valid block.
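The random-linear-combination optimization mentioned in the comment above is not specified here, but a minimal sketch in the same pseudocode style may clarify the idea. It reuses `bls`, `blob_to_kzg`, `BLS_MODULUS` and `CHUNKS_PER_BLOB` from this spec; `random_field_element()` is a hypothetical source of verifier-local randomness:

```python
def validate_blobs_batched(blobs, kzgs):
    # Hypothetical helper: uniformly random scalar in [0, BLS_MODULUS)
    r = random_field_element()
    powers = [pow(r, i, BLS_MODULUS) for i in range(len(blobs))]
    # Fold the blobs chunk-wise: combined_blob[j] = sum_i r**i * blobs[i][j]
    combined_blob = [
        sum(p * blob[j] for p, blob in zip(powers, blobs)) % BLS_MODULUS
        for j in range(CHUNKS_PER_BLOB)
    ]
    # Fold the commitments with the same coefficients: sum_i r**i * kzgs[i]
    combined_kzg = bls.Z1
    for p, kzg in zip(powers, kzgs):
        combined_kzg = bls.add(combined_kzg, bls.multiply(kzg, p))
    # Since blob_to_kzg is linear in the blob, a single equality check
    # suffices; it fails with overwhelming probability over the choice of r
    # if any individual (blob, kzg) pair is inconsistent.
    assert blob_to_kzg(combined_blob) == combined_kzg
```

This replaces `len(blobs)` large multi-scalar multiplications over `KZG_SETUP_LAGRANGE` with a single one, at the cost of one small multi-scalar multiplication over the commitments.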
The network layer and RPC APIs are expected to forget and stop serving the blob contents 30 days (precisely: `30 * 86400` seconds) after the start of the slot of the block in which the transaction is included. When serving the contents of expired blob transactions, the `blobs` array should be replaced with an empty list.

### Commitment opcode

We add an opcode `LOAD_COMMITMENT` which takes as input one stack argument `index`, and returns `kzg_to_commitment(tx.header.blob_kzgs[index])` if the transaction is blob-carrying and `index < len(tx.header.blob_kzgs)`, and otherwise zero.

### Commitment verification precompile

Add a precompile at `COMMITMENT_PRECOMPILE_ADDRESS` that checks a blob against a commitment hash. The precompile costs `COMMITMENT_PRECOMPILE_GAS` and executes the following logic:

```python
def commitment_verification_precompile(input: Bytes) -> Bytes:
    # First 32 bytes = expected versioned commitment hash
    expected_commitment = input[:32]
    assert expected_commitment[:1] == KZG_COMMITMENT_VERSION
    # Remaining bytes = the full blob, as little-endian data points
    assert len(input) == 32 + 32 * CHUNKS_PER_BLOB
    input_points = [
        int.from_bytes(input[i:i+32], 'little')
        for i in range(32, len(input), 32)
    ]
    assert kzg_to_commitment(blob_to_kzg(input_points)) == expected_commitment
    return Bytes([])
```

Note that this precompile takes versioned commitments as input, allowing it to be future-proof.

### Point evaluation precompile

Add a precompile at `POINT_EVALUATION_PRECOMPILE_ADDRESS` that evaluates a proof that a particular blob resolves to a particular value at a point. The precompile costs `POINT_EVALUATION_PRECOMPILE_GAS` and executes the following logic:

```python
def point_evaluation_precompile(input: Bytes) -> Bytes:
    # Verify P(x) = y
    # Versioned commitment hash: first 32 bytes
    root_commitment = input[:32]
    # Evaluation point: next 32 bytes
    x = int.from_bytes(input[32:64], 'little')
    assert x < BLS_MODULUS
    # Expected output: next 32 bytes
    y = int.from_bytes(input[64:96], 'little')
    assert y < BLS_MODULUS
    # The remaining data will always be the proof, including in future versions
    # Input kzg point: next 48 bytes
    data_kzg = input[96:144]
    assert kzg_to_commitment(data_kzg) == root_commitment
    # Quotient KZG: next 48 bytes
    quotient_kzg = input[144:192]
    assert verify_kzg_proof(data_kzg, x, y, quotient_kzg)
    return Bytes([])
```
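For illustration (not part of the specification), a caller could assemble the 192-byte input expected by this precompile as follows; `commitment_hash`, `x`, `y`, `data_kzg` and `quotient_kzg` are assumed to be produced by the caller's proof system:

```python
def encode_point_evaluation_input(commitment_hash: bytes, x: int, y: int,
                                  data_kzg: bytes, quotient_kzg: bytes) -> bytes:
    # Layout mirrors the parsing above: 32-byte versioned commitment hash,
    # two 32-byte little-endian field elements, then two 48-byte G1 points.
    assert len(commitment_hash) == 32
    assert len(data_kzg) == 48 and len(quotient_kzg) == 48
    assert x < BLS_MODULUS and y < BLS_MODULUS
    return (
        commitment_hash +
        x.to_bytes(32, 'little') +
        y.to_bytes(32, 'little') +
        data_kzg +
        quotient_kzg
    )  # 192 bytes total
```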
### Gas price update rule

We propose a simple independent EIP 1559-style targeting rule to compute the gas cost of the transaction. We use the `TOTAL_BLOB_TXS_STORAGE_SLOT` storage slot of the `SYSTEM_STATE_ADDRESS` address to store the persistent data needed to compute the cost. Note that unlike existing transaction types, the gas cost is dependent on the pre-state of the block.

```python
def get_intrinsic_gas(tx: SignedBlobTransaction, pre_state: ExecState) -> int:
    return (
        21000 +
        16 * len(tx.header.data) -
        12 * tx.header.data.count(0) +
        len(tx.blobs) * get_blob_gas(pre_state)
    )

def get_blob_gas(pre_state: ExecState) -> int:
    blocks_since_start = get_current_block(pre_state) - FORK_BLKNUM
    expected_total = blocks_since_start * TARGET_BLOBS_PER_BLOCK
    actual_total = read_storage(
        pre_state,
        SYSTEM_STATE_ADDRESS,
        TOTAL_BLOB_TXS_STORAGE_SLOT
    )
    if actual_total < expected_total:
        return 0
    else:
        return fake_exponential(
            actual_total - expected_total,
            GASPRICE_UPDATE_FRACTION_PER_BLOB
        )
```

We update the running total at the end of each block, adding the number of blobs included in that block:

```python
def update_blob_gas(state: ExecState, blobs_in_block: int):
    current_total = read_storage(
        state,
        SYSTEM_STATE_ADDRESS,
        TOTAL_BLOB_TXS_STORAGE_SLOT
    )
    new_total = current_total + blobs_in_block
    write_storage(
        state,
        SYSTEM_STATE_ADDRESS,
        TOTAL_BLOB_TXS_STORAGE_SLOT,
        new_total
    )
```

## Rationale

This EIP introduces blob transactions in the same format in which they are expected to exist in the final sharding specification. This provides temporary scaling relief for rollups by allowing them to scale to 2 MB per slot, with a separate fee market allowing fees to be very low while usage of this system is limited.

The core goal of rollup scaling stopgaps is to provide temporary scaling relief, without imposing extra development burdens on rollups to take advantage of this relief. Today, rollups use calldata. In the future, rollups will have no choice but to use sharded data (also called "blobs") because it is much cheaper. Hence, rollups cannot avoid making a large upgrade to how they process data at least once along the way. But what we _can_ do is ensure that rollups need to _only_ do one upgrade. This immediately implies that there are exactly two possibilities for a stopgap: (i) reducing the gas costs of existing calldata, and (ii) bringing forward the format that will be used for sharded data, but not yet actually sharding it. EIP-4488 is a solution of category (i); this EIP is a solution of category (ii).

The main tradeoff in designing this EIP is that of implementing more now versus having to implement more later. This EIP attempts to implement the minimum possible amount now, leaving quite a few things unimplemented. The following changes would still remain to be done when sharding is introduced:

* Including blob KZGs in the beacon block
* Low-degree extending those blobs to enable 2D sampling
* Actually sharding the blobs (meaning, using subnets to broadcast blobs and data availability sampling them instead of including them as part of the block)

Additionally, this EIP only includes some very basic precompiles for accessing the blob contents: one precompile for verifying full blob contents and another for verifying point evaluations (useful for [linking the blob data to a commitment in another proof system](https://ethresear.ch/t/easy-proof-of-equivalence-between-multiple-polynomial-commitment-schemes-to-the-same-data/8188), needed by ZK rollups). The idea of also adding a precompile for verifying _parts_ of blobs was considered and rejected. This does increase development costs for some kinds of rollups, but it is a tradeoff to keep this EIP simpler and more practically implementable.

The gas price update rule follows the principles of [multidimensional EIP 1559](https://ethresear.ch/t/multidimensional-eip-1559/11651) with an [exponential pricing rule](https://ethresear.ch/t/make-eip-1559-more-like-an-amm-curve/9082). It also uses state variables rather than block header components to store the data needed to compute updates. A possible extension would be to also replace the current EIP 1559 implementation with a copy of the same system (eg. storage slot 1 would store the basefee, and it would be updated in the same way).
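As a non-normative illustration of the resulting fee dynamics: the per-blob gas returned by `get_blob_gas` grows by a factor of roughly `2 ** (1 / GASPRICE_UPDATE_FRACTION_PER_BLOB)` per blob of excess, so a sustained excess of 64 blobs above target doubles the per-blob price:

```python
# Illustration only: per-blob gas as a function of the excess
# (actual_total - expected_total), using the constants above
for excess in [0, 32, 64, 128, 256]:
    print(excess, fake_exponential(excess, GASPRICE_UPDATE_FRACTION_PER_BLOB))
# Prints (0, 1), (32, 1), (64, 2), (128, 4), (256, 16) --
# i.e. roughly 2 ** (excess / 64), floored to an integer.
```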
## Backwards Compatibility

### Blob non-accessibility

Blob contents are not accessible after 30 days. This is a significant change to the execution API that breaks an implied invariant that exists today: that the results of a history query will not change.

### Mempool issues

Blob transactions are unique in that they have a variable intrinsic gas cost. Hence, a transaction that could be included in one block may be invalid for the next. To prevent mempool attacks, we recommend a simple technique: only propagate transactions whose `gas` is at least twice the current minimum.

Additionally, blob transactions have a large data size at the mempool layer, which poses a mempool DoS risk, though not an unprecedented one, as this also applies to transactions with large amounts of calldata. The risk is that an attacker makes and publishes a series of large blob transactions with fees `f1 < f2 < ... < f9`, where each fee is one 10% minimum replacement increment higher than the previous, and finishes it off with a 21000-gas basic transaction with fee `f10`. Each replacement is propagated across the entire network, but only the final cheap transaction is included on-chain. Hence, an attacker could impose millions of gas worth of load on the network and only pay 21000 gas worth of fees.

We recommend a simple solution: both for blob transactions and for transactions carrying a large amount of calldata, increase the minimum increment for mempool replacement from 1.1x to 2x, decreasing the number of resubmissions an attacker can make at any given fee level by ~7x.

## Test Cases

TBD.

## Security Considerations

This EIP increases the size of the execution payload by a maximum of ~2 MB. This is equal to the theoretical maximum size of a block today (30M gas / 16 gas per calldata byte = 1.875M bytes), so it will not greatly increase worst-case bandwidth. Post-merge, block times are expected to be static rather than following an unpredictable Poisson distribution, giving a guaranteed period of time for large blocks to propagate.

The _sustained_ load of this EIP is lower than that of alternatives (eg. EIP-4488 and EIP-4490), because there is no existing software that stores the blobs and there is no expectation that they need to be stored for as long as the rest of the data. Implementing a policy of deleting these blobs after eg. 30-60 days (a much shorter delay than the EIP-4444 history expiry period) would require some special-case logic for blobs, but it would not break other applications.

## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).