---
eip: XXXX
title: Gas costs for code chunk access
author: Vitalik Buterin (@vbuterin), [TBD]
discussions-to: YYYY
status: Draft
type: Standards Track
category: Core
created: 2021-ZZ-WW
---

-------------------------

-------------------------

-------------------------

# Note

The author of this document now favors this alternative proposal to achieve the same goal: https://notes.ethereum.org/@vbuterin/witness_gas_cost_2

-------------------------

-------------------------

-------------------------

## Simple Summary

Add a gas charge for accessing each individual "chunk" (contiguous 31-byte segment) of code, to bound the maximum witness size of a block.

## Motivation

There is a strong desire to allow **stateless verification** of blocks. A client should be able to verify the correctness of any individual block without any extra information except for a small file that any state-holding node can generate, called a **witness**, that contains the portion of the state accessed by the block along with proofs of correctness.

Stateless verification has important benefits, including allowing clients to run in low-disk-space environments, enabling semi-light-client setups where clients trust blocks by default but stand ready to verify any specific block in the case of an alarm, and enabling sharding setups where clients jump between shards frequently and cannot keep up with the state of all shards.

However, stateless verification is not practical at present because a worst-case block accesses a very large amount of state and the witness size would be prohibitively large.

Most of these issues have either already been resolved or are expected to be resolved soon. [EIP 2929](./eip-2929.md) increased the cost of random state access to ~1900-2600 gas, reducing the maximum amount of state that can be accessed. A move to Verkle trees will decrease the witness size to a mere ~200 bytes per account or storage slot access.
These two steps together limit witness data from storage or account accesses to ~1.5 MB in a 15M gas block (with the exception of possible tree attacks that could increase the worst case to ~3 MB).

But there is one major remaining hole: contract code. A contract's code can contain up to ~24000 bytes (24576 under the EIP-170 limit), and so a single 2600 gas `CALL` can add ~24200 bytes to the witness. In a 15M gas block, this implies a worst-case witness size of over 100 MB. The solution is to move away from storing code as a single monolithic hash, and instead break it up into chunks that can be proven separately; this can be done simultaneously with a move to a Verkle tree.

However, to actually bound witness size, we also need to add a gas rule that accounts for these costs. This EIP proposes an [EIP-2929](./eip-2929.md)-like "accessed code chunks" list.

## Specification

### Parameters

| Constant | Value |
| - | - |
| `CHUNK_SIZE` | 31 |
| `CHUNK_ACCESS_COST` | 350 |
| `ACCESS_LIST_CHUNK_ACCESS_COST` | 330 |

### Changes

When executing a transaction, maintain a set `accessed_code_chunks: Set[Tuple[Address, int]]`.

When a chunk of code not in the set is accessed, charge `CHUNK_ACCESS_COST` gas, and add that chunk to the set. Only chunks `0 ... (code_size - 1) // CHUNK_SIZE` can be accessed or charged for (for accounts with empty code, that's the empty set); attempts to access code chunks beyond this range are neither charged for nor added to the `accessed_code_chunks` set.

We determine when a "chunk of code is accessed" as follows:

* At each step of EVM execution, chunk `PC // CHUNK_SIZE` (where `PC` is the current program counter) of the callee is accessed. In particular, note the following corner cases:
    * The destination of a `JUMP` (or positively evaluated `JUMPI`) is considered to be accessed, even if the destination is not a jumpdest or is inside pushdata.
    * The destination of a `JUMPI` is not considered to be accessed if the jump conditional is false.
    * The destination of a jump is not considered to be accessed if execution reaches the jump opcode but does not have enough gas to pay for executing the `JUMP` opcode (including the chunk access cost if the `JUMP` is the first opcode in a not-yet-accessed chunk).
* If the current step of EVM execution is a `PUSH{n}`, all chunks `(PC // CHUNK_SIZE) <= chunk_index <= ((PC + n) // CHUNK_SIZE)` of the callee are accessed.
* If a nonzero-read-size `CODECOPY` or `EXTCODECOPY` reads bytes `x...y` inclusive, all chunks `(x // CHUNK_SIZE) <= chunk_index <= (min(y, code_size - 1) // CHUNK_SIZE)` of the accessed contract are accessed.
    * Example 1: for a `CODECOPY` with start position 100, read size 50, `code_size = 200`, `x = 100` and `y = 149`, so chunks 3 and 4 are accessed.
    * Example 2: for a `CODECOPY` with start position 600, read size 0, no chunks are accessed.
    * Example 3: for a `CODECOPY` with start position 1500, read size 2000, `code_size = 3100`, `x = 1500`, `y = 3499` and `min(y, code_size - 1) = 3099`, so chunks 48 through 99 are accessed.
* `CODESIZE`, `EXTCODESIZE` and `EXTCODEHASH` do NOT access any chunks.

We extend EIP 2930 by changing the access list format to allow address records in the access list to have either length 2 (as in EIP 2930) or length 3. If they have length 3, the three items are as follows:

1. The address (as in EIP 2930)
2. A list of storage keys (as in EIP 2930)
3. A list of chunk indices, where each index is a `uint16` in little-endian format (i.e. a 2-byte integer, with an appended zero byte even if the index is below 256)

Charge `ACCESS_LIST_CHUNK_ACCESS_COST` for each chunk in the access list, and add those chunks to `accessed_code_chunks`.

## Rationale

In the current proposed Verkle tree specification, all "account header" fields and code are stored in a single depth-2 subtree of the state trie. The data overhead for accessing all of this tree is ~400 bytes (~150 bytes for the Merkle branch for the subtree, plus 48 bytes for the sub-root commitment and 48 bytes for each of the four sub-tree commitments).
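
As a rough cross-check of the overhead figure above and of the gas-per-byte ratios discussed in the rest of this section, the arithmetic can be reproduced in a few lines of Python. This is only an illustrative sketch: the constants are the ones quoted in this document, while the helper names (`subtree_overhead_bytes`, `gas_per_witness_byte`) are hypothetical and not part of any specification or client.

```python
# Constants quoted in this document.
CHUNK_ACCESS_COST = 350       # from the Parameters table
BASE_CALL_COST = 2600         # base gas cost of calling an account
BRANCH_BYTES = 150            # approximate size of the branch proof
COMMITMENT_BYTES = 48         # size of one commitment
CODE_SUBTREES = 4             # number of code sub-trees
CHUNK_WITNESS_BYTES = 32 + 2  # 32-byte stored chunk + 2-byte chunk index


def subtree_overhead_bytes() -> int:
    """Hypothetical helper: branch + sub-root commitment + sub-tree commitments."""
    return BRANCH_BYTES + COMMITMENT_BYTES + CODE_SUBTREES * COMMITMENT_BYTES


def gas_per_witness_byte(chunks: int, overhead_bytes: int = 400) -> float:
    """Hypothetical helper: gas paid per witness byte for a call touching `chunks` chunks."""
    gas = BASE_CALL_COST + CHUNK_ACCESS_COST * chunks
    witness_bytes = overhead_bytes + CHUNK_WITNESS_BYTES * chunks
    return gas / witness_bytes


print(subtree_overhead_bytes())                             # 390, rounded to ~400 in the text
print(round(gas_per_witness_byte(4), 2))                    # 7.46
print(round(CHUNK_ACCESS_COST / CHUNK_WITNESS_BYTES, 2))    # 10.29
```

The default `overhead_bytes=400` uses the rounded figure from the text rather than the exact sum of 390, so that the result matches the ~7.5 gas per byte quoted in this section.
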
The base gas cost of calling an account is 2600 gas. For a contract call to require the full 400-byte witness, it must access all four code sub-trees and therefore incur the per-chunk cost at least four times (once per sub-tree). This pushes the total witness size to 400 + 34 * 4 = 536 bytes (a chunk is 32 bytes + 2 bytes for the chunk index), with a total gas cost of 2600 + 350 * 4 = 4000; hence, the witness gas cost is ~7.5 gas per byte. Additional chunk accesses add 350 gas for 34 bytes, or ~10.3 gas per byte. This is in the same neighborhood as the gas cost to witness data ratio of other state accesses.

The chunk size in this EIP is set to 31 bytes so that one byte can be prepended to each chunk saved in the Verkle tree, representing the number of bytes at the start of the chunk that are pushdata.

## Backwards Compatibility

This EIP will increase the cost of calling some contracts significantly. The access list extension ensures that this does not break backwards compatibility, much like the original introduction of EIP 2930 ensured that EIP 2929 does not break backwards compatibility.

[Empirical analysis TBD]

## Security Considerations

[TBD]

## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).