# Gas model during the overlay transition
This document presents how database access works during the overlay transition to verkle trees, and why the gas model does not need to change.
## The snapshot / flat database
Clients are converging on a flat database model. Even though not all of them have implemented it yet, we expect complete adoption in the medium term.
Reads from the flat database are cheap compared to walking the tree from node to node in order to reach the target leaf.
The reason why the gas cost of reads isn't getting any cheaper is that there is no guarantee that the snapshot will be present. Typically, a deep reorg will invalidate the snapshot.
## The conversion
In this model, it is assumed that the conversion will use the overlay method with a read-only MPT.
### Snapshot availability
Even though the presence of the snapshot is not required during normal operation, its presence can be guaranteed before the sweeping phase of the overlay method. Noting that:
1. the MPT is frozen, and
2. the conversion will not start for a few days after the fork height, in order to give clients enough time to download the preimages
there is, in fact, enough time to ensure that a snapshot is generated. For geth, snapshot generation currently takes about 2 hours on a fast NVMe disk. There is, therefore, ample time between the finalization of the fork block and the start of the snapshot sweeping.
### Updated snapshot structure
The snapshot needs to retain both MPT and verkle values. The MPT values that are "clobbered" by the verkle tree also need to be retained, so that they can be served to syncing nodes.
The snapshot is still keyed by the keccak hash of the address/slot, but the value format changes as follows:
* if both an MPT and a verkle value exist, the RLP encoding of the MPT value is concatenated with the "raw serialization" of the verkle tree node;
* if only the MPT value is present, the RLP encoding is stored. The absence of the verkle value can be determined by the fact that the decoder consumed all the bytes of the payload;
* if only a verkle value is present, then the first byte of its encoding will be `1` or `2`, which is well outside of the range for an RLP string or list header (`0x80..0xff`).
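The three cases above can be sketched as a small encoder. This is a minimal illustration, not the layout of any particular client: the function name `encode_entry` and its arguments are hypothetical, and it assumes the caller already holds the RLP encoding of the MPT value and the raw serialization of the verkle value as byte strings.

```python
from typing import Optional


def encode_entry(mpt_rlp: Optional[bytes], verkle_raw: Optional[bytes]) -> bytes:
    """Build a snapshot value from an optional MPT part and an optional verkle part.

    Hypothetical helper: the names and the validation rules are assumptions
    derived from the three cases listed in the text.
    """
    if mpt_rlp is None and verkle_raw is None:
        raise ValueError("an entry needs at least one of the two values")
    # Assumption from the text: the MPT part starts with an RLP header byte
    # in the range 0x80..0xff.
    if mpt_rlp is not None and (len(mpt_rlp) == 0 or mpt_rlp[0] < 0x80):
        raise ValueError("MPT part must start with an RLP header byte")
    # Assumption from the text: the verkle serialization starts with 1 or 2.
    if verkle_raw is not None and (len(verkle_raw) == 0 or verkle_raw[0] not in (1, 2)):
        raise ValueError("verkle part must start with byte 1 or 2")
    # The MPT part, when present, always comes first.
    return (mpt_rlp or b"") + (verkle_raw or b"")
```

Because the two leading-byte ranges do not overlap, a reader can tell which parts are present without any extra tag byte.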
### State reads
In order to read, the snapshot is searched using the traditional key structure: `key = keccak256(address) || keccak256(slot number)`.
* if an entry is not found, then the address / slot is present in neither the MPT nor the verkle tree;
* if an entry is found, then:
  * if the first byte is an integer `N < 0x80`, then the entry corresponds to an address/slot that was inserted in the verkle tree and was never present in the MPT;
  * otherwise, if the first byte resembles an RLP header, then there was a value in the MPT. Extract the length `L` of that RLP payload.
    * if the value byte length matches `L`, then the value is only present in the MPT;
    * if the value byte length is greater than `L`, then the extra bytes are the serialization of the verkle value;
    * if the value byte length is smaller than `L`, then report an error.
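The decision procedure above can be sketched as follows. This is an illustrative sketch under stated assumptions: the names `rlp_item_length` and `split_entry` are hypothetical, and `L` is interpreted here as the total encoded length of the leading RLP item, header included, so that it can be compared directly with the byte length of the stored value.

```python
from typing import Optional, Tuple


def rlp_item_length(data: bytes) -> int:
    """Return the total encoded length (header + payload) of the first RLP item."""
    if not data:
        raise ValueError("empty input")
    b0 = data[0]
    if b0 < 0x80:  # single byte, its own encoding
        return 1
    if b0 <= 0xB7:  # short string
        return 1 + (b0 - 0x80)
    if b0 <= 0xBF:  # long string: length of the length follows
        len_of_len = b0 - 0xB7
    elif b0 <= 0xF7:  # short list
        return 1 + (b0 - 0xC0)
    else:  # long list: length of the length follows
        len_of_len = b0 - 0xF7
    if len(data) < 1 + len_of_len:
        raise ValueError("truncated RLP length prefix")
    payload_len = int.from_bytes(data[1 : 1 + len_of_len], "big")
    return 1 + len_of_len + payload_len


def split_entry(value: bytes) -> Tuple[Optional[bytes], Optional[bytes]]:
    """Split a snapshot value into (mpt_rlp, verkle_raw); either may be None."""
    if not value:
        raise ValueError("empty snapshot value")
    if value[0] < 0x80:
        # Verkle-only entry: the slot never existed in the MPT.
        return None, value
    total = rlp_item_length(value)
    if len(value) < total:
        raise ValueError("snapshot value shorter than its RLP header claims")
    if len(value) == total:
        # The decoder consumed every byte: MPT-only entry.
        return value, None
    # Extra bytes after the RLP item are the verkle serialization.
    return value[:total], value[total:]
```

A missing entry is handled before this point by the database lookup itself, so `split_entry` only covers the "entry is found" branch.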
Because both MPT and verkle values can be read in a single database access, and because the I/O cost dominates the decoding cost, there is no need to update the gas cost.
### State writes
Since only the verkle tree is updated, the cost of writing to the tree is unchanged during the transition, as all internal nodes still need to be read and updated. Updating the snapshot costs about the same as before, as both the MPT and verkle values are written together.