# Verkle State Migration Discussion
*Aug 10, 2023*
### Introduction
On a recent Verkle Implementers Call, we discussed a potential alternative strategy for migrating state from the Merkle Patricia Tree (MPT) to Verkle. The purpose of this document is to aid the discussion by providing an overview of the two proposals now being discussed, their shared characteristics, and (most importantly) the pros/cons of the unique aspects of each approach.
* **Proposal 1: "Read-only Merkle Tree"** (approach currently being implemented)
* **Proposal 2: "Writeable Merkle Tree"** (approach discussed on the most recent call)
Prior to going through the pros/cons of each approach, let’s start with what the two approaches have in common.
**Shared characteristics of the two proposals**
* Two sets of trees: (1) the base set and (2) the overlay set. The base set contains an account tree and all the storage trees (Merkle), while the overlay set consists of a single Verkle tree that is initially empty.
* During the “Sweeping Phase”: for each block, a fixed number of leaves is migrated from the MPT to Verkle. After a leaf is written to Verkle, it is deleted from the MPT. This keeps disk space requirements low.
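
As a rough illustration of the sweeping phase, the following Python sketch models both trees as plain dictionaries. The name `sweep_step`, the `leaves_per_block` parameter, and the sorted iteration order are assumptions made for illustration. Real clients iterate over trie leaves and re-key each leaf for the Verkle tree, which is omitted here. The check that avoids overwriting an existing overlay entry is also an assumption, added so that a newer overlay write is not clobbered by older base data.

```python
from typing import Dict


def sweep_step(
    base: Dict[bytes, bytes],
    overlay: Dict[bytes, bytes],
    leaves_per_block: int,
) -> int:
    """Migrate up to `leaves_per_block` leaves from base (MPT stand-in)
    to overlay (Verkle stand-in). Returns the number of leaves moved."""
    moved = 0
    # Sorted order stands in for the sweep iterator walking the key space.
    for key in sorted(base)[:leaves_per_block]:
        # Assumption: a value already in the overlay is newer, so keep it.
        if key not in overlay:
            overlay[key] = base[key]
        # Delete after copying so the leaf is never stored twice.
        del base[key]
        moved += 1
    return moved
```

Calling `sweep_step` once per block drains the base set at a bounded rate, and because each leaf is deleted right after it is copied, total storage stays roughly constant during the transition.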
---
## Proposal #1: Merkle Tree is read-only
For more details, see this [writeup](https://notes.ethereum.org/@parithosh/verkle-transition#%E2%80%9COverlay%E2%80%9D-live-conversion-method) from Pari and Guillaume, and this [EthCC talk](https://www.youtube.com/watch?v=F1Ne19Vew6w) from Guillaume.
### Summary
* The base MPT is read-only. Any writes end up in the overlay Verkle tree.
* On lookups, the Verkle tree (VKT) is read first. If the key is not in the VKT, it is read from the MPT. Any data that had to be read from the MPT is then migrated to the VKT, so that further reads succeed in the VKT.
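
The read and write rules above can be sketched as follows. This is a minimal model, not client code: the class name `OverlayState` and its methods are hypothetical, both trees are represented as dictionaries, and the same key is used in both trees even though real Verkle keys are derived differently.

```python
from types import MappingProxyType
from typing import Dict, Mapping, Optional


class OverlayState:
    """Hypothetical model of Proposal 1: read-only base, writable overlay."""

    def __init__(self, base: Dict[bytes, bytes]) -> None:
        # MappingProxyType gives a read-only view, mirroring the read-only MPT.
        self.base: Mapping[bytes, bytes] = MappingProxyType(dict(base))
        self.overlay: Dict[bytes, bytes] = {}  # Verkle stand-in, initially empty

    def read(self, key: bytes) -> Optional[bytes]:
        # 1. Try the overlay (Verkle) first.
        if key in self.overlay:
            return self.overlay[key]
        # 2. Fall back to the base (MPT).
        value = self.base.get(key)
        if value is not None:
            # 3. Migrate on read so the next lookup succeeds in the overlay.
            self.overlay[key] = value
        return value

    def write(self, key: bytes, value: bytes) -> None:
        # All writes land in the overlay; the base is never modified here.
        self.overlay[key] = value
```

In this model a first read of an unmigrated key touches both trees, which is the situation behind the gas cost concern listed under the cons of this approach.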
### Pros of this approach
* Sync:
    * The advantage of this approach is the simplicity of its sync:
        * All new data uses Verkle sync and all old data uses snap sync.
        * The base tree is downloaded using snap sync.
        * Healing of the overlay tree is done from the block witness.
        * Since the base tree is read-only, snap sync downloads up-to-date data and its healing phase is unnecessary.
* Reorgs:
    * Reorgs of the Merkle tree are relatively easy since the tree is read-only. The Verkle tree uses the regular method to deal with reorgs.
* Small blocks:
    * The proofs packaged in blocks are Verkle-only, and therefore much smaller than Merkle proofs would be.
### Cons of this approach
* Unfair/invalid gas cost during the transition:
    * The gas cost discrepancy between Verkle and MPT means that the cost of a read that falls through to the MPT is either taken as the sum of two reads (which is potentially unfair) or the extra MPT read is just ignored (which might be an attack vector).
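
To make the two options concrete, here is a small arithmetic sketch. The cost constants are arbitrary placeholders chosen for illustration and are not taken from any EIP or client. The function names are likewise hypothetical.

```python
# Placeholder costs for illustration only; NOT taken from any EIP or client.
MPT_READ_COST = 200
VERKLE_READ_COST = 100


def charge_sum(found_in_verkle: bool) -> int:
    """Option A: charge for every tree that was actually touched."""
    if found_in_verkle:
        return VERKLE_READ_COST
    # The read missed in Verkle and fell through to the MPT.
    return VERKLE_READ_COST + MPT_READ_COST


def charge_verkle_only(found_in_verkle: bool) -> int:
    """Option B: ignore the MPT read and always charge the Verkle price."""
    return VERKLE_READ_COST


def uncharged_work(found_in_verkle: bool) -> int:
    """Work performed under option B that is not paid for."""
    return charge_sum(found_in_verkle) - charge_verkle_only(found_in_verkle)
```

With these placeholder values, a fall-through read costs 300 under option A and 100 under option B. Under option A the user pays for a lookup miss they could not avoid. Under option B the 200 units of unpaid MPT work are what an attacker could try to exploit.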
---
## Proposal #2: Writeable Merkle Tree
### Summary
* Depending on where a read/write falls relative to the position of the sweep iterator, it ends up in either the MPT or the Verkle tree.
* On lookups, the head of the iterator performing the sweep determines which of the overlay or base tree is read/written.
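
A minimal sketch of this routing rule, under stated assumptions: the class `SweepRoutedState`, its `cursor` field, and the `finished` flag are hypothetical names, trees are modelled as dictionaries, and the sweep is assumed to walk keys in ascending order so that "behind the iterator" means `key <= cursor`.

```python
from typing import Dict, Optional


class SweepRoutedState:
    """Hypothetical model of Proposal 2: the sweep position picks the tree."""

    def __init__(self, base: Dict[bytes, bytes]) -> None:
        self.base: Dict[bytes, bytes] = dict(base)  # writable MPT stand-in
        self.overlay: Dict[bytes, bytes] = {}       # Verkle stand-in
        self.cursor: Optional[bytes] = None         # last key migrated so far
        self.finished = False                       # True once base is drained

    def _migrated(self, key: bytes) -> bool:
        # Keys at or behind the cursor already live in the overlay.
        if self.finished:
            return True
        return self.cursor is not None and key <= self.cursor

    def read(self, key: bytes) -> Optional[bytes]:
        tree = self.overlay if self._migrated(key) else self.base
        return tree.get(key)  # exactly one tree is consulted

    def write(self, key: bytes, value: bytes) -> None:
        tree = self.overlay if self._migrated(key) else self.base
        tree[key] = value     # exactly one tree is modified

    def sweep(self, leaves_per_block: int) -> None:
        # Every key left in base is ahead of the cursor, so plain sorted
        # order continues the sweep from where it stopped.
        for key in sorted(self.base)[:leaves_per_block]:
            self.overlay[key] = self.base.pop(key)
            self.cursor = key
        if not self.base:
            self.finished = True
```

Each access touches a single tree, which is where the gas benefit comes from. The cost is that `base` keeps changing after the fork, which is what complicates sync, as discussed in the cons of this approach.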
### Pros of this approach
* Gas cost during the transition:
    * At most one tree is read from or written to, so there is no “overcharging” of gas for reads.
### Cons of this approach
* Sync becomes much more complicated:
    * Two concurrent syncs: MPT and Verkle tree.
    * The healing phase for the MPT continues beyond the fork.
    * The question of packaging Merkle proofs in blocks arises, which would lead to very large blocks.
* Reorgs:
    * Two trees have to be reorged instead of one. This is not a huge disadvantage compared to the other method, since the same must happen in case of a reorg through the fork.
* Consensus:
    * Because the MPT is updated, the block needs to have two headers, or two half-headers.
---
## Conclusion
The decision boils down to choosing between a simpler sync (Proposal #1) and a clearer gas model (Proposal #2). The idea behind Proposal #2 has previously been suggested for [binary trees](https://eips.ethereum.org/EIPS/eip-2584), but was judged too complex with regard to sync, leading to the simplified “current” proposal. Identifying which of these problems has the most viable solution will likely sway the decision in favor of one method or the other.