# A state expiry and statelessness roadmap
The Ethereum state size is growing quickly. It is currently around 35 GB for just the state and over 100 GB including all Merkle proofs, and is increasing by roughly half that amount per year. State storage is additionally a weak point in Ethereum's economics: it is the only mechanism by which a participant can pay a cost once to burden consensus nodes forever. In order to maintain the scalability and sustainability of Ethereum, we need some solution.
Two paths to a solution have existed for a long time, **weak statelessness** and **state expiry**:
* **State expiry:** remove from the state anything that has not been accessed recently (think: within the last year), and require witnesses to revive expired state. This would reduce the state that everyone needs to store to a flat ~20-50 GB.
* **Weak statelessness:** only require block proposers to store state, and allow all other nodes to verify blocks statelessly. Implementing this in practice requires a switch to [Verkle trees](https://notes.ethereum.org/@ZuSZK8r2TgO7eFShwj4hVg/H1XE_w30w) to reduce witness sizes.
This document describes a **multi-stage proposal to implement both of these ideas at the same time**. As it turns out, this is significantly _easier_ than doing them in series in either order. State expiry without Verkle trees requires very large witness sizes for proving old state, and switching to Verkle trees without state expiry requires an in-place transition procedure (eg. [EIP 2584](https://eips.ethereum.org/EIPS/eip-2584)) that is almost as complicated as just implementing state expiry. If done at the same time, however, the two reforms solve each other's challenges: state expiry involves creating a new state tree every year, allowing Verkle trees to be phased in over time without an in-place transition, and Verkle trees solve the issues with witness size.
## Links: history of state expiry and statelessness ideas
* **The Stateless Client Concept**, original ethresear.ch post (2017): https://ethresear.ch/t/the-stateless-client-concept/172 (see also [EthHub](https://docs.ethhub.io/ethereum-roadmap/ethereum-2.0/stateless-clients/))
* **State rent (precursor to state expiry)**, original 2015 proposal: https://github.com/ethereum/EIPs/issues/35
* **ReGenesis** (Alexey Akhunov's proposal, can be described as a form of state expiry + history expiry): https://medium.com/@mandrigin/regenesis-explained-97540f457807
* **Verkle trees**: https://notes.ethereum.org/_N1mutVERDKtqGIEYc-Flw
* **Presentation on bounding witness sizes** (YouTube): https://www.youtube.com/watch?v=qQpvkxKso2E
* **A theory of state size management** (Feb 2021): https://hackmd.io/@vbuterin/state_size_management
* **Resurrection-conflict-minimized state bounding**: https://ethresear.ch/t/resurrection-conflict-minimized-state-bounding-take-2/8739
* **A few paths to statelessness and state expiry**: https://hackmd.io/@vbuterin/state_expiry_paths
## Recap: how does state expiry work?
This is a description of the mechanism proposed [here](https://ethresear.ch/t/resurrection-conflict-minimized-state-bounding-take-2/8739), which is also the mechanism proposed in this document. The core idea is that there would be a state tree per period (think: 1 period ~= 1 year), and when a new period begins, an empty state tree is initialized for that period. All writes that happen during a period go into that latest tree, so new trees and old trees may store the same information or even conflict with each other; newer trees always take precedence.
![](https://storage.googleapis.com/ethereum-hackmd/upload_f3fa7ff1e7aee827391c6e4b77a93179.png)
<br><center><small><i>Note that these roughly-year-long state expiry periods have historically sometimes been called "epochs", but I'm switching to the "period" language to avoid confusion with beacon chain epochs.</i></small></center><br>
Two key principles are maintained:
* **Only the most recent tree (ie. the tree corresponding to the current period) can be modified**. All older trees are no longer modifiable; objects in older trees can only be modified by creating copies of them in newer trees, and these copies supersede the older copies.
* **Full nodes (including block proposers) are expected to only hold the most recent two trees, so only objects in the most recent two trees can be read without a witness**. Reading older objects requires providing witnesses.
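The two principles above can be sketched as a toy model. This is only an illustration under stated assumptions: plain Python dicts stand in for the real state trees, and the names `ExpiringState`, `WitnessRequired`, `begin_new_period`, and `revive` are hypothetical, not taken from any client or EIP. The sketch also keeps every tree in memory purely so it can tell an expired key apart from an absent one; a real full node would hold only the two most recent trees.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class WitnessRequired(Exception):
    """Raised when a key is only present in a tree older than the last two."""


@dataclass
class ExpiringState:
    # trees[p] holds the key/value content written during period p.
    # Dicts are a stand-in for real state trees (assumption of this sketch).
    trees: List[Dict[str, int]] = field(default_factory=lambda: [{}])

    def begin_new_period(self) -> None:
        # A new period starts with an empty tree; all older trees are frozen.
        self.trees.append({})

    def write(self, key: str, value: int) -> None:
        # Principle 1: only the most recent tree can be modified.
        self.trees[-1][key] = value

    def read(self, key: str) -> Optional[int]:
        # Principle 2: only the two most recent trees are readable without
        # a witness. Newer trees take precedence, so search newest first.
        for tree in reversed(self.trees[-2:]):
            if key in tree:
                return tree[key]
        # The key may still exist in an expired tree; reading it would
        # require a witness against that tree's root.
        for tree in self.trees[:-2]:
            if key in tree:
                raise WitnessRequired(key)
        return None

    def revive(self, key: str, witnessed_value: int) -> None:
        # Reviving expired state copies it into the current tree.
        # The value is assumed to have been proven by a valid witness already.
        self.write(key, witnessed_value)
```

In this model, a key written in period 0 is still readable in period 1, but once period 2 begins it can only be brought back through `revive`, which places a fresh copy in the current tree that supersedes the old one.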
A "witness" is a short proof that a value, or some set of values, is at some position in a tree; it can be verified by anyone who has only the root of the tree. For example, one could make a witness that proves that storage slot `123` of the account `0x124f...89ab` contains the value `50` in some state, and anyone with the root of that state tree could verify the proof.
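To make the idea concrete, here is a minimal sketch of creating and checking a witness. It deliberately uses a plain binary Merkle tree over SHA-256 rather than the hexary Patricia or Verkle trees discussed in this document, because the principle (prove a leaf against a root using only sibling data) is the same and the code stays short. The helper names, the leaf encoding, and the one-byte hash prefixes are all assumptions of this sketch.

```python
import hashlib
from typing import List


def hash_leaf(data: bytes) -> bytes:
    # Prefix byte keeps leaf hashes distinct from internal node hashes.
    return hashlib.sha256(b"\x00" + data).digest()


def hash_pair(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree: leaf hashes first, [root] last."""
    n = len(leaves)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two leaf count"
    levels = [[hash_leaf(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([hash_pair(prev[i], prev[i + 1])
                       for i in range(0, len(prev), 2)])
    return levels


def make_witness(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Collect the sibling hash at each level on the path from leaf to root."""
    witness = []
    for level in levels[:-1]:
        witness.append(level[index ^ 1])  # sibling shares the same parent
        index //= 2
    return witness


def verify_witness(root: bytes, index: int, leaf: bytes,
                   witness: List[bytes]) -> bool:
    """Recompute the root from the leaf and siblings; needs only the root."""
    node = hash_leaf(leaf)
    for sibling in witness:
        if index % 2 == 0:
            node = hash_pair(node, sibling)
        else:
            node = hash_pair(sibling, node)
        index //= 2
    return node == root


# Hypothetical encoding of four storage slots of one account.
leaves = [f"slot {s} = {v}".encode()
          for s, v in [(120, 7), (121, 0), (122, 19), (123, 50)]]
levels = build_levels(leaves)
root = levels[-1][0]
witness = make_witness(levels, 3)
```

A verifier holding only `root` can call `verify_witness(root, 3, b"slot 123 = 50", witness)` to confirm the claim; the same call with a different value, such as `b"slot 123 = 51"`, fails. The witness here contains one hash per tree level, which is why witness size depends on the tree structure.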
State expiry institutes a hybrid state regime: consensus nodes need to store state that was accessed or modified recently, but can use the witness-based stateless client approach to verify state that is older. That said, it is possible to maintain an "archive node" that stores even historical state trees, _or_ a fully stateless node that uses witnesses to verify even recent state. However, the gas cost structure and the default network formats are built around the assumption that nodes are storing the most recent two state trees.
## The roadmap
The transition roadmap is implemented in stages. The stages are:
* **Period 1 hard fork**: we implement a hard fork that begins period 1 (everything before then is period 0). After this fork, there will be two state trees: the hexary Patricia tree (frozen and no longer editable), and a new Verkle tree (containing all new edits/additions to state, as well as a copy of old state that was accessed).
    * **Proto-EIP**: https://notes.ethereum.org/@vbuterin/verkle_tree_eip
* **Address period expansion**: addresses are extended from 20 to 32 bytes, and the new address format includes a concept of "address periods" (formerly called "address spaces"). This allows new contracts to fill new storage slots without needing to provide witnesses. This can be done at any point before the final state expiry transition, before or after the period 1 hard fork.
    * **VB's proposal**: https://ethereum-magicians.org/t/increasing-address-size-from-20-to-32-bytes/5485
    * **Ipsilon team proposal**: https://notes.ethereum.org/@ipsilon/address-space-extension-exploration
* **Period 2 hard fork**: we implement a hard fork that begins period 2, and schedules the beginnings of future periods. The period 0 hexary Patricia tree is replaced with a Verkle tree, and clients only store the root, so state in the period 0 tree now requires witnesses to prove. After this point, the state expiry scheme has been fully implemented.
    * **Proto-EIP**: https://notes.ethereum.org/@vbuterin/state_expiry_eip