# Ethereum State Availability
## Base Concepts and Terminology
### Skip Graphs
https://en.wikipedia.org/wiki/Skip_graph
A skip graph is a distributed data structure that gives a DHT network a mechanism for maintaining a total ordering over a set of keys. The nodes of the graph are spread across the participating network nodes, and traversal of the graph works much like a binary search, allowing participants to quickly find the nodes adjacent to a specific key.
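To make the search behaviour concrete, here is a minimal, illustrative sketch of skip-graph-style lookup. It assumes each node already knows its left/right neighbors at every level; construction, membership vectors, and repair are all omitted, and the names are mine rather than from any specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    key: int
    # neighbors per level; level 0 is the fully ordered base list
    left: List[Optional["Node"]] = field(default_factory=list)
    right: List[Optional["Node"]] = field(default_factory=list)


def search(start: Node, target: int) -> Node:
    """Return the node with the greatest key <= target (or the smallest node
    if none), starting from any node in the graph."""
    node = start
    level = len(node.right) - 1
    while level >= 0:
        if node.key <= target:
            # walk right as far as possible without overshooting the target
            while node.right[level] is not None and node.right[level].key <= target:
                node = node.right[level]
        else:
            # walk left until we are at or below the target
            while node.left[level] is not None and node.key > target:
                node = node.left[level]
        level -= 1
    return node


# Toy 4-node graph: level 0 links every node, level 1 links alternating nodes.
a, b, c, d = Node(10), Node(20), Node(30), Node(40)
a.left, a.right = [None, None], [b, c]
b.left, b.right = [a, None], [c, d]
c.left, c.right = [b, a], [d, None]
d.left, d.right = [c, b], [None, None]
assert search(b, 35).key == 30
```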
### Trie Tiling
The rough concept is to divide the leaves of a trie into contiguous chunks. These chunks are referred to as "tiles".
I use the term "naive" tiling to mean a tiling scheme under which you can determine which tile a specific piece of trie data belongs in without having the full trie available. For example, we could tile the current Hexary Patricia Trie into 16 very large tiles simply by splitting on the first nibble: the first tile would contain every leaf whose key begins with `0x0`, and the 11th tile would contain every leaf whose key begins with `0xa`.
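A sketch of why this scheme is "naive": any leaf key can be assigned to its tile without consulting the trie at all.

```python
def tile_for_key(trie_key: bytes) -> int:
    """Map a leaf key to one of 16 tiles using only its first nibble."""
    return trie_key[0] >> 4


assert tile_for_key(bytes.fromhex("0a" + "00" * 31)) == 0x0
assert tile_for_key(bytes.fromhex("ab" + "00" * 31)) == 0xa
```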
Other, non-naive tiling schemes might be something like each tile containing a fixed number of accounts. Such a scheme would require you to iterate through the trie from a common starting location, placing each contiguous set of `N` accounts into a tile. This simple approach is problematic because the insertion of a new account into a tile located at the start of the trie would result in all subsequent tiles needing to be updated.
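A quick sketch of that problem, assuming a hypothetical fixed-size scheme that groups every `N` sorted keys into a tile: a single insertion near the front of the trie changes the contents of every subsequent tile.

```python
def fixed_size_tiles(sorted_keys, n):
    """Group every contiguous run of n sorted keys into a tile."""
    return [sorted_keys[i:i + n] for i in range(0, len(sorted_keys), n)]


keys = list(range(0, 90, 10))                      # 9 keys -> 3 tiles of 3
before = fixed_size_tiles(keys, 3)
after = fixed_size_tiles(sorted(keys + [5]), 3)    # insert one key near the front
# Every pre-existing tile now holds a different set of keys.
assert all(b != a for b, a in zip(before, after))
```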
The difficulty with tiling is determining a scheme that reduces the number of tiles that need to be updated with each successive block, while still bundling the leaves into manageable chunks.
It is also possible that we don't need a fully agreed-upon scheme to still take advantage of tiling.
### Tiling + Skip Graphs
Assuming you have a tiling scheme that works but potentially isn't "naive", you need a mechanism for traversing the tile keyspace to find the tile that contains the data you wish to retrieve. Each tile would have a key `K`. A simple scheme would be for `K` to be the deepest path into the trie that contains all of the accounts for a given tile.
In order to find specific leaf data, we need to know what tile it is contained in. A skip graph of the tile keys allows fast traversal of the tile keyspace to find the tile key which will contain the desired account.
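Locally, this lookup reduces to an ordered-key search. The sketch below uses `bisect` over an in-memory list of illustrative tile keys as a stand-in for the distributed skip graph traversal.

```python
import bisect

tile_keys = [b"\x00", b"\x20", b"\x80", b"\xc0"]   # illustrative tile boundaries


def tile_containing(account_path: bytes) -> bytes:
    """Return the key of the tile whose range covers the given trie path."""
    index = bisect.bisect_right(tile_keys, account_path) - 1
    return tile_keys[max(index, 0)]


assert tile_containing(bytes.fromhex("ab" + "00" * 31)) == b"\x80"
```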
## Accounts vs Contract Storage vs Code
Dealing with the Ethereum "state" means accounting for the following three things.
- Accounts
    - `balance`
    - `nonce`
    - `code_hash`
    - `storage_root`
- Code
    - The actual bytecode referenced by `Account.code_hash`
- Contract Storage
    - The actual trie data referenced by `Account.storage_root`
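For reference, a sketch of the account record that the trie leaves hold (field names follow the usual yellow-paper-style naming):

```python
from dataclasses import dataclass


@dataclass
class Account:
    nonce: int
    balance: int
    storage_root: bytes   # root hash of the contract's storage trie
    code_hash: bytes      # keccak256 of the contract's bytecode
```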
Historically, attempts to build a cohesive mechanism to handle all of these have not yielded results. The first piece of intuition that I've formed is that the correct solution might deal with each of these via separate mechanisms.
### Accounts
The account trie is naturally balanced. This property makes it possible for naive participants to apply a sharding scheme which will *probabilistically* result in evenly sized chunks. This has previously been explored under the terminology of "tiling", where we refer to a contiguous "chunk" of the trie as a tile.
Under this approach, each tile would contain `~N` accounts and would be the base unit under which the account data for each account would be stored. Each tile would have one or more accompanying inclusion proofs against recent state roots. Keeping these proofs up to date could be done in a decentralized way by applying block witnesses or in a more centralized manner by relying on an external mechanism to push the updated tiles into the network after each new block.
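A sketch of what a published tile might carry under this approach: a contiguous slice of accounts plus inclusion proofs against one or more recent state roots. The shapes here are assumptions for illustration, not a wire format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Tile:
    key: bytes                        # deepest trie path shared by the tile's accounts
    accounts: Dict[bytes, bytes]      # address hash -> RLP-encoded account record
    proofs: Dict[bytes, List[bytes]]  # state root -> trie nodes proving inclusion
```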
Previously explored solutions to this are likely fruitful to pursue:
1. Tiling
2. SkipGraph for tile discovery if a naive approach to tiling cannot be devised.
3. Necessary network throughput to have an outside actor continuously pipe in the latest tiles for the latest state roots.
#### Simple NodeID based Tiling
Here is a simple scheme:
Each node on the network would determine their own storage capacity. Nodes would use their `node_id` to determine the path into the account trie under which they will store account data. Then, they would collect the accounts around that path into the trie until they've filled their storage capacity.
Under this scheme, when looking for an account under a given path, you would simply use the FINDNODES mechanism to find the nodes whose `node_id`s are closest to the desired trie path.
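A sketch of the lookup this implies, assuming node IDs and trie paths are both treated as 256-bit values and "closest" means XOR distance as in Kademlia-style DHTs (that metric is an assumption here, not a spec):

```python
def distance(node_id: bytes, trie_path: bytes) -> int:
    """XOR distance between a node ID and a trie path, both 32 bytes."""
    return int.from_bytes(node_id, "big") ^ int.from_bytes(trie_path, "big")


def closest_nodes(node_ids, trie_path, count=3):
    """Pick the nodes most likely to store accounts around `trie_path`."""
    return sorted(node_ids, key=lambda nid: distance(nid, trie_path))[:count]
```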
This scheme is potentially problematic as it is subject to attacks where someone mines node IDs near a certain account in the trie to disrupt access to that account's data. The advertisement approach from Alexandria can likely be used to mitigate this by adding a layer of indirection.
### Code
Code is probably the easiest to handle. We can probably host it directly in Alexandria or via an equivalent mechanism where code is deduplicated and referenced by its hash (or merkle root).
Code doesn't need to be updated and should be roughly equivalent to how Blocks/Headers/Receipts are stored in Alexandria.
Code should be very straightforward.
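A minimal sketch of the content-addressed storage this implies: code is deduplicated and keyed by its hash, so retrieval is a single lookup by `Account.code_hash`. (`hashlib.sha256` is only a dependency-free stand-in here; mainnet code hashes are keccak256.)

```python
import hashlib

code_store = {}


def put_code(bytecode: bytes) -> bytes:
    """Store bytecode under its hash; identical code deduplicates to one entry."""
    key = hashlib.sha256(bytecode).digest()   # stand-in for keccak256
    code_store[key] = bytecode
    return key


def get_code(code_hash: bytes) -> bytes:
    return code_store[code_hash]
```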
### Contract Storage
Contract storage remains the complex one. These tries can be very large, and they have no mechanism forcing them towards being well balanced. This appears to make naive sharding of these tries infeasible.
The total size of each contract's storage also varies widely. For many contracts, it may be reasonable to store the whole state in one place. At the far other end of the spectrum there are contracts with storage sizes in the GB range. These will need to be "tiled" in some way, splitting them into smaller chunks.
Contract storage also differs from the main account trie in that updates are both smaller in magnitude and less frequent. Some contracts will have very regular changes, but those changes are small. Some contracts will rarely change storage at all. Rarely will a significant amount of a contract's storage change in a single block.
Self-Destructs must also be dealt with.
The most promising solutions for this will likely be similar to how we deal with accounts. The unbalanced nature of these tries likely requires a different approach to tiling. Since whatever tiling scheme is chosen for contract storage likely cannot be "naive", a skip graph over each contract's storage tiles can serve a critical role in enabling traversal.
## Supported Use Cases
We should have a clear definition of what this network is used for.
### Sync
Probably Not. This network isn't optimized for bulk retrieval. It might turn out to be possible, but I don't think it's a use case that has to influence design decisions.
### On-Demand Access
Things like `eth_call` require on-demand access to *random* parts of the state. This is a use case we explicitly support.
### Re-Genesis
There is probably a related use case in Re-Genesis. The *cold* state is static, and is needed to construct transactions. Since it is *static*, serving it becomes much easier because you don't need to constantly keep it up-to-date.
## Unsolved Stuff
Here's a dump of the parts of this that still need figuring out.
### Update Rates
We need to look at both the account trie, and the individual storage tries to get an understanding of the update rate.
> "How many accounts are touched how often."
Understanding things like average and 99th-percentile rates maps directly onto understanding how many "tiles" would need to be updated per block.
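The measurement itself is simple once a tiling scheme is fixed. A sketch, using first-nibble tiles purely for illustration: given the account keys touched in a block, count the distinct tiles that would need to be republished.

```python
def tiles_touched(touched_keys) -> int:
    """Count distinct first-nibble tiles affected by a set of touched account keys."""
    return len({key[0] >> 4 for key in touched_keys})


touched = [
    bytes.fromhex("a1" + "00" * 31),
    bytes.fromhex("a9" + "00" * 31),
    bytes.fromhex("03" + "00" * 31),
]
assert tiles_touched(touched) == 2   # tiles 0xa and 0x0
```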
### Data Footprint
How big should tiles be? How many total tiles does it take to cover the entire state? How many tiles would each node in the network need to maintain?
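The arithmetic is straightforward once the inputs are known; the sketch below is a back-of-the-envelope template where every number is an assumption chosen only to show the shape of the math, not a measurement.

```python
TOTAL_ACCOUNTS = 150_000_000   # assumed size of the account set
ACCOUNTS_PER_TILE = 1_000      # assumed tile size
REPLICATION = 5                # assumed copies of each tile across the network
NODE_COUNT = 10_000            # assumed number of participating nodes

total_tiles = TOTAL_ACCOUNTS // ACCOUNTS_PER_TILE          # 150,000 tiles
tiles_per_node = total_tiles * REPLICATION / NODE_COUNT    # 75 tiles per node
```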
### Skip Graph Hardening
The skip graph data structure needs to be better understood. There are complexities that arise in repairing the graph. Initializing the graph is also tricky: you need bootnodes to introduce you into it. It's possible for multiple disparate graphs to form, and clients should be able to handle merging them (this is probably related to "repair").
### State Root Anchoring
Each "tile" should have a proof anchoring it to a state root. Do tile owners maintain multiple versions of the tile for recent state roots? When a tile doesn't change, the corresponding *proof* still needs to be updated if they are going to support recent state roots. How hard is it for clients to keep their proofs up-to-date.