This note attempts to loosely specify a design for a DHT network that satisfies the use cases outlined in the state network use cases forum post.
The network is based on Kademlia.
The network operates on a 256-bit keyspace. Both nodes and content share this keyspace.
A node is an individual network participant, uniquely identified by a node_id: a 256-bit value derived from its public key.
Content on the network is identified by a key K or its content identifier content_id. The content identifier is derived from the key as content_id = hash(K), where hash could be any agreed-upon cryptographically secure hash function.
The null content, i.e. the empty bytestring, is explicitly forbidden in this network.
The distance function for computing the distance between two nodes is the bitwise XOR (exclusive or) operation, as in standard Kademlia.
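A minimal sketch of this metric (assuming plain integer identifiers):

```python
def distance(a: int, b: int) -> int:
    """Kademlia XOR distance between two 256-bit identifiers.

    Because nodes and content share the same keyspace, this works for
    node-to-node, node-to-content, and content-to-content distances.
    """
    return a ^ b
```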
Each node maintains a routing table with 256 buckets, delineating the keyspace by log2(distance).
TODO: expand on this in more detail or link out to an existing description (discv5?)
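As an illustration of the bucket layout (a sketch of the common Kademlia approach, not a settled part of this spec), the bucket index is the position of the highest differing bit:

```python
def bucket_index(local_id: int, remote_id: int) -> int:
    """Routing table bucket (0..255) for remote_id, i.e. floor(log2(distance)).

    Bucket i holds peers whose distance from us lies in [2**i, 2**(i+1)).
    """
    d = local_id ^ remote_id
    if d == 0:
        raise ValueError("a node does not track itself in its routing table")
    return d.bit_length() - 1
```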
Each node maintains information about where content can be found on the network. These advertisements may be specific keys, or could use an ARF to advertise contiguous ranges.
Nodes may populate their advertisement table with any data they choose.
By convention, nodes populate their advertisement table with advertisements for content whose content_id is close to their node_id. The recommended client design is to maintain a fixed-size table anchored to the node's own node_id, evicting the advertisement that is furthest away when the storage limit is reached.

TODO: Advertisements need some sort of TTL.
TODO: Conventions around verifying content is indeed available at the advertised location.
TODO: Analysis of what advertisements would look like for different storage profiles. Broadcasting these advertisements for large amounts of content will require a prohibitively large number of advertisement messages if the advertisement system is not optimized (such as with ARF).
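As a rough sketch of this convention (class and field names hypothetical), an advertisement table with distance-based eviction might look like:

```python
import heapq

class AdvertisementTable:
    """Fixed-capacity advertisement table (hypothetical sketch).

    Keeps the advertisements whose content_id is closest to our node_id,
    evicting the furthest advertisement once capacity is reached.
    """

    def __init__(self, node_id: int, capacity: int) -> None:
        self.node_id = node_id
        self.capacity = capacity
        # Max-heap on distance (via negation): the heap root is always
        # the *furthest* advertisement, i.e. the next eviction candidate.
        self._heap: list = []

    def add(self, content_id: int, location: bytes) -> None:
        entry = (-(content_id ^ self.node_id), content_id, location)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif entry > self._heap[0]:
            # Strictly closer than the current furthest entry: evict it.
            heapq.heapreplace(self._heap, entry)
```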
The network operates on the following conventions.
Nodes with a node_id that is close to the content_id for a piece of content are expected to know where that content can be found. Those nodes are not responsible for hosting the content.
The network makes no assumptions about what content is hosted by what nodes.
Conventions may make it more likely that content is hosted by nodes whose node_id is close to the content's content_id, but this cannot be assumed and will often not be the case.
By convention, nodes would likely maintain three stores of content.

Nodes would allocate a fixed-size store of content whose content_id is closest to their node_id, evicting the content that is furthest away when the storage limit is reached.

By convention, nodes would also allocate a fixed-size memory store that they populate with advertised content which is near their location in the network.
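Putting these conventions together, content retrieval might proceed roughly as follows (a sketch; the client methods are hypothetical stand-ins for the messages defined later in this note):

```python
def retrieve(key: bytes, content_id: int, client) -> bytes:
    # 1. Walk the DHT toward content_id: these nodes are expected to
    #    know *where* the content lives, not to host it themselves.
    closest = client.find_nodes_near(content_id)
    for node in closest:
        # 2. Ask them who is known to host the content.
        for host in client.locate(node, key):
            # 3. Fetch from an advertised host and apply extra-protocol
            #    validation (e.g. against the hash implied by the key).
            data = client.get_content(host, key)
            if data is not None and client.validate(key, data):
                return data
    raise LookupError("content not found")
```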
The keyspace of the network is intended to allow nodes to assign semantic meaning to sections of the keyspace. This has two main implications.
There is extra-protocol validation that may be necessary for nodes to implement. For example, if block headers are stored under the key block-header/<header-hash>, then a node on the network should reject storing content that does not validate against the <header-hash> that is implied by the key.
The network itself makes no assumptions about which key/content pairs are valid; however, for most, if not all, of the content we intend to store, the key will contain additional information that would allow the content to be validated.
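To illustrate the block-header example above (a sketch: the key layout is simplified and sha256 stands in for the still-unspecified hash function; mainnet headers would really be checked with keccak256 over the RLP-encoded header):

```python
import hashlib

def hash_fn(data: bytes) -> bytes:
    # Stand-in for the network's agreed-upon hash function.
    return hashlib.sha256(data).digest()

def validate_block_header(key: bytes, content: bytes) -> bool:
    """Reject content that does not match the <header-hash> implied by the key.

    Assumes a key shaped like b"block-header/<hex-hash>"; the real key
    layout (including the <CP> prefix) is TBD in this note.
    """
    prefix, _, hash_hex = key.rpartition(b"/")
    if prefix != b"block-header":
        raise ValueError("key outside the ranges this validator recognizes")
    return hash_fn(content) == bytes.fromhex(hash_hex.decode())
```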
A client may only recognize certain ranges of the keyspace. The network must be built under the assumption that some nodes may refuse to support keys outside of the ranges they support.
This might mean that we need to give nodes the ability to announce the keyspace they support.
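If nodes did announce supported ranges, the corresponding check could be as simple as the following sketch, where ranges are modeled as key prefixes (an assumption, not part of this spec):

```python
def supports(key: bytes, supported_prefixes: list) -> bool:
    # A node only accepts keys inside the keyspace ranges it announced;
    # here ranges are modeled as simple key prefixes for illustration.
    return any(key.startswith(prefix) for prefix in supported_prefixes)
```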
The details of the networking are TBD. Leading options include:
Nodes on the network operate using the following messages.
Ping

- request_id: unsigned 16-bit integer

Sent by nodes with the expectation of a Pong message being sent in reply. Used to measure latency or to check if a node is still online.
Pong

- request_id: unsigned 16-bit integer (from the Ping message)

Sent by nodes in response to a Ping message.
FindNodes

- request_id: unsigned 16-bit integer
- node_id: 256-bit integer
- distance: 8-bit integer

Sent by nodes to request information about other nodes on the network. Used to populate and maintain the routing table as well as to find nodes on the network that are near a specific location in the network.

Note: we may need to use the raw key K instead of the node_id if we choose to use ARF to advertise whole sections of the keyspace.
FoundNodes

- request_id: unsigned 16-bit integer (from the FindNodes message)
- nodes: Array of 256-bit integers

Note that in a UDP-based network, this message either needs a maximum number of nodes that can be returned or needs to be capable of sending multiple responses with the payload split up across those responses.
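For a sense of scale, assuming discv5's 1280-byte maximum packet size and a hypothetical framing overhead:

```python
PACKET_BUDGET = 1280    # discv5's maximum packet size, assumed here
ENVELOPE_OVERHEAD = 80  # assumed per-message framing overhead
NODE_ID_SIZE = 32       # 256-bit node ids

MAX_NODES_PER_RESPONSE = (PACKET_BUDGET - ENVELOPE_OVERHEAD) // NODE_ID_SIZE
# -> 37 node ids per FoundNodes message under these assumptions; larger
#    results must be split across multiple responses.
```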
Advertise

Sent by nodes to advertise content that they serve. Specifics about how we broadcast advertisements are TBD, but they should take the following into account.

Content likely needs to be advertised by key so that nodes can choose how they wish to handle it based on the semantic meaning implied by the key.
Locate

- request_id: unsigned 16-bit integer
- key: The key of the content to be located.

Sent by nodes to request the node IDs of nodes that are known to host the content indicated by the provided key.
Locations

- request_id: unsigned 16-bit integer (from the Locate message)
- nodes: Array of node IDs that are known to have the content.

Sent in response to a Locate request.
GetContent

- request_id: unsigned 16-bit integer
- key: The key of the content to be retrieved.

Sent by nodes to request a piece of content.
ContentData

- request_id: unsigned 16-bit integer (from the GetContent message)
- data: raw bytes of the content.

Sent in response to a GetContent request. Contains the raw bytes of the content.

Note: If using UDP, this message will need to be able to be spread across multiple responses to accommodate content sizes that exceed the UDP maximum packet size.
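A minimal sketch of such splitting (the chunk and total fields are hypothetical; actual framing is TBD):

```python
def split_content(request_id: int, data: bytes, max_payload: int = 1024) -> list:
    """Split content into numbered ContentData chunks for reassembly.

    Empty content cannot occur here, since the null bytestring is
    explicitly forbidden in this network.
    """
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)]
    return [
        {"request_id": request_id, "chunk": index, "total": len(chunks), "data": chunk}
        for index, chunk in enumerate(chunks)
    ]
```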
eth keyspace

As mentioned above, the keyspace of the network carries semantic meaning. The following keyspaces are defined to support the use cases outlined here.

TBD: For each of these keyspaces, gather data on current total size for current mainnet history as well as projected growth rate. Use this to determine the total number of keys that need to be stored in the network and the rate at which new keys will be added. This data will help inform the amount of data each node in the network would need to be responsible for in order for the network to be "healthy".
<CP> (short for common-prefix)

All keys will begin with a common prefix to establish a namespace for the 1.x network. The exact common prefix is still TBD but is likely to contain some or all of the following:

- eth1: To denote the "Ethereum" use case and specifically the 1.x use case.
- version: To allow "upgrades" to the key semantics used by the network.

<CP>/block/<hash>/header
Content can be verified by computing the expected block hash from the content and checking it against the hash from the key.
<CP>/canonical-header/<number>

Content cannot currently be verified. Verification would require a protocol change to add a header accumulator to the block header.
<CP>/block/<hash>/transactions

<CP>/block/<hash>/transaction/<index>

Content can be verified against the block header's transaction_root field.

Note: The "single" transaction key is not strictly necessary. We should evaluate whether it is justified by the agreed upon use cases.
<CP>/block/<hash>/uncles

Same as "Block Transactions" but with /uncles/ in the key.
<CP>/block/<hash>/receipts

Same as "Block Transactions" but with /receipts/ in the key.
<CP>/canonical-transaction/<hash>
TBD: How would the content be verified and what optimizations can we make to reduce the number of requests needed to fully verify the response data.
TBD
TBD
At this time accounts and contract state are not specified. Further research is needed to understand how this data can be addressed, retrieved, and kept up-to-date efficiently.
Problems that need to be solved here are:

- GetNodeData-style addressing with trie hashes is bad for the same reasons that GetNodeData is bad, enforcing the naive database layout and prohibitive towards a flat database layout.
- Usage profile: eth_getBalance, eth_getTransactionCount, and eth_getCode all exhibit behaviors of granular data access. Clients serving these requests would be hitting the network with requests for very small pieces of data. This profile differs from that of GetNodeData. Need to look into what this usage profile looks like.
- Requests will not be shaped like GetNodeData, so we need to look at how requests would be shaped (trie path?) relative to GetNodeData.

Adaptive Range Filter

Abbreviated to ARF.
See: https://ethresear.ch/t/sharding-state-with-discovery-and-adaptive-range-filter-ads/7573
A data structure similar to bloom filters but for checking whether a specific range is included. Potentially useful for a more efficient advertisement mechanism for nodes that serve entire ranges of the keyspace.
A distributed data structure which allows for range queries over the keyspace in a DHT. Potentially useful for efficient retrieval of entire ranges of the keyspace.