# "Merry Go Round" sync

## Motivation

Here we define a sync algorithm with the following goals and properties.

### BitTorrent-style swarming

The network should be healthy even in the face of a small number of full nodes serving an arbitrarily large number of syncing nodes. A syncing node which is only connected to other syncing nodes should still be highly performant.

### Limit protocol abuse

We aim for a protocol that mitigates unintended use cases, such as a stateless client attempting to dynamically fetch state data to serve an `eth_call` JSON-RPC request.

### Efficiency

The protocol should transmit trie data efficiently in bulk while keeping the transmission of intermediate tree nodes to a minimum.

## Terminology

### Sync Epoch

We define *Sync Epoch*, or more concisely an *Epoch*, to be the span of blocks during which a certain part of the state trie is actively being synced. The constant `EPOCH_SIZE` represents the number of blocks in an epoch.

We define *Epoch Boundary* to be any block which satisfies the condition `Block.number % EPOCH_SIZE == 0`.

### Prefix Ranges

We treat the main account state trie and all of the contract storage tries as a single tree with a 65 byte keyspace. The first 32 bytes of a key represent the path into the main account state trie. The 33rd byte namespaces the sub-tries, currently only supporting the state trie. The last 32 bytes of a key represent the path into the contract's storage.

```
ACCOUNT_TRIE_PREFIX{1,32}
ACCOUNT_TRIE_KEY{32} [+ 0x00 + STORAGE_TREE_PREFIX{1,32}]
```

We define the term *Prefix* to mean a key which has had some of its trailing bits truncated. We use prefixes to refer to contiguous ranges of the state trie. A prefix covers a leaf of the tree if the leaf's path in the tree begins with the given prefix.

TODO: once we have merklized code we will need a mechanism to differentiate between storage and merklized code.

Nodes use prefixes to communicate which sections of the tree they have.
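
As a rough illustration of the coverage rule, the sketch below treats keys and prefixes as strings of bits. The helper names `to_bits`, `truncate` and `covers` are hypothetical and not part of the protocol, and the bit-string representation is an assumption made purely for readability.

```python
def to_bits(key: bytes) -> str:
    """Render a key as a string of '0'/'1' characters (illustrative encoding)."""
    return "".join(f"{byte:08b}" for byte in key)


def truncate(key: bytes, keep_bits: int) -> str:
    """Build a prefix by dropping all but the first `keep_bits` bits of a key."""
    return to_bits(key)[:keep_bits]


def covers(prefix: str, leaf_key: bytes) -> bool:
    """A prefix covers a leaf if the leaf's path begins with the prefix."""
    return to_bits(leaf_key).startswith(prefix)


# Hypothetical 65-byte keys: 32-byte account path, one namespace byte,
# and a 32-byte storage path (all zero here except the first byte).
leaf_a = bytes([0xAB]) + bytes(64)
leaf_b = bytes([0xA0]) + bytes(64)

short_prefix = truncate(leaf_a, 4)  # keeps only the first four bits of leaf_a
long_prefix = truncate(leaf_a, 8)   # keeps the whole first byte of leaf_a
```

In this example `short_prefix` covers both leaves because they share their first four bits, while `long_prefix` covers only `leaf_a`. The empty prefix covers every key.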
To advertise what it has, a node constructs the minimal set of prefixes that covers the sections of the tree it has fully synced.

TODO: examples of prefixes and what they cover

### Hot Spot

The *Hot Spot* is a path in the state trie from where clients will sync the state. A hot spot is active for the duration of an epoch. The protocol makes no concrete rules about what state should be accessible for a given hot spot, but the general rule is that state which is too distant from the hot spot is not likely to be available.

The path for a hot spot is determined from the block hash of the boundary block in a given epoch. We then expand this out to a full 65 byte key with the latter 33 bytes set to zero.

## Synchronization Algorithm

Nodes continually keep each other up to date with which sections of the tree they have fully synced. A fully synced client would transmit an array containing the empty string, `['']`, to denote that they have a full copy of the state trie. A client with an empty database would transmit an empty list to denote that they have no data in their trie.

As clients complete new sections of the tree they should occasionally send their connected peers an updated prefix list.

At the beginning of each epoch we use the 32 byte hash of the boundary block to determine the current hot spot. The full path for the hot spot is defined as the 65 byte key whose first 32 bytes are the block hash and whose remaining 33 bytes are set to zero.

A client would then construct a proof which covers the section of the tree around the hot spot. The client then iterates through its connected peers, transmitting the chunks of the proof that each peer does not have.

TODO: deterministic chunking so that multiple seeders seed the same data.

> TODO: This section needs more detail and thought put into the proof production.
> Specifically, what is the algorithm for constructing a witness for a peer, given the set of prefixes they have broadcast to indicate which parts of the trie they already have?

### Proof Availability

## Protocol

### `Announce`: `0x00`

- Probably include forkid
- List of prefixes (with some reasonable max size)
- Maybe include a flag for whether the node is interested in proofs. This would allow a node with an incomplete state database to still act as a seeder.

### `ProofsAvailable`: `0x01`

A client serving proofs would advertise the availability of a proof as a 2-tuple of `[state_root, prefix]`.

- Probably anchor to a specific block hash (epoch)
- Probably specify a prefix which the advertised proof will cover

### `GetProofs`: `0x02`

A client requesting a proof asks for it using a 2-tuple of `[state_root, prefix]`. The server checks that the `prefix` falls under one of the prefixes it advertised in a previous `ProofsAvailable` message. The requested `prefix` may be longer than the advertised prefix to specify a more precise part of the tree.

> TODO: In order to allow omission of intermediate tree data that the requester already has, maybe this should have a 3rd parameter that indicates where in the tree the proof should begin.

- Request id
- Request proofs that were advertised by a `ProofsAvailable` message.

### `Proofs`: `0x03`

The response to a `GetProofs` request. Contains one or more complete proofs satisfying the request.

> TODO: Make sure we support the case where a server advertises a very broad prefix and the client requests the whole prefix, which is too large to transmit in a single chunk. The server should then construct a depth-first proof from the given prefix that covers at least one leaf.
> Upon receiving a partial (but provable) response to their request, the client can then re-request using a new, more precise prefix.

- Request id (from the corresponding `GetProofs` request)
- The proofs

## Paths not taken

### Fully push centric protocol

Originally, proofs would just be pushed to those that need them. This proves problematic in many cases. The simplest is a node with an empty database connected to multiple nodes with the data for the current hotspot. In a pure push-based model the syncing node would receive duplicate proofs for the same data from its connected peers. By using a pull model, the syncing peer can distribute requests across its connected peers to balance out requests for the current hotspot across multiple nodes.

### Fully pull based protocol

The protocol *could* just have `GetProofs` and `Proofs` messages. This proved problematic because it places *implicit* requirements on those serving data. An *overloaded* full node operating in this context would have to either not respond, or issue empty responses to requests for proofs. This could result in clients over-requesting data from their peers, since they have fewer guarantees about what data a given peer will respond with. Adding the `ProofsAvailable` message allows those serving data to explicitly broadcast what parts of the state they are willing to serve to any given peer.

### Hard restrictions on state access far away from the hotspot

The protocol intentionally doesn't put any firm rules on how close to the hotspot data must be to be available within the current epoch. This allows the protocol to self-adjust based on the overall size of the tree. For very small chains, the entire tree could be served in a single epoch. For extremely large trees, servers can choose to heavily limit state availability to the data that is very close to the current hotspot.
In addition, this allows clients which are very close to having synced the entire tree to finish syncing even when the hotspot is not close to the data they are missing. This would still require a node with that data to be willing to offer it.

### No fixed cycle length

You may notice that the use of block hashes results in there being no defined cycle length for full coverage of the state. This is intentional. Knowledge of the state size is both imprecise and only available to those with a full copy. Similarly, the size of the state changes over time and across different chains. Additionally, nodes have different amounts of available bandwidth and processing power. These variables mean that we cannot define a number of epochs that will work for any chain size, nor can we use the known state size to derive the number of epochs, since different clients sync at different rates.

Using the block hash to define hot spots, and giving servers the flexibility to decide how much of the state to serve around any given hotspot, allows the protocol to scale smoothly across both different sized states and clients with different amounts of bandwidth and processing power.
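
The way a block hash turns into a hot spot can be sketched as follows. This is a minimal illustration rather than a specified algorithm: the value chosen for `EPOCH_SIZE` is an assumption (this document does not fix one), the helper names are hypothetical, and `epoch_boundary_for` assumes that the boundary block is the one that opens its epoch.

```python
EPOCH_SIZE = 256   # assumed value for illustration only; not fixed by this document
KEY_LENGTH = 65    # 32-byte account path + 1 namespace byte + 32-byte storage path
HASH_LENGTH = 32


def is_epoch_boundary(block_number: int) -> bool:
    """True for blocks satisfying Block.number % EPOCH_SIZE == 0."""
    return block_number % EPOCH_SIZE == 0


def epoch_boundary_for(block_number: int) -> int:
    """Number of the boundary block for the epoch containing `block_number`.

    Assumes the boundary block is the first block of its epoch.
    """
    return block_number - (block_number % EPOCH_SIZE)


def hot_spot_key(boundary_block_hash: bytes) -> bytes:
    """Expand a 32-byte block hash to a 65-byte key, zero-filling the last 33 bytes."""
    if len(boundary_block_hash) != HASH_LENGTH:
        raise ValueError("expected a 32-byte block hash")
    return boundary_block_hash + bytes(KEY_LENGTH - HASH_LENGTH)
```

Because block hashes are effectively unpredictable, successive epochs land on unrelated parts of the keyspace, which is why no cycle length can be stated in advance.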
