Want to get started with Eth2 protocol things? This is the place! There are lots of interesting problems to work on, even with only a surface level understanding of Eth2.
All of these challenges are generally good projects to learn more, and start to BUIDL on Eth2.
If you join us at one of the awesome new online hackathons (EthGlobal, Metacartel), you can even earn a prize for completing a mini project!
Or, if you just feel like hacking, we are always available to help you in the Eth2 Discord!
It may be a lot to understand the full Eth2 picture, but learning it on a surface level is already a great start! You can generally get started with challenges after the TLDR.
If you are looking for a specific client team or people to help you get into Eth2 with a specific language, try these:
And the Eth2 Discord is the best common place to ask questions and find others working on Eth2 projects!
These challenges help you become familiar with different topics of Eth2 Phase0, while building a fun Eth2 app. There is no better way to learn Phase0: start hacking on things already!
Each of the challenges provides some background with useful links to understand the context of the challenge, and what is there for you to build with.
The background info should help a lot, but if a term is not obvious from the context, don’t feel afraid to ask for help!
Learn how Libp2p GossipSub is used by Eth2 clients, and which information propagates through the Eth2 gossip network.
Eth2 spec about “The Gossip Domain”: https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/p2p-interface.md#the-gossip-domain-gossipsub
Nodes communicate “operations” (think: transactions) via GossipSub on the network.
GossipSub spec github Doc: https://github.com/libp2p/specs/tree/master/pubsub/gossipsub
GossipSub paper (with diagrams): https://research.protocol.ai/posts/201912-resnetlab-launch/PL-TechRep-gossipsub-v0.1-Dec30.pdf
Prysm testnet has been hurt by different Gossip and pooling usage bugs;
Networking REPL: https://github.com/protolambda/rumor
Like a block explorer, but only showing the most recent data: gossiped attestations, blocks and other operations.
To start off, use the networking REPL to join the Prysm testnet and log their blocks. Follow the commands in the README.
GossipSub is available in many different languages, so if you do not want to use the REPL gossip-logging (too easy?), you can use any of the following implementations:
Once you have established an input/log of gossip messages, you can build a dashboard to show them as they come in. Maybe even visualize them; blocks and attestations show how the chain is evolving.
If this works, you can also think of indexing the events: enable others to look up blocks, like a mini block-explorer!
Also take a look at the fork-choice client challenge if you want to work with others. Using the realtime attestation data to derive the chain head would be a very nice feature!
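As a starting point for the dashboard and mini block-explorer ideas above, here is a minimal sketch of an in-memory index of recent gossip events. Everything in it is an assumption for illustration: the `GossipEvent` fields, the topic names and the `RecentEventIndex` class are hypothetical, and how you feed it depends on the gossip tooling you picked.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GossipEvent:
    # Hypothetical event shape; adapt to whatever your gossip logger outputs.
    topic: str      # e.g. "beacon_block" or "beacon_attestation" (illustrative)
    slot: int
    root: str       # hex-encoded root identifying the message
    payload: bytes  # raw message bytes as received


class RecentEventIndex:
    """Keeps the most recent `capacity` events, with lookups by root and slot."""

    def __init__(self, capacity: int = 1024) -> None:
        self.capacity = capacity
        self._by_root: "OrderedDict[str, GossipEvent]" = OrderedDict()

    def add(self, event: GossipEvent) -> bool:
        """Insert an event. Returns False for duplicates, which gossip produces often."""
        if event.root in self._by_root:
            return False
        self._by_root[event.root] = event
        if len(self._by_root) > self.capacity:
            self._by_root.popitem(last=False)  # evict the oldest entry
        return True

    def get(self, root: str) -> Optional[GossipEvent]:
        return self._by_root.get(root)

    def by_slot(self, slot: int) -> List[GossipEvent]:
        return [e for e in self._by_root.values() if e.slot == slot]
```

A dashboard can call `add` for every incoming message and render `by_slot` for the latest slots; a persistent store would replace the `OrderedDict` once you want history.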
Learn how the deposit contract works, and what the Eth1 -> Eth2 transition looks like for a validator. This is a great challenge if you want to learn about Eth2, but want to stay focused on web, dApp, and Eth1 tooling.
The Eth2 phase0 validators get into the system through a deposit from Eth1.
For this to keep functioning after the Eth2 genesis event, the Eth1 chain has to be kept track of.
Block proposers vote on Eth1 data, and with long voting periods the deposit-contract reference is updated, and new deposits are allowed in.
Validator lifecycle: https://notes.ethereum.org/@hww/lifecycle
Deposit contract spec: https://github.com/ethereum/eth2.0-specs/tree/dev/deposit_contract
Prysm testnet deposit contract: goerli.etherscan.io/address/0x4689a3C63CE249355C8a573B5974db21D2d1b8Ef (Goerli testnet)
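To make the Eth1 data voting more concrete, here is a minimal sketch of the majority rule: every block carries one vote, and a vote is adopted once it has more than half of the slots of the voting period behind it. The class and function names are hypothetical, and the period length is a parameter because the exact constant depends on the spec version and testnet configuration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Eth1Data:
    deposit_root: str   # hex root of the deposit tree
    deposit_count: int  # number of deposits in the contract at that Eth1 block
    block_hash: str     # hex hash of the Eth1 block voted for


@dataclass
class VotingState:
    # Hypothetical, stripped-down stand-in for the relevant beacon state fields.
    eth1_data: Eth1Data
    votes: List[Eth1Data] = field(default_factory=list)


def process_eth1_vote(state: VotingState, vote: Eth1Data,
                      slots_per_voting_period: int) -> None:
    """Record one proposer vote; adopt it once it holds a strict majority."""
    state.votes.append(vote)
    if state.votes.count(vote) * 2 > slots_per_voting_period:
        state.eth1_data = vote


def reset_votes(state: VotingState) -> None:
    """Votes are cleared when a new voting period starts."""
    state.votes.clear()
```

With a period of 8 slots, four identical votes are not enough; the fifth one updates `state.eth1_data`.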
For the more web, Dapp or Eth1 focused developers this definitely is a more accessible challenge. With Eth1 tooling, and some Eth2 tooling from Lodestar or the py-spec, you can read and understand the deposit contract interactions.
Lodestar tooling: https://hackmd.io/@wemeetagain/rJTEOdqPS/%2FCcsWTnvRS_eiLUajr3gi9g
Using py-spec as a library: https://github.com/ethereum/eth2.0-specs/pull/1584
Or py-spec SSZ: github.com/protolambda/remerkleable
Start off by retrieving deposit logs from the Prysm deposit contract with your favorite Web3 library (if in doubt, ethers.js and web3.py are great).
Now that you can see the deposits on the Eth1 side, it is time to find them on the Eth2 side.
With the API of Prysm, you can extract blocks with their respective Eth1 votes. Prysm has a public API RPC endpoint (Docs here), in case you do not want to participate in the testnet yourself yet.
What you can now try to do is to query beacon-blocks, parse them with the javascript or python tooling (or any SSZ library), and check the contents:
Combine the two deposit information sources, and you can hack together an Eth1-deposit focused monitoring service!
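A minimal sketch of that combination step, assuming you already have the deposit indices from the Eth1 logs and two counters from the beacon state: how many deposits have been processed (`eth1_deposit_index` in the Phase0 state) and how many the adopted Eth1 vote covers (`eth1_data.deposit_count`). The function names and the result labels are hypothetical. The decoding helper relies on the deposit contract emitting `amount` and `index` as 8-byte little-endian values.

```python
from typing import Dict, Iterable, List


def decode_uint64_le(data: bytes) -> int:
    """Decode an 8-byte little-endian field from a deposit event (amount, index)."""
    if len(data) != 8:
        raise ValueError("expected exactly 8 bytes")
    return int.from_bytes(data, "little")


def classify_deposits(log_indices: Iterable[int],
                      processed_count: int,
                      known_count: int) -> Dict[str, List[int]]:
    """Split Eth1 deposit indices by how far they got on the Eth2 side.

    processed_count: deposits already processed by the beacon chain.
    known_count:     deposits covered by the currently adopted Eth1 vote.
    """
    if processed_count > known_count:
        raise ValueError("processed_count cannot exceed known_count")
    result: Dict[str, List[int]] = {"processed": [], "pending": [], "unknown": []}
    for index in sorted(log_indices):
        if index < processed_count:
            result["processed"].append(index)
        elif index < known_count:
            result["pending"].append(index)   # voted in, waiting for inclusion
        else:
            result["unknown"].append(index)   # not yet covered by an Eth1 vote
    return result
```

A monitoring service could run this on every new block and alert when a deposit stays in `unknown` or `pending` for too long.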
If you would like to get some practical understanding of POS fork-choice in Eth2, and want to give your own approach a try, this is the challenge. You learn how GHOST works, what “LMD” means, and how it connects together into the Eth2 fork-choice.
LMD-GHOST, a.k.a. latest-message-driven greedy-heaviest-observed-subtree. You can find a good explanation here: https://vitalik.ca/general/2018/12/05/cbc_casper.html
Collection of LMD-GHOST implementations for Eth2: https://github.com/protolambda/lmd-ghost/
LMD-GHOST proto-array; optimized design, recently implemented by Lighthouse and Prysmatic. Stateful, simple representation, fast lookups, fast batched propagation of state changes. Python, Rust
Advanced: if you want to dive really deep, take a look at this paper: https://arxiv.org/abs/2003.03052
An Eth2 “client”, but not an active participant or dealing with storage, only following the fork-choice. Or in other words, a “readonly client”. Use the network tooling to stay in sync with minimal dependencies!
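To get a feel for the rule before looking at the optimized designs, here is a deliberately naive LMD-GHOST sketch. It is not proto-array and not the spec code: the data layout (a `parents` map, one latest message per validator, a balance per validator) and all function names are assumptions for illustration. Starting from a given root, it repeatedly picks the child whose subtree holds the most latest-message weight, breaking ties by root.

```python
from typing import Dict, List, Optional


def is_descendant(parents: Dict[str, Optional[str]],
                  root: Optional[str], ancestor: str) -> bool:
    """Walk up from `root`; True if `ancestor` is on the way (or is `root`)."""
    while root is not None:
        if root == ancestor:
            return True
        root = parents.get(root)
    return False


def subtree_weight(parents: Dict[str, Optional[str]],
                   latest_messages: Dict[int, str],
                   balances: Dict[int, int],
                   root: str) -> int:
    """Sum the balances of validators whose latest vote is in the subtree of `root`."""
    return sum(balances[v] for v, voted in latest_messages.items()
               if is_descendant(parents, voted, root))


def get_head(parents: Dict[str, Optional[str]],
             latest_messages: Dict[int, str],
             balances: Dict[int, int],
             start: str) -> str:
    """Greedily descend from `start` into the heaviest observed subtree."""
    children: Dict[str, List[str]] = {}
    for child, parent in parents.items():
        if parent is not None:
            children.setdefault(parent, []).append(child)
    head = start
    while head in children:
        head = max(children[head],
                   key=lambda c: (subtree_weight(parents, latest_messages, balances, c), c))
    return head
```

This recomputes weights from scratch at every step, which is exactly the cost the proto-array design avoids. It is still handy as a reference to test a faster implementation against.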
If you have some interest in running an Eth2 validator node, this is the challenge to get your hands dirty with. Learn how to run a validator node, and how its API works.
Into the rabbit hole of Beacon Node <> Validator client separation…
To run thousands of validators, block production has to be managed, and signatures have to be managed.
The validator client notifies a beacon node of its presence, and its validator IDs
The beacon-node tells the validator about its beacon-chain duties; when to propose, committee membership, etc.
The beacon-node produces an unsigned block on validator proposals
The beacon-node produces attestation data to sign
The validator client holds the signing key, and protects itself from making double signatures
Validator API: https://github.com/ethereum/eth2.0-APIs/blob/master/apis/validator/README.md
For a practical example where load-balancing was misunderstood, and things went wrong, see this Prysm testnet event: https://twitter.com/raulitojordan/status/1242176262904430592
When running two validator nodes, you really do not want the nodes to publish conflicting votes!
Adapt the validator API to:
You can scale up just the first point, or also implement the second, to challenge yourself and implement redundancy with double-vote protection.
No double voting or it gets slashed!
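A minimal sketch of the double-vote protection a validator client could apply before signing, assuming a conservative rule: only sign an attestation if its source epoch does not go backwards and its target epoch is strictly higher than anything signed before, and only sign a block for a slot higher than the last signed one. The `SlashingGuard` class is hypothetical and keeps its records in memory; a real setup needs this state persisted and shared by every node that can sign for the same key.

```python
from typing import Dict


class SlashingGuard:
    """Refuses any signature that could conflict with an earlier one."""

    def __init__(self) -> None:
        self._max_source: Dict[str, int] = {}
        self._max_target: Dict[str, int] = {}
        self._max_block_slot: Dict[str, int] = {}

    def try_sign_attestation(self, validator: str, source: int, target: int) -> bool:
        """Record and allow the vote only if it cannot be a double or surround vote."""
        if source > target:
            return False  # malformed vote
        if source < self._max_source.get(validator, -1):
            return False  # a lower source could surround an earlier vote
        if target <= self._max_target.get(validator, -1):
            return False  # same target is a double vote, lower could be surrounded
        self._max_source[validator] = source
        self._max_target[validator] = target
        return True

    def try_sign_block(self, validator: str, slot: int) -> bool:
        """Record and allow the proposal only for a strictly newer slot."""
        if slot <= self._max_block_slot.get(validator, -1):
            return False
        self._max_block_slot[validator] = slot
        return True
```

The rule is stricter than necessary, so it may refuse some votes that would have been safe. For redundancy across two validator nodes, both would have to consult the same guard before signing.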
Why?
Like Python? Want to learn more about running the spec, yet prefer the speed of a client like Lighthouse? Try this: learn about optimization efforts so far, and utilize tooling to script sync. Combine the best of both worlds. No uncontrollable big client, but as powerful, with just some scripting.
After this challenge you know better how the Phase 0 state-transition function works, and how clients stay in consensus.
For sync, state-transition optimizations are important, as that bottlenecks the speed of the sync.
Eth2 optimizations first started with a Go implementation focused on implementing the spec in an optimized form. This later evolved into ZRNT: github.com/protolambda/zrnt
However, this was done at a time when everyone was on a minimal configuration: a super tiny state!
Testnets have grown to as big as 100,000 validators, and the state does not have the same characteristics anymore.
To transition fast, and sync fast, memory has to be managed efficiently: no copies, no rollbacks. This is what persistent/immutable data-structures are great for!
This concept of “data sharing” is greatly under-utilized currently, but there is a Python library ready for you to use: https://github.com/protolambda/remerkleable
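To illustrate what “data sharing” means here, below is a minimal sketch of a persistent binary tree. It is not how remerkleable is implemented; the `Node` type and helper names are hypothetical. Updating one leaf builds new nodes only along the path to that leaf, and every untouched subtree is shared between the old and the new tree, so keeping the old state around costs almost nothing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    # Immutable node: either a leaf (value set) or a pair (left and right set).
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: bytes = b""


def build(leaves: List[bytes]) -> Node:
    """Build a complete binary tree; the leaf count must be a power of two."""
    if not leaves or len(leaves) & (len(leaves) - 1) != 0:
        raise ValueError("leaf count must be a power of two")
    nodes = [Node(value=v) for v in leaves]
    while len(nodes) > 1:
        nodes = [Node(left=nodes[i], right=nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]


def get_leaf(node: Node, depth: int, index: int) -> bytes:
    """Follow the bits of `index` from the root down to the leaf."""
    for level in range(depth - 1, -1, -1):
        node = node.right if (index >> level) & 1 else node.left
    return node.value


def set_leaf(node: Node, depth: int, index: int, value: bytes) -> Node:
    """Return a new root with one leaf changed; untouched subtrees are reused."""
    if depth == 0:
        return Node(value=value)
    if (index >> (depth - 1)) & 1:
        return Node(left=node.left,
                    right=set_leaf(node.right, depth - 1, index, value))
    return Node(left=set_leaf(node.left, depth - 1, index, value),
                right=node.right)
```

After `new = set_leaf(old, 2, 3, b"x")` on a four-leaf tree, `new.left is old.left` holds, and `old` still reads its original value: no copy of the state, and no rollback logic needed.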
The Eth2 spec (dev branch) recently moved to utilize this, but is otherwise still the same slow naive implementation. (Except for some memoization tricks)
A second take has been made in “fast_spec.py”, porting back pre-computation optimizations from ZRNT to Python.
Now, with this, and some Py-libp2p or Pyrum, you can fetch blocks from the network, and keep a local state in sync, without a client!
Keep in sync with a small/medium testnet, with minimal code!
The syntax is plain python and nice and readable, but with the right algorithmic optimizations and data-sharing, the transition can be faster than the average Eth2 client!
Start off with the spec, fast-spec, and Pyrum in a python virtual environment:
git clone https://github.com/protolambda/eth2-py-hacks
cd eth2-py-hacks
# Open a venv to install the dependencies in
python3 -m venv venv
. venv/bin/activate
pip install -r requirements.txt
From here you can run functions as defined in the spec to get an idea of the state-transition. See minimal_transition.py for a basic example.
Next you can collect the genesis state of the Lighthouse testnet (v0.10) and extract some blocks with Pyrum (included in example: app.py). You can find all testnet data here: https://github.com/eth2-clients/eth2-testnets/tree/master/lighthouse/testnet5
Load state and blocks with remerkleable (deserialize or from_bytes, provided as class-methods on each spec type), and try a transition.
Once you have a working transition, you can try swapping out parts:
Optimize fast_spec.py more. The epoch-transition can still improve. Eth2-docs diagrams the optimized epoch-transition: https://github.com/protolambda/eth2-docs
app.py is just a simple loop to fetch blocks and update state.
With this put together well, you can be syncing state like a client, in just a few hundred lines of Python.
And, if you like to work with other project ideas:
Learn what the Eth2 network looks like, and where we are actively working to harden its weaknesses. By trying to attack a network, you see it in action first-hand, and the results may be useful to inform future Eth2 networking improvements.
Resilience in a p2p network is important. GossipSub, peer-scoring and DOS protections are put in place to resist attacks. However, clients are imperfect and in a public testnet phase. Time to try and play with them!
Things you can look at:
P2p spec: https://github.com/ethereum/eth2.0-specs
Networking REPL: github.com/protolambda/rumor
Python scripting for the REPL: https://github.com/protolambda/pyrum
Lighthouse Denver testnet (realistic, yet small and temporary enough to attack): https://lighthouse-book.sigmaprime.io/become-a-validator.html
Prysm public testnet (realistic, more difficult to attack, but chaotic fun…): https://prylabs.net/
Also think of fun inputs or outputs:
Learn how to run different Eth2 clients, and explore interoperability challenges.
Sometimes software interactions are fascinating to watch. https://xkcd.com/350/
So what happens when you put clients together anyway?
Clients have been trying to interop for a long time now, with varying success. It all started at the interop lock-in event where everything kind-of talked to each other, but far from stable.
https://twitter.com/preston_vanloon/status/1171943982483554305
https://twitter.com/JonnyRhea/status/1172233598109442049
Since then the clients have improved, and some are on the same networking spec.
Lighthouse, Nimbus, Prysm, Teku have a chance at networking together. And the REPL can help debug things this time.
Can you put the clients together, and build a fishtank with one or more client types, and nodes running a network?
Tip: start off by pushing a client onto one of the public testnets, and check configuration/fork-version/spec-version.
Public testnets data: https://github.com/eth2-clients/eth2-testnets
In Eth2, 100,000, possibly 400,000, validators (consensus entities) are signing attestations. Each of them is shuffled into a new committee every epoch, and signs some source/target data for FFG consensus.
If you want to keep things simple, think of an attestation as a vote, with data (source: int, target: int, ID: bytes32), by a single validator.
For a.ID != b.ID (i.e. two different attestations by the same validator), there are two slashable cases:
Surround vote: a.source < b.source && b.target < a.target
Double vote: a.target == b.target
These two ways of voting (attesting) are objectively bad, as the 2nd attestation contradicts the 1st and vice versa. This is slashable behavior, which we can catch and enact a slashing for on-chain.
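The two conditions translate almost literally into code. This is a minimal sketch using the simplified vote from above; the `Vote` type and function names are hypothetical, and the surround check is applied in both directions since either vote may be the outer one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    source: int  # source epoch
    target: int  # target epoch
    id: bytes    # identifies the attestation data


def is_double_vote(a: Vote, b: Vote) -> bool:
    """Two different votes for the same target epoch."""
    return a.id != b.id and a.target == b.target


def surrounds(a: Vote, b: Vote) -> bool:
    """Vote `a` surrounds vote `b`: it starts earlier and ends later."""
    return a.source < b.source and b.target < a.target


def is_slashable(a: Vote, b: Vote) -> bool:
    """True if the pair of votes by one validator is a slashable offence."""
    return is_double_vote(a, b) or surrounds(a, b) or surrounds(b, a)
```

Checking every pair like this is the naive baseline: with hundreds of thousands of validators it does not scale, which is what the rest of this challenge is about.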
However, for POS to work, the votes need to be attributable, and slashed if malicious. To do so, we need to find the offences in this sea of activity. Not an easy task!
Some naive numbers:
8 months * 30 days * 24 hours * 60 minutes * 60 seconds / 12 second block time / 32 slots = 54000 epochs worth of voting data
3*54000*400000*8 = 518.4 GB
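The naive numbers can be reproduced in a few lines, which makes it easy to plug in your own assumptions. The factors are taken directly from the figures above; the variable names are only labels for this sketch.

```python
# Time span covered by the voting data, in seconds.
seconds = 8 * 30 * 24 * 60 * 60

# 12-second slots, 32 slots per epoch.
epochs = seconds // 12 // 32

# Naive storage estimate, using the factors 3 and 8 from the text as-is.
validators = 400_000
total_bytes = 3 * epochs * validators * 8

print(epochs)             # 54000
print(total_bytes / 1e9)  # 518.4 (GB)
```

Changing `validators` to 100,000 shows how the estimate scales linearly with the validator count.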
Now, to improve this: the hypothetical best so far is somewhere in the 2 - 3 GB range.
See:
Also, matching speed is important too, as you want to match an attestation as quickly as possible to save resources.
Read through the proposed optimizations, think of improvements, and try and implement the first slashing-detection. (Prysm has one, but not nearly as efficient yet). If you like you can just start with the min-max span approach, and iterate on that.
Learn how the state in Eth2 is represented as a binary tree, and how communicating branches of this tree can work as simple state-sync proof of concept.
As Eth1x develops, other things like Beamsync are being pioneered by clients such as Nethermind and Trinity. Similarly to a light-client, state is fetched on demand, but then gradually fills in a full state.
This can be replicated in Eth2, and more efficiently so! The Beacon-chain provides a structured way to prove batches of state and block-roots, and then maintains a history of batch-roots to compress state.
Additionally, SSZ is a clear binary-tree merkle format, supporting multi-proofs, partials and other fancy light-client toolbox items.
SSZ spec draft: github.com/protolambda/eth2.0-ssz/
Networking ReqResp: https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/p2p-interface.md#the-reqresp-domain
SSZ python library, with multi-proof/partials support, backing the current pyspec: https://github.com/protolambda/remerkleable
Hack a simple merkle-proof serialization format together, put it on Req/Resp (p2p RPC), and draft a (partial) state sync.
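As a possible starting point, here is a minimal sketch of single-leaf merkle proofs over a binary tree of 32-byte chunks, plus a toy wire format. The verification loop follows the usual branch-walk used for merkle branches in Phase0; everything else is an assumption for illustration: the helper names, and the encoding (8-byte little-endian index, 1-byte depth, leaf, then the sibling hashes) are made up for this example and not part of any spec.

```python
import hashlib
from typing import List, Tuple


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_layers(leaves: List[bytes]) -> List[List[bytes]]:
    """All layers of the tree, leaves first; leaf count must be a power of two."""
    if not leaves or len(leaves) & (len(leaves) - 1) != 0:
        raise ValueError("leaf count must be a power of two")
    layers = [list(leaves)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([sha256(prev[i] + prev[i + 1])
                       for i in range(0, len(prev), 2)])
    return layers


def merkle_branch(layers: List[List[bytes]], index: int) -> List[bytes]:
    """Sibling hashes from the leaf at `index` up to (excluding) the root."""
    branch = []
    for layer in layers[:-1]:
        branch.append(layer[index ^ 1])  # sibling sits at the flipped last bit
        index //= 2
    return branch


def is_valid_merkle_branch(leaf: bytes, branch: List[bytes],
                           depth: int, index: int, root: bytes) -> bool:
    """Recompute the root from the leaf and its branch, then compare."""
    value = leaf
    for i in range(depth):
        if (index >> i) & 1:
            value = sha256(branch[i] + value)
        else:
            value = sha256(value + branch[i])
    return value == root


def encode_proof(index: int, leaf: bytes, branch: List[bytes]) -> bytes:
    """Hypothetical wire format: index (8 bytes LE), depth (1 byte), leaf, branch."""
    return (index.to_bytes(8, "little") + len(branch).to_bytes(1, "little")
            + leaf + b"".join(branch))


def decode_proof(data: bytes) -> Tuple[int, bytes, List[bytes]]:
    index = int.from_bytes(data[:8], "little")
    depth = data[8]
    leaf = data[9:41]
    branch = [data[41 + 32 * i: 73 + 32 * i] for i in range(depth)]
    return index, leaf, branch
```

A responder would send `encode_proof(...)` over Req/Resp, and the requester checks the decoded proof against a state root it already trusts. Multi-proofs, which share siblings between several leaves, are the natural next step.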
Binary trees work exceptionally well with diffing: find common sub-state, and not sync/repeat that.
And not all state may be required, a “partial” may let you act as if the full thing is there, but only use the parts that are provided in the proof.
Remerkleable has an experimental “virtual node” feature, to lazy-load more sub-state. It could be used to write a slow but fully on-demand “beam sync” state object.
Alternatively, a “merry-go-round” sync approach (inspired by Eth1x sync plans of Piper), would be fun: split the state into N messages, and gossip those on a special “merry-go-round” topic one after another. Then listen on this topic for a whole round of data, and become synced.
The state in Eth2 beacon-chain is relatively tiny (finalized state size is linear with the amount of validators, not the history), so experimenting with sync should be fast and fun.
I am not here to say what to do, but to help you get into the exciting parts of Phase0! If you have your own ideas for projects, or want to spend more time on spec things, I am happy to help you get set up!
Phase 1 of Eth2 has a nearly ready specification, but generally client implementers are focused on Phase0, and research on Phase2. If you have research questions about the Custody Game (data-availability in Eth2), shards as data for stateless Phase2, or feel like building something experimental beyond Phase0, let us know!
Phase 1 takes some more niche understanding, but if you are into it and like a challenge, we would love to help you.
There is an active effort to implement execution on Eth2 before a Phase2 ships. Have a look at this roadmap diagram by Vitalik.
Part of the challenge here is how Eth1 functionalities are bridged into Eth2.
Think of:
If you would like to learn what an “Execution Environment” is, and how running something on Eth2 is going to work, Phase2 is the place! Phase2 is in a research phase, but unlike Phase1 there is lots of stuff to build on and experiment with (Credits to the Quilt team).
Topics of interest:
To learn about Phase2, start by reading this Ethresearch post describing history, progress and challenges.
Then, join the Phase 2 / Quilt Telegram to get looped in with Phase 2 researchers and devs, who can give you starter challenges to hack on. Phase 2 is moving fast, but there are always smaller things you can take on as a beginner.