Eth2 hacker start

Want to get started with Eth2 protocol things? This is the place! There are lots of interesting problems to work on, even with only a surface-level understanding of Eth2.

All of these challenges are good projects for learning more and starting to BUIDL on Eth2.

If you join us at one of the awesome new online hackathons (EthGlobal, Metacartel :eyes:), you can even earn a prize for completing a mini project!

Or, if you just feel like hacking, we are always available to help you in the Eth2 Discord!

Eth2 Information

It may be a lot to understand the full Eth2 picture, but learning it on a surface level is already a great start! You can generally get started with challenges after the TLDR.

Eth2 Client Chats

If you are looking for a specific client team or people to help you get into Eth2 with a specific language, try these:

And the Eth2 Discord is the best common place to ask questions and find others working on Eth2 projects!

Join and experiment with a testnet

Phase0 Challenge Ideas

These challenges help you become familiar with different topics of Eth2 Phase0 while building a fun Eth2 app. There is no better way to learn Phase0 than to start hacking on things already :100:

Each of the challenges provides some background with useful links to understand the context of the challenge, and what is there for you to build with.
The background info should help a lot, but if a term is not obvious from the context, don’t be afraid to ask for help!


Explore operation pools and gossip :loudspeaker:

What you learn

Learn how Libp2p GossipSub is used by Eth2 clients, and which information propagates through the Eth2 gossip network.

Background

Eth2 spec about “The Gossip Domain”: https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/p2p-interface.md#the-gossip-domain-gossipsub

Nodes communicate “operations” (think: transactions) via GossipSub on the network.

GossipSub spec on GitHub: https://github.com/libp2p/specs/tree/master/pubsub/gossipsub

GossipSub paper (with diagrams): https://research.protocol.ai/posts/201912-resnetlab-launch/PL-TechRep-gossipsub-v0.1-Dec30.pdf

The Prysm testnet has been hurt by several gossip and pool usage bugs.

Networking REPL: https://github.com/protolambda/rumor

Challenge: pool viewer

Like a block explorer, but only showing the most recent data: gossiped attestations, blocks and other operations.

To start off, use the networking REPL to join the Prysm testnet and log their blocks. Follow the commands in the README.

GossipSub is available in many different languages, so if you do not want to use the REPL gossip-logging (too easy?), you can use any of the following implementations:

Once you have established an input/log of gossip messages, you can build a dashboard to show them as they come in. Maybe even visualize them; blocks and attestations show how the chain is evolving.

If this works, you can also think of indexing the events: enable others to look up blocks, like a mini block-explorer!
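A "most recent data only" view with lookups can be sketched as a bounded in-memory pool. This is a minimal illustration, not part of Rumor or any client: `GossipMessage`, `RecentPool`, and the topic labels are hypothetical names, and the sketch assumes you already have decoded gossip messages with some content identifier (`root`) computed upstream.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GossipMessage:
    topic: str      # illustrative label, e.g. "beacon_block" or "beacon_attestation"
    root: str       # hex identifier of the content (assumed to be computed upstream)
    slot: int
    payload: bytes  # raw message bytes, left undecoded in this sketch


class RecentPool:
    """Keeps only the most recent `capacity` messages, with lookup by root."""

    def __init__(self, capacity: int = 1000):
        self.capacity = capacity
        self._by_root: "OrderedDict[str, GossipMessage]" = OrderedDict()

    def add(self, msg: GossipMessage) -> bool:
        """Insert a message. Returns False for duplicates, which gossip delivers often."""
        if msg.root in self._by_root:
            return False
        self._by_root[msg.root] = msg
        if len(self._by_root) > self.capacity:
            self._by_root.popitem(last=False)  # evict the oldest entry
        return True

    def get(self, root: str) -> Optional[GossipMessage]:
        """Mini block-explorer lookup: find a recent message by its root."""
        return self._by_root.get(root)

    def latest(self, topic: str, n: int = 10) -> List[GossipMessage]:
        """The newest `n` messages on a topic, newest first."""
        newest_first = reversed(list(self._by_root.values()))
        return [m for m in newest_first if m.topic == topic][:n]
```

A dashboard would call `add` for every incoming message and render `latest(...)` per topic; swapping the dictionary for a database is the step towards a real index.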

Also take a look at the fork-choice client challenge if you want to work with others. Using the realtime attestation data to derive the chain head would be a very nice feature!


Eth1-data tracking :eyes:

What you learn

Learn how the deposit contract works, and what the Eth1 -> Eth2 transition looks like for a validator. This is a great challenge if you want to learn about Eth2, but want to stay focused on web, dApp, and Eth1 tooling.

Background

The Eth2 phase0 validators get into the system through a deposit from Eth1.
For this to keep functioning after the Eth2 genesis event, the Eth1 chain has to be tracked.

Block proposers vote on Eth1 data over long voting periods. Once a vote reaches a majority within a period, the deposit-contract reference is updated, and new deposits are allowed in.
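The voting can be sketched as a strict-majority tally over the slots of one voting period. This is a simplified illustration modeled on my reading of `process_eth1_data` in the Phase 0 spec: `Eth1Data` is reduced to a plain tuple, `tally_eth1_votes` is a hypothetical helper, and the period length is passed in rather than taken from a real configuration.

```python
from collections import Counter
from typing import List, Optional, Tuple

# (deposit_root, deposit_count, block_hash), simplified to plain Python values
Eth1Data = Tuple[str, int, str]


def tally_eth1_votes(votes: List[Eth1Data], slots_per_voting_period: int) -> Optional[Eth1Data]:
    """Return the Eth1 data that won the period, or None without a strict majority.

    A candidate wins once more than half of all slots in the voting period
    (not just half of the votes cast so far) carry a vote for it.
    """
    if not votes:
        return None
    candidate, count = Counter(votes).most_common(1)[0]
    if count * 2 > slots_per_voting_period:
        return candidate
    return None
```

Because the threshold is relative to the whole period, a handful of early votes cannot move the deposit reference; that is part of why new deposits take a while to show up on the Eth2 side.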

Validator lifecycle: https://notes.ethereum.org/@hww/lifecycle

Deposit contract spec: https://github.com/ethereum/eth2.0-specs/tree/dev/deposit_contract

Prysm testnet deposit contract: goerli.etherscan.io/address/0x4689a3C63CE249355C8a573B5974db21D2d1b8Ef (Goerli testnet)

Challenge

For the more web, dApp, or Eth1 focused developers, this is definitely a more accessible challenge. With Eth1 tooling, and some Eth2 tooling from Lodestar or the py-spec, you can read and understand the deposit contract interactions.

Lodestar tooling: https://hackmd.io/@wemeetagain/rJTEOdqPS/%2FCcsWTnvRS_eiLUajr3gi9g

Using py-spec as a library: https://github.com/ethereum/eth2.0-specs/pull/1584

Or py-spec SSZ: github.com/protolambda/remerkleable

Start off by retrieving deposit logs from the Prysm deposit contract with your favorite Web3 library (if in doubt, ethers.js and web3.py are great).
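Once your Web3 library has ABI-decoded a deposit event into its raw byte fields, the numbers still need decoding. The sketch below assumes the field layout of the deposit contract spec as I read it (48-byte pubkey, 32-byte withdrawal credentials, 8-byte little-endian amount in Gwei, 96-byte signature, 8-byte little-endian index); verify this against the contract version actually deployed. `DepositLog` and `parse_deposit_log` are hypothetical names, and fetching the logs themselves is left to your library.

```python
from dataclasses import dataclass

GWEI_PER_ETH = 10**9


@dataclass(frozen=True)
class DepositLog:
    pubkey: bytes
    withdrawal_credentials: bytes
    amount_gwei: int
    signature: bytes
    index: int

    @property
    def amount_eth(self) -> float:
        return self.amount_gwei / GWEI_PER_ETH


def parse_deposit_log(pubkey: bytes, withdrawal_credentials: bytes, amount: bytes,
                      signature: bytes, index: bytes) -> DepositLog:
    """Validate field sizes and decode the little-endian integers of one deposit event."""
    expected_sizes = {
        "pubkey": (pubkey, 48),
        "withdrawal_credentials": (withdrawal_credentials, 32),
        "amount": (amount, 8),
        "signature": (signature, 96),
        "index": (index, 8),
    }
    for name, (value, size) in expected_sizes.items():
        if len(value) != size:
            raise ValueError(f"{name}: expected {size} bytes, got {len(value)}")
    return DepositLog(
        pubkey=pubkey,
        withdrawal_credentials=withdrawal_credentials,
        amount_gwei=int.from_bytes(amount, "little"),
        signature=signature,
        index=int.from_bytes(index, "little"),
    )
```

The decoded `index` gives you a stable key for matching each Eth1 deposit against what later appears on the Eth2 side.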

Now that you can see the deposits on the Eth1 side, it is time to find them on the Eth2 side.

With the Prysm API, you can extract blocks with their respective Eth1 votes. Prysm has a public API RPC endpoint (Docs here), if you do not want to participate in the testnet yourself yet.

What you can now try to do is to query beacon-blocks, parse them with the javascript or python tooling (or any SSZ library), and check the contents:

Combine the two deposit information sources, and you can hack together an Eth1-deposit focused monitoring service!


Forkchoice client :fork_and_knife:

What you learn

If you would like to get some practical understanding of the PoS fork-choice in Eth2, and want to give your own approach a try, this is the challenge. You learn how GHOST works, what “LMD” means, and how it all connects together into the Eth2 fork-choice.

Background

LMD-GHOST, a.k.a. latest-message-driven greedy-heaviest-observed-subtree. You can find a good explanation here: https://vitalik.ca/general/2018/12/05/cbc_casper.html

Collection of LMD-GHOST implementations for Eth2: https://github.com/protolambda/lmd-ghost/

LMD-GHOST proto-array: an optimized design, recently implemented by Lighthouse and Prysmatic. Stateful, simple representation, fast lookups, fast batched propagation of state changes. Implementations: Python, Rust

Advanced: if you want to dive really deep, take a look at this paper: https://arxiv.org/abs/2003.03052

Challenge

An Eth2 “client”, but not an active participant or dealing with storage, only following the fork-choice. Or in other words, a “readonly client”. Use the network tooling to stay in sync with minimal dependencies!
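The core of such a readonly client is the head computation. Below is a naive LMD-GHOST sketch, not the proto-array design and not the spec's exact `get_head`: blocks are plain string identifiers, `lmd_ghost_head` is a hypothetical helper, and justification/finalization filtering is left out. Only each validator's latest message counts ("LMD"), and at every fork the heaviest subtree wins ("GHOST").

```python
from typing import Dict, List


def lmd_ghost_head(
    parents: Dict[str, str],          # block -> parent block (the root has no entry)
    latest_messages: Dict[int, str],  # validator index -> block of its latest attestation
    balances: Dict[int, int],         # validator index -> effective balance
    root: str,                        # starting block, e.g. the last justified block
) -> str:
    # Index children so the tree can be walked downwards.
    children: Dict[str, List[str]] = {}
    for block, parent in parents.items():
        children.setdefault(parent, []).append(block)

    # Weight of a block: total balance of the validators whose latest vote
    # is for that block or any of its descendants.
    weights: Dict[str, int] = {}
    for validator, block in latest_messages.items():
        while True:
            weights[block] = weights.get(block, 0) + balances[validator]
            if block == root or block not in parents:
                break
            block = parents[block]

    # Greedily descend into the heaviest subtree; ties are broken by block id.
    head = root
    while head in children:
        head = max(children[head], key=lambda b: (weights.get(b, 0), b))
    return head
```

This recomputes every weight on each call, which is fine for experimenting; the implementations linked above show how to make it incremental.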


Multi-node setup :computer: :computer:

What you learn

If you have some interest in running an Eth2 validator node, this is the challenge to get your hands dirty with. Learn how to run a validator node, and how its API works.

Background

Into the rabbit hole of Beacon Node <> Validator client separation…

To run thousands of validators, block production has to be managed, and signatures have to be managed.

Validator API: https://github.com/ethereum/eth2.0-APIs/blob/master/apis/validator/README.md

For a practical example where load-balancing was misunderstood, and things went wrong, see this Prysm testnet event: https://twitter.com/raulitojordan/status/1242176262904430592

When running two validator nodes, you really do not want the nodes to publish conflicting votes!

Challenge

Adapt the validator API to:

You can scale up just the first point, or also implement the second, to challenge yourself and implement redundancy with double-vote protection.

No double voting or it gets slashed!
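Double-vote protection can be sketched as a check-and-record step in front of every signature. This is an illustration under loud assumptions: `SigningGuard` is a hypothetical name, votes are reduced to `(source, target)` epochs, and the history is an in-memory dictionary. In a redundant setup all nodes would have to share one persistent store with an atomic check-and-record, which this sketch does not provide.

```python
from typing import Dict, List, Tuple

Vote = Tuple[int, int]  # (source_epoch, target_epoch)


class SigningGuard:
    """Refuses to sign attestations that would be slashable against earlier ones."""

    def __init__(self) -> None:
        self._history: Dict[bytes, List[Vote]] = {}

    def may_sign(self, pubkey: bytes, source: int, target: int) -> bool:
        for prev_source, prev_target in self._history.get(pubkey, []):
            if prev_target == target:
                # Same target epoch: treated as a double vote. Conservative,
                # since re-signing identical data is refused as well.
                return False
            if prev_source < source and target < prev_target:
                return False  # the new vote is surrounded by an earlier vote
            if source < prev_source and prev_target < target:
                return False  # the new vote surrounds an earlier vote
        return True

    def sign_if_safe(self, pubkey: bytes, source: int, target: int) -> bool:
        """Record the vote and return True only if signing it is safe."""
        if not self.may_sign(pubkey, source, target):
            return False
        self._history.setdefault(pubkey, []).append((source, target))
        return True
```

The actual signing call would only run after `sign_if_safe` returns True; two nodes that consult separate guards are exactly the failure mode to avoid.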

Why?


State-sync-only client with Python :snake:

What you learn

Like Python? Want to learn more about running the spec, yet prefer the speed of a client like Lighthouse? Try this: learn about the optimization efforts so far, and use tooling to script sync. Combine the best of both worlds: no uncontrollable big client, but just as powerful, with just some scripting.

After this challenge you will better understand how the Phase 0 state-transition function works, and how clients stay in consensus.

Background

For sync, state-transition optimizations are important, as the state transition is the bottleneck for sync speed.

Eth2 optimizations first started with a Go implementation focused on implementing the spec in an optimized form. This later evolved into ZRNT: github.com/protolambda/zrnt

However, this was done at a time when everyone was on a minimal configuration: a super tiny state!

Testnets have grown to as many as 100,000 validators, and the state does not have the same characteristics anymore.

To transition fast, and sync fast, memory has to be managed efficiently: no copies, no rollbacks. This is what persistent/immutable data-structures are great for!

This concept of “data sharing” is greatly under-utilized currently, but there is a Python library ready for you to use: https://github.com/protolambda/remerkleable
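To make "data sharing" concrete, here is a minimal persistent binary tree. It is an illustration of the idea only and not remerkleable's API: `Leaf`, `Pair`, `build`, `set_leaf`, and `get_leaf` are hypothetical names. Changing one leaf creates new nodes only along the path to that leaf; every untouched subtree is reused by both versions, so keeping the old state around costs almost nothing.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Leaf:
    value: int


@dataclass(frozen=True)
class Pair:
    left: "Node"
    right: "Node"


Node = Union[Leaf, Pair]


def build(values: List[int], depth: int) -> Node:
    """Build a complete binary tree of the given depth over 2**depth values."""
    if depth == 0:
        return Leaf(values[0])
    half = 1 << (depth - 1)
    return Pair(build(values[:half], depth - 1), build(values[half:], depth - 1))


def set_leaf(node: Node, depth: int, index: int, value: int) -> Node:
    """Return a new tree with one leaf changed; untouched subtrees are reused, not copied."""
    if depth == 0:
        return Leaf(value)
    half = 1 << (depth - 1)
    if index < half:
        return Pair(set_leaf(node.left, depth - 1, index, value), node.right)
    return Pair(node.left, set_leaf(node.right, depth - 1, index - half, value))


def get_leaf(node: Node, depth: int, index: int) -> int:
    """Read one leaf by walking down from the root."""
    while depth > 0:
        half = 1 << (depth - 1)
        if index < half:
            node = node.left
        else:
            node, index = node.right, index - half
        depth -= 1
    return node.value
```

For example, after `new = set_leaf(old, 3, 5, 99)` both `old` and `new` stay fully usable, and `new.left is old.left` holds: a "rollback" is just keeping a reference to `old`.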

The Eth2 spec (dev branch) recently moved to utilize this, but is otherwise still the same slow naive implementation (except for some memoization tricks).

A second take has been made in fast_spec.py, porting back pre-computation optimizations from ZRNT to Python.

Now, with this, and some Py-libp2p or Pyrum, you can fetch blocks from the network and keep a local state in sync, without a client!

Challenge

Keep in sync with a small/medium testnet, with minimal code!

The syntax is plain Python, nice and readable, but with the right algorithmic optimizations and data-sharing, the transition can be faster than the average Eth2 client!

Start off with the spec, fast-spec, and Pyrum in a Python virtual environment:

git clone https://github.com/protolambda/eth2-py-hacks

cd eth2-py-hacks

# Open a venv to install the dependencies in
python3 -m venv venv
. venv/bin/activate

pip install -r requirements.txt

From here you can run functions as defined in the spec to get an idea of the state-transition. See minimal_transition.py for a basic example.

Next you can collect the genesis state of the Lighthouse testnet (v0.10) and extract some blocks with Pyrum (included in example: app.py). You can find all testnet data here: https://github.com/eth2-clients/eth2-testnets/tree/master/lighthouse/testnet5

Load state and blocks with remerkleable (deserialize or from_bytes, provided as class-methods on each spec type), and try a transition.

Once you have a working transition, you can try swapping out parts:

With this put together well, you can be syncing state like a client, in just a few hundred lines of Python :clap:

And, if you like to work with other project ideas:


DOS/eclipse testnets :imp:

What you learn

Learn what the Eth2 network looks like, and where we are actively working to harden its weaknesses. By trying to attack a network, you see it in action first-hand, and the results may be useful to inform future Eth2 networking improvements.

Background

Resilience in a p2p network is important. GossipSub, peer-scoring and DOS protections are put in place to resist attacks. However, clients are imperfect and in a public testnet phase. Time to try and play with them!

Things you can look at:

P2p spec: https://github.com/ethereum/eth2.0-specs

Networking REPL: github.com/protolambda/rumor

Python scripting for the REPL: https://github.com/protolambda/pyrum

Lighthouse Denver testnet (realistic, yet small and temporary enough to attack): https://lighthouse-book.sigmaprime.io/become-a-validator.html

Prysm public testnet (realistic, more difficult to attack, but chaotic fun…): https://prylabs.net/

Challenge

Also think of fun inputs or outputs:


Eth2 client fishtank :tropical_fish:

What you learn

Learn how to run different Eth2 clients, and explore interoperability challenges.

Background

Sometimes software interactions are fascinating to watch. https://xkcd.com/350/

So what happens when you put clients together anyway?

Clients have been trying to interop for a long time now, with varying success. It all started at the interop lock-in event, where everything kind-of talked to each other but was far from stable.

https://twitter.com/preston_vanloon/status/1171943982483554305

https://twitter.com/JonnyRhea/status/1172233598109442049

Since then the clients have improved, and some are on the same networking spec.
Lighthouse, Nimbus, Prysm, and Teku have a chance at networking together. And the REPL can help debug things this time.

Challenge

Can you put the clients together, and build a fishtank with one or more client types, and nodes running a network?

Tip: start off by pushing a client onto one of the public testnets, and check configuration/fork-version/spec-version.

Public testnets data: https://github.com/eth2-clients/eth2-testnets


Slashing detection :triangular_flag_on_post:

Background

In Eth2, 100,000, possibly 400,000, validators are signing attestations. Each of them is shuffled into a new committee every epoch, and signs some source/target data for FFG consensus.

If you want to keep things simple, think of an attestation as a vote, with data (source: int, target: int, ID: bytes32), by a single validator.

For a.ID != b.ID (i.e. two different attestations by the same validator):

These two ways of voting (attesting) are objectively bad, as the 2nd attestation contradicts the 1st and vice versa. This is slashable behavior, which we can catch and enact a slashing for on-chain.
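In the Phase 0 spec, the two slashable conditions are a double vote and a surround vote. Using the simplified vote model from above, they can be sketched as follows; `Vote` and the function names are hypothetical, and the real spec compares full attestation data and epochs inside checkpoints rather than bare integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    source: int
    target: int
    id: bytes


def is_double_vote(a: Vote, b: Vote) -> bool:
    """Two different votes for the same target."""
    return a.id != b.id and a.target == b.target


def is_surround_vote(a: Vote, b: Vote) -> bool:
    """Vote `a` surrounds vote `b`: it starts earlier and ends later."""
    return a.source < b.source and b.target < a.target


def is_slashable(a: Vote, b: Vote) -> bool:
    """True if the pair of votes by one validator is slashable, in either order."""
    return is_double_vote(a, b) or is_surround_vote(a, b) or is_surround_vote(b, a)
```

Checking a single pair is trivial; the hard part is doing it for every new attestation against the whole history of every validator.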

However, for PoS to work, votes need to be attributable, and slashed if malicious. To do so, we need to find the offences in this sea of activity. Not an easy task!

Some naive numbers:

Now, to improve on this: the hypothetical best so far is somewhere in the 2-3 GB range.

See:

Matching speed is important too, as you want to match an attestation as quickly as possible to save resources.

Challenge

Read through the proposed optimizations, think of improvements, and try to implement the first slashing detection. (Prysm has one, but it is not nearly as efficient yet.) If you like, you can just start with the min-max span approach, and iterate on that.


Beacon-chain state-sync :arrow_down:

What you learn

Learn how the state in Eth2 is represented as a binary tree, and how communicating branches of this tree can work as a simple state-sync proof of concept.

Background

As Eth1x develops, other things like Beam Sync are being pioneered by clients such as Nethermind and Trinity. Similar to a light client, state is fetched on demand, but then gradually fills in a full state.

This can be replicated in Eth2, and more efficiently so! The Beacon-chain provides a structured way to prove batches of state and block-roots, and then maintains a history of batch-roots to compress state.

Additionally, SSZ is a clear binary-tree merkle format, supporting multi-proofs, partials and other fancy light-client toolbox items.
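A single-leaf Merkle proof over such a binary tree can be sketched in a few lines. This is a simplified stand-in, not the SSZ library API: the helper names are hypothetical, leaves are assumed to be 32-byte chunks in a power-of-two count, and multi-proofs are left out. SHA-256 is used for hashing pairs, as SSZ merkleization does.

```python
from hashlib import sha256
from typing import List


def hash_pair(left: bytes, right: bytes) -> bytes:
    return sha256(left + right).digest()


def build_layers(leaves: List[bytes]) -> List[List[bytes]]:
    """Bottom-up tree layers: layers[0] is the leaves, layers[-1] is [root]."""
    if not leaves or len(leaves) & (len(leaves) - 1) != 0:
        raise ValueError("leaf count must be a power of two")
    layers = [list(leaves)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([hash_pair(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return layers


def make_branch(layers: List[List[bytes]], index: int) -> List[bytes]:
    """Collect the sibling hash at every level, from the leaf up to just below the root."""
    branch = []
    for layer in layers[:-1]:
        branch.append(layer[index ^ 1])  # the sibling differs only in the lowest bit
        index //= 2
    return branch


def verify_branch(leaf: bytes, branch: List[bytes], index: int, root: bytes) -> bool:
    """Recompute the root from a leaf and its branch; only the root needs to be trusted."""
    value = leaf
    for sibling in branch:
        if index % 2 == 0:
            value = hash_pair(value, sibling)
        else:
            value = hash_pair(sibling, value)
        index //= 2
    return value == root
```

A branch has one hash per tree level, so proving one chunk out of a million costs about 20 hashes instead of the whole state.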

SSZ spec draft: github.com/protolambda/eth2.0-ssz/

Networking ReqResp: https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/p2p-interface.md#the-reqresp-domain

SSZ python library, with multi-proof/partials support, backing the current pyspec: https://github.com/protolambda/remerkleable

Challenge

Hack a simple merkle-proof serialization format together, put it on Req/Resp (p2p RPC), and draft a (partial) state sync.

Binary trees work exceptionally well with diffing: find the common sub-state, and do not sync or repeat it.
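The diffing idea can be sketched as a top-down comparison of two trees that skips every subtree whose root hash matches. For illustration both trees are held locally as bottom-up layers; in a real sync the remote tree's nodes would be requested on demand. `merkle_layers` and `diff_leaves` are hypothetical helper names.

```python
from hashlib import sha256
from typing import List


def merkle_layers(leaves: List[bytes]) -> List[List[bytes]]:
    """Bottom-up layers of a binary Merkle tree over a power-of-two number of leaves."""
    if not leaves or len(leaves) & (len(leaves) - 1) != 0:
        raise ValueError("leaf count must be a power of two")
    layers = [list(leaves)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([sha256(prev[i] + prev[i + 1]).digest() for i in range(0, len(prev), 2)])
    return layers


def diff_leaves(old: List[List[bytes]], new: List[List[bytes]]) -> List[int]:
    """Indices of leaves that differ between two equally shaped trees."""
    if [len(layer) for layer in old] != [len(layer) for layer in new]:
        raise ValueError("trees must have the same shape")
    changed = []
    stack = [(len(old) - 1, 0)]  # (level, index), starting at the root
    while stack:
        level, i = stack.pop()
        if old[level][i] == new[level][i]:
            continue  # common sub-state: nothing below here needs syncing
        if level == 0:
            changed.append(i)
        else:
            stack.append((level - 1, 2 * i + 1))
            stack.append((level - 1, 2 * i))
    return sorted(changed)
```

With few changes, only the paths down to the changed leaves are visited, so the work scales with the size of the difference rather than the size of the state.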

And not all state may be required: a “partial” may let you act as if the full thing is there, but only use the parts that are provided in the proof.

Remerkleable has an experimental “virtual node” feature, to lazy-load more sub-state. It could be used to write a slow but fully on-demand “beam sync” state object.

Alternatively, a “merry-go-round” sync approach (inspired by Eth1x sync plans of Piper), would be fun: split the state into N messages, and gossip those on a special “merry-go-round” topic one after another. Then listen on this topic for a whole round of data, and become synced.

The state in the Eth2 beacon-chain is relatively tiny (finalized state size is linear in the number of validators, not the history), so experimenting with sync should be fast and fun.


Open challenge :zap:

I am not here to say what to do, but to help you get into the exciting parts of Phase0! If you have your own ideas for projects, or want to spend more time on spec things, I am happy to help you get set up!


Phase1 Challenges

Phase 1 of Eth2 has a nearly ready specification, but generally client implementers are focused on Phase0, and research on Phase2. If you have research questions about the Custody Game (data-availability in Eth2), shards as data for stateless Phase2, or feel like building something experimental beyond Phase0, let us know!
Phase 1 takes some more niche understanding, but if you are into it and like a challenge, we would love to help you :clap:


Phase1.5 Challenges

There is an active effort to implement execution on Eth2 before a Phase2 ships. Have a look at this roadmap diagram by Vitalik.

Part of the challenge here is how Eth1 functionalities are bridged into Eth2.
Think of:


Phase2 Challenges

If you would like to learn what an “Execution Environment” is, and how running something on Eth2 is going to work, Phase2 is the place! Phase2 is in a research phase, but unlike Phase1 there is lots of stuff to build on and experiment with (credits to the Quilt team).

Topics of interest:

To learn about Phase2, start by reading this Ethresearch post describing history, progress and challenges.

Then, join the Phase 2 / Quilt Telegram to get looped in with Phase 2 researchers and devs, who can give you starter challenges to hack on. Phase 2 is moving fast, but there are always smaller things you can take on as a beginner.