
# A set-theoretic view of Ethereum coteries

<img src=https://storage.googleapis.com/ethereum-hackmd/upload_7dfc3edb714c0fc5f7fa07ca77960ff2.jpg width=49%>

5 USDC if you know who ^ is without googling it

$\cdot$

*by [mike](https://twitter.com/mikeneuder)*
*friday – october 20, 2023*

$\cdot$

***Acknowledgements***
*Special thanks to [Data Always](https://twitter.com/Data_Always), [Tim](https://twitter.com/TimBeiko), [Justin](https://twitter.com/drakefjustin), [Barnabé](https://twitter.com/barnabemonnot), [Thomas](https://twitter.com/soispoke), [stokes](https://twitter.com/ralexstokes), [Vitalik](https://twitter.com/vitalikbuterin), [Danny](https://twitter.com/drjrayn), [Izzy](https://twitter.com/isdrsp), [Davide](https://twitter.com/DavideCrapis), [Toni](https://twitter.com/nero_eth), & [Dankrad](https://twitter.com/dankrad) for discussions and comments! :-)*

$\cdot$

***tl;dr;*** *Many different entities compose Ethereum's consensus layer. In a recent [Bankless episode](https://www.youtube.com/watch?v=2a2owGBkfhQ), Danny presented a framework of four groups: app-layer users, holders, stakers, and node operators. He claimed that a healthy ecosystem has a clear distinction between each of these groups and that none of them are controlled by a single actor.*

*We explore this mental model further by presenting each group as a set and considering different relations between the sets. We begin by establishing the preliminary character list and notation, which we use to present four observations (that follow directly from definitions) and four desiderata (that are not guaranteed but are nice-to-haves).
With this groundwork, we describe four "simplified" cases (labeled `Cases 1-4`) and four "extended" cases (labeled `Cases 5-8`) to unpack the structure of the protocol.*

*The simplified cases deal with the relative sizes of the base groups, whereas the extended cases focus on more targeted situations: (a) the difference between decentralized staking pools and centralized staking providers, (b) the implications of minimum viable issuance, (c) the concerns around an application that grows "too big to fail", and (d) how restaking introduces another layer into the protocol. This article raises more questions than it answers, but the mental model seems useful as we consider the design space of the consensus layer.*

$\cdot$

### Related work

| Article | Description |
|---|---|
| [*Minimum viable issuance*](https://notes.ethereum.org/@anderselowsson/MinimumViableIssuance) | Anders' mega-thread on MVI |
| [*How Lido Threatens Ethereum*](https://www.youtube.com/watch?v=2a2owGBkfhQ) | Danny's Bankless episode |
| [*Concerns around centralization of stake*](https://hackmd.io/@Izzy-/EthereumStakingCodex#Concerns-around-centralization-of-stake) | Izzy's analysis of hypothetical distributions |

$\cdot$

***Acronyms***

| acronym | expansion |
|---|---|
| `NOs` | node operator(s) |
| `UASF` | user-activated soft fork |
| `CEX` | centralized exchange |

---

## Preliminaries

Before diving in, let's lay a bit of the groundwork. In this article, we describe each group of network participants as belonging to a set. Danny's recent [*Bankless episode*](https://www.youtube.com/watch?v=2a2owGBkfhQ) inspired this framing and it seems worth expanding. To begin, we consider the following groups.

### Participants

- $\text{Users}$ – *Anyone who financially engages with the Ethereum network. This includes rollup users, stablecoin holders, NFT purchasers, etc.
For this work, we don't focus on "read-only" participants of the network (e.g., data consumers).*
- $\text{Holders}$ – *Users who hold `ETH` the asset beyond the amount needed to pay gas. These participants own the existing `ETH` supply. Note that we don't distinguish between custodial and non-custodial holders here; that is a can of worms for a different article.*
- $\text{Stakers}$ – *Holders who participate in the consensus layer. We further distinguish them in the following way:*
    - $\text{NOs}$ (node operators) – any service provider that stakes on behalf of `ETH` holders (we do not initially distinguish between centralized providers – e.g., Coinbase, trusted operator sets – e.g., Lido, and bonded permissionless operators – e.g., RocketPool... more on this later). If we refer to a specific node operator we label them as $\text{NO1}, \;\text{NO2},$ etc.
    - $\text{Solo}$ – any individual who stakes their own capital.

> **Note:** *We don't include the set of people who run nodes but don't stake.*

### Notation

- Given the sets $A \;\&\; B$, $A \gg B \implies$ "$A$ is much larger than $B$."
- Given the sets $A \;\&\; B$, $A \geq B \implies$ "$A$ is larger than or equal to $B$."
- Given the sets $A \;\&\; B$, $A > B \implies$ "$A$ is larger than $B$."
- Given the sets $A \;\&\; B$, $A \approx B \implies$ "$A$ is approximately the same size as $B$."
- $\forall \implies$ "for all" (standard notation).
- $\in \implies$ "in" (standard notation).

It would be more accurate to talk about the cardinalities of these sets, but since this is not a very formal piece, I am just going to abuse this notation for readability's sake and to avoid writing $|A|$ everywhere. Throughout the article, we mainly define the cardinality of each set as the "number of unique individuals" who constitute each group.
When comparing node operators and solo stakers, we use the "consensus layer size" of each, implying that a single node operator could appear larger than the entire set of solo stakers because they control more validators than all the solo stakers combined. Lastly, we consider node operators as holders in that they represent `ETH` delegated through them, but we acknowledge that the node operators do not own all the `ETH` that they stake.

### Grounding observations

Given these definitions, it's useful to explicitly note a few relationships that directly follow.

***Observation 1;*** $\text{Users} \geq \text{Holders}$
*Since each holder is a user (even if their only use is to hold `ETH`), this relation is always true. We could also write this as $\text{Users} \supseteq \text{Holders}$ (i.e., users are a superset of holders).*

***Observation 2;*** $\text{Holders} \geq \text{Stakers}$
*Since each staker is a holder (they either stake the `ETH` themselves or receive `ETH` to stake on behalf of the true owner), this relation is always true. We could also write this as $\text{Holders} \supseteq \text{Stakers}$ (i.e., holders are a superset of stakers).*

***Observation 3;*** $\text{Stakers} \geq \text{NO}, \; \forall \, \text{NO} \in \text{NOs}$
*Since the total set of stakers is composed of pools (which delegate to node operators) and solo stakers, we can say that the stakers are larger than any single node operator. We will explore the situation where a single node operator comes to represent a large proportion of the total stakers. We could also write this as $\text{Stakers} \supseteq \text{NO}, \; \forall \, \text{NO} \in \text{NOs}$ (i.e., stakers are a superset of each node operator).*

***Observation 4;*** $\text{Stakers} \geq \text{Solo}$
*Since the total set of stakers is composed of pools (which delegate to node operators) and solo stakers, we can say that the stakers are larger than the solo stakers.
This statement says nothing about the relative size of solo stakers versus other node operators. We could also write this as $\text{Stakers} \supseteq \text{Solo}$ (i.e., stakers are a superset of solo stakers).*

### Desiderata

Beyond the four observations above, we can also identify four corresponding outcomes that are "desirable" from the protocol perspective. These are by no means guaranteed, but rather what we intuitively design for in a healthy ecosystem.

***Desiderata 1;*** $\text{Users} \gg \text{Holders}$
*The simplest goal is that the set of people interacting with dApps, NFTs, stablecoins, DeFi, etc., is a much larger set than the collection of participants holding significant amounts of `ETH` (as adoption increases, it seems reasonable to assume that $\text{Users}$ would grow faster than $\text{Holders}$). This may become increasingly true as we move towards a fee-abstracted world where a user doesn't need to hold `ETH` to pay for gas. Note that we don't specifically focus on "read-only" users who consume blockchain data, but rather are more concerned with those who transact in the Ethereum ecosystem in some way.*

***Desiderata 2;*** $\text{Holders} \gg \text{Stakers}$
*The ratio between holders and stakers corresponds to the "proportion of `ETH` supply staked" – $\approx 22\%$ and [counting](https://www.validatorqueue.com/) as of October 2023. This value plays an important role in the Ethereum protocol. Too low of a value (e.g., $<1\%$) presents the clear issue of insufficient economic security (it is too cheap, in `ETH` terms, to attack the network). Conversely, too high of a value (e.g., $>99\%$) may have second-order effects that are hard to predict.
Staking limits (e.g., through an issuance curve that approaches negative infinity as the staked supply increases) and [MEV burn](https://ethresear.ch/t/mev-burn-a-simple-design/15590) (which also reduces the net issuance by diminishing the MEV rewards) are the main "arrows in the quiver" to achieve this outcome – see Vitalik's ["*Paths towards single-slot finality*"](https://notes.ethereum.org/@vbuterin/single_slot_finality) for additional ideas.*

*The exact impact of having nearly the entire supply staked ($\text{Holders} \approx \text{Stakers}$) is uncertain. One issue it presents is on the social governance layer. With most of the supply locked in the consensus layer and a majority of users interacting only with derivative versions of `ETH`, the staking pool DAOs or DeFi protocols that issue these derivatives have immense power in the protocol. Another negative aspect is the elimination of the "medium of exchange" property of `ETH` the asset. While `ETH` still behaves like "collateral money" and LSTs denominated in `ETH` preserve the asset's "unit of account" nature, the lack of circulation could pose real threats. Additionally, with a large majority of the `ETH` supply staked, there is no `ETH` that could be deployed to counteract a malicious consensus-layer actor that controls a majority of the stake.*

***Desiderata 3;*** $\text{Stakers} \gg \text{NO}, \; \forall \, \text{NO} \in \text{NOs}$
*From a consensus perspective, the protocol security is improved if the set of $\text{Stakers}$ is significantly larger than any individual $\text{NO}$ (to prevent finality delays, reorg attacks, strong censorship, etc.). It's important to note that some node operators have a coordination layer between them, while others may be completely independent. One of the main points of disagreement with regards to staking pools, using Lido for example, is whether to treat them as a single node operator with $32\%$ of the stake or 31 distinct operators with around $1\%$ stake each.
There are reasonable arguments on both sides and this article isn't aimed at addressing that discussion. On the other hand, centralized staking providers, using Coinbase for example, are best understood as single node operators with $10-19\%$ of the total stake.*

***Desiderata 4;*** $\text{Solo} \gg \emptyset$
*This simply states that we want the set of solo stakers to be far from empty. It seems likely that solo stakers will only ever constitute a relatively small portion of the total stakers (current [estimates](https://blog.rated.network/blog/solo-stakers) are around $5\%$ of the total stake), but solo stakers do represent a much larger portion of the total nodes in the system.*

With this framework, let's examine a few hypothetical distributions. We start with four "simplified" cases (labeled `Cases 1-4`). We call them simplified because they only focus on the sets we have defined so far and follow from the observations above. We then analyze four "extended" cases (labeled `Cases 5-8`). Each of the extended cases explores a more realistic aspect of the staking ecosystem; the goal of these thought experiments is to tease out how the simplified model can be made more realistic. We conclude with a set of open questions, each associated with one of the cases.

## Four "simplified" cases

***`Case 1`***

$$ \text{Users} \gg \text{Holders} \gg \text{Stakers}; \\ \;\; \text{Stakers} \gg \text{NOs}; \;\; \text{Solo} \gg \emptyset $$

"Balanced" (best outcome)

<img src=https://storage.googleapis.com/ethereum-hackmd/upload_017f21d1dff0c7bd49e5cc45aca794c6.png width=60%>

- **Figure** – The user-holder-staker relationship is depicted as three shrinking circles to demonstrate that $\text{Users} \supset \text{Holders} \supset \text{Stakers}.$ Three node operators and the solo stakers compose the staker set (here we don't distinguish between decentralized staking pools and centralized staking providers).
The relative size of the node operator and solo staker circles represents the "number of participants from the view of the consensus layer", which is why a single node operator is larger than the total set of solo stakers.
- **Summary** – This represents the most balanced outcome we could hope for. With $\text{Holders} \gg \text{Stakers}$, a smaller portion of the `ETH` supply is staked, and with $\text{Stakers} \gg \text{NOs}$, no single node operator has an outsized influence over the consensus layer. It is worth noting that node operators can have different levels of decentralization (e.g., the behavior of some NOs might be correlated through a shared governance layer à la Lido). We will touch on this more in `Case 5`, but for now, the aspect we are focused on is that the different node operators are relatively similar in size.
- **Pros**
    - Some of the `ETH` supply is not staked.
    - No single node operator has outsized influence over the consensus layer.
    - Solo staking still meaningfully exists.
- **Cons**
    - None.

---

***`Case 2`***

$$\text{Users} \gg \text{Holders} \gg \text{Stakers} > \text{NO1}; \\ \text{NO1} \gg \text{NO2,}\,\text{NO3,}\,\text{Solo}$$

"Winner-take-most" (medium outcome)

<img src=https://storage.googleapis.com/ethereum-hackmd/upload_74fc1ac925c981a07397b82b3b90aa7a.png width=60%>

- **Figure** – The "winner-take-most" node operator is represented by $\text{NO1}$ taking up a significant portion of $\text{Stakers}$, while $\text{NO2,}\,\text{NO3,}\,\text{Solo}$ are represented by smaller circles. It is a much better outcome if $\text{NO1}$ is a Lido-style staking protocol as opposed to a centralized staking provider.
- **Summary** – This scenario extends what we see today where a single pool benefits from a "winner-takes-most" market structure. With only [$22\%$](https://www.validatorqueue.com/) of `ETH` staked, we can claim $\text{Holders} \gg \text{Stakers}$ (for now).
Lido, which for this example we treat as a single node operator, controls a significant percentage of the total stake ($\approx 32\%$), and other pools lag so we cannot claim that the pools are evenly distributed (note that this figure is slightly worse than today's reality and instead represents a world where a single pool controls $\geq 50\%$ of the total stake). Instead, we denote $\text{Stakers} > \text{NO1}$ and $\text{NO1} \gg \text{NO2,}\,\text{NO3,}\,\text{Solo}$, meaning that $\text{NO1}$ has an outsized influence over the stakers.
- **Pros**
    - Some of the `ETH` supply is not staked.
    - Solo staking still meaningfully exists.
- **Cons**
    - A single node operator has outsized influence over the consensus layer.

---

***`Case 3`***

$$\text{Users} \gg \text{Holders} \approx \text{Stakers}; \\\;\; \text{Stakers} \gg \text{NOs}; \;\; \text{Solo} \gg \emptyset$$

"Full supply for staking pool distribution" (medium outcome)

<img src=https://storage.googleapis.com/ethereum-hackmd/upload_d9a480a9a0e110c22b54cac4bf640882.png width=60%>

- **Figure** – This near equivalence between holders and stakers is represented by two similarly sized circles. The stakers, on the other hand, mirror `Case 1`, with $\text{NO1,} \, \text{NO2,} \, \text{NO3}$ approximately even in size, implying a more balanced stake distribution among node operators. Solo stakers still constitute a non-trivial portion of stakers.
- **Summary** – Consider a world where nearly the entire `ETH` supply is staked, so that $\text{Holders} \approx \text{Stakers}$. However, with a more balanced stake distribution among the pools, we can claim that $\text{Stakers} \gg \text{NOs}$. This situation could arise if, for example, changes [to the protocol](https://vitalik.eth.limo/general/2023/09/30/enshrinement.html) made it easier for staking pools to compete, but also made it more likely that all `ETH` is either staked directly or delegated.
- **Pros**
    - No single pool has outsized influence over the consensus layer.
    - Solo staking still meaningfully exists.
- **Cons**
    - Governance layer impact of not having `ETH` holders.
    - With almost all `ETH` staked, the monetary properties of `ETH` the asset change.
    - There is no "dry powder" of non-staked `ETH` that could enter the consensus layer.

---

***`Case 4`***

$$\text{Users} \gg \text{Holders} \approx \text{Stakers} > \text{NO1};\\ \text{NO1} \gg \text{NO2,}\,\text{NO3,}\,\text{Solo}$$

"Too big" (bad outcome)

<img src=https://storage.googleapis.com/ethereum-hackmd/upload_5d96d9facce85354f22b23c0f3dab5c6.png width=60%>

- **Figure** – There is near equivalence between stakers and holders and $\text{NO1}$ significantly outweighs $\text{NO2,} \, \text{NO3,} \, \& \, \text{Solo}$.
- **Summary** – This case combines the worst aspects of `Cases 2 & 3` above. With a large percentage of the `ETH` supply staked, we have $\text{Holders} \approx \text{Stakers}$. Similarly, a single pool controls a significant portion of the staked `ETH` so $\text{Stakers} > \text{NO1}$ and $\text{NO1} \gg \text{NO2,}\,\text{NO3,}\,\text{Solo}$. Not only does $\text{NO1}$ control a majority of the staked `ETH`, but also a majority of the total `ETH` supply.
- **Pros**
    - Solo staking still meaningfully exists.
- **Cons**
    - A single pool has outsized influence over the consensus layer.
    - Governance layer impact of not having `ETH` holders.
    - With almost all `ETH` staked, the monetary properties of `ETH` the asset change.
    - There is no "dry powder" of non-staked `ETH` that could enter the consensus layer.

## Four "extended" cases

While the above cases are (hopefully) easy to follow and intuitive, they lack a bit of grounding in reality. Each of the next four cases extends this model to better reflect what we are seeing today and what we might expect in the coming years. None of these examples aim to be comprehensive either, they just add some nuance (again... hopefully lol).

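
Before extending the model, the four simplified cases can be condensed into a small Python sketch that sorts a toy snapshot of the staking ecosystem into `Cases 1-4`. This is only an illustration: the article never quantifies "$\gg$" or "$\approx$", so the two thresholds below (`NEARLY_ALL_STAKED`, `DOMINANT_SHARE`), the `Snapshot` type, and the `classify` function are all hypothetical names and values chosen for this example. The $50\%$ cut-off loosely follows the `Case 2` figure, where a single pool controls $\geq 50\%$ of the total stake.

```python
from __future__ import annotations

import math
from dataclasses import dataclass

# Hypothetical thresholds: the article never quantifies "≫" or "≈".
NEARLY_ALL_STAKED = 0.90  # staked fraction at which we treat Holders ≈ Stakers
DOMINANT_SHARE = 0.50     # stake share at which one operator is "winner-take-most"


@dataclass
class Snapshot:
    """A toy description of the staking ecosystem at one point in time."""
    staked_fraction: float             # staked ETH / total ETH supply
    operator_shares: dict[str, float]  # share of total stake per node operator
    solo_share: float                  # share of total stake held by solo stakers


def classify(s: Snapshot) -> str:
    """Map a snapshot onto one of the four simplified cases."""
    total = sum(s.operator_shares.values()) + s.solo_share
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("operator and solo shares must sum to 1")
    if s.solo_share <= 0:
        # All four simplified cases assume Solo ≫ ∅; here any positive
        # solo share is accepted, which is a deliberate simplification.
        raise ValueError("no solo stakers: outside the four simplified cases")

    nearly_all_staked = s.staked_fraction >= NEARLY_ALL_STAKED
    largest_share = max(s.operator_shares.values(), default=0.0)
    dominant_operator = largest_share >= DOMINANT_SHARE

    if not nearly_all_staked and not dominant_operator:
        return "Case 1"  # "Balanced"
    if not nearly_all_staked and dominant_operator:
        return "Case 2"  # "Winner-take-most"
    if nearly_all_staked and not dominant_operator:
        return "Case 3"  # nearly full supply staked, balanced operators
    return "Case 4"      # "Too big"


balanced = Snapshot(0.22, {"NO1": 0.35, "NO2": 0.30, "NO3": 0.30}, 0.05)
print(classify(balanced))  # prints "Case 1"
```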

---

***`Case 5` – Staking protocols vs centralized staking providers***

"Decentralized staking protocol $\neq$ centralized staking provider"

<img src=https://storage.googleapis.com/ethereum-hackmd/upload_551f5616ab09c4db558d8ef95cf6dad3.png width=80%>

- **Figure** – Focusing on the stakers, we see that `PoolA` is composed of 30 distinct node operators (labeled `NO1, NO2, ..., NO30`), while centralized exchanges `CEXA` and `CEXB`, on the other hand, are single logical entities.
- **Summary** – Until this point, we have considered all shared staking services under the umbrella of "node operators". However, there is a massive difference in decentralization between a staking protocol composed of many node operators (Lido) when compared to a centralized exchange that offers staking-as-a-service to retail and institutional customers (Coinbase). Izzy deep dives into a few hypothetical allocations in [*"Concerns around centralization of stake"*](https://hackmd.io/@Izzy-/EthereumStakingCodex#Concerns-around-centralization-of-stake). This distinction is important to keep in mind when considering how the LST market could evolve. For example, one possible outcome of Lido self-limiting could be more stake flowing into centralized exchanges. It's also possible that `stETH` holders are intentionally avoiding CEXs and would reallocate to a different, more decentralized solution – we can only speculate.

---

***`Case 6` – Minimum-viable issuance and solo-stakers***

"Solo stakers are priced out by falling rewards resulting from minimum-viable issuance"

<img src=https://storage.googleapis.com/ethereum-hackmd/upload_4c27b9c4be673e07988aa6254ed5e686.png width=70%>

- **Figure** – If dramatic changes to the issuance are required to ensure that $\text{Holders} \gg \text{Stakers}$ (e.g., negative rewards), it is possible that solo stakers are priced out and all that remains are delegated node operators ($\text{Solo} = \emptyset$).
- **Summary** – "Minimum-viable issuance" is the goal of creating an `ETH` supply mechanism that ensures the economic security of Ethereum without overpaying the consensus layer participants. This is a relatively new line of thought and Anders' [*megathread*](https://notes.ethereum.org/@anderselowsson/MinimumViableIssuance) is a clear voicing of the ideas. The biggest question around this framing is whether or not it completely removes the viability of solo staking. For example, consider the case where the consensus layer issuance adjusts based on staking demand. If the issuance curve goes negative (e.g., staking \*costs\* `ETH`), solo stakers may be completely priced out if the only way to have positive expected rewards is through pooling (for MEV smoothing) and/or restaking to earn additional yield.

---

***`Case 7` – `AppX` getting too big to fail***

"`AppX` is too big to fail, making them the ultimate arbiter of consensus-layer truth."

<img src=https://storage.googleapis.com/ethereum-hackmd/upload_5af88bcfad7d84099f2f064bed32cff8.png width=65%>

- **Figure** – If $\text{Holders} \approx \text{AppX Users}$, then the exact stake distribution of the $\text{Stakers}$ set is essentially a moot point. The users and developers of `AppX` will have an immense influence over the future of the protocol and the coordination of a user-activated soft fork (abbr. UASF) becomes much easier.
- **Summary** – If there is a single app on Ethereum that gains such adoption that a majority of users are also using this app ($\text{Users} \approx \text{AppX Users}$), then any issue with this app could immediately raise difficult questions around forking the chain. For example, a large stablecoin project that accounted for a majority of the value on the chain could use its leverage to determine which chain is canonical by identifying where redemptions would be respected.
Gwart touches on this in ["*The Real Fork*"](https://gwart.substack.com/p/the-real-fork), where they discuss the potential value misalignment between those who are using the chain regularly and the protocol designers.

---

***`Case 8` – Restaking finding mass adoption***

"All `ETH` is restaked, adding another layer of delegation and incentive (mis)alignment."

<img src=https://storage.googleapis.com/ethereum-hackmd/upload_2f04e0e7c60a9c4d3b3c5e6c03a9571e.png width=65%>

- **Figure** – If $\text{Stakers} \approx \text{Restakers}$, then the protocol ["loses visibility"](https://barnabe.substack.com/p/seeing-like-a-protocol) into the incentives of the restaked `ETH` node operator set.
- **Summary** – Imagine a world where nearly all staked `ETH` is restaked. In this case, the exact distribution of stake among node operators is rivaled by the distribution of restaked `ETH` among the set of node operators in the restaking protocol. We have another layer of trust in the system, which results in another layer of risk. In this world, a misbehaving restaked `ETH` node operator set could do outsized damage to the consensus layer of Ethereum.

## Open Questions

- **Cases 2 & 3.** *If we had to choose between Case 2 ("winner-take-most") and Case 3 ("fully staked supply"), which would be preferable? I.e., a smaller portion of supply staked but more concentration vs. a larger portion of supply staked but less concentration.*
- **Case 2.** *Should we try to target a percentage of the `ETH` supply staked? Should we instead consider targeting a fixed amount of `ETH` staked? How do we choose this target? (Again referencing [Anders' work](https://twitter.com/weboftrees/status/1710704461750944190) on this topic.) Given the protocol is unaware of the market cap of `ETH` or the amount of value it secures, is there any reasonable target beyond, "as much as possible"?*
- **Cases 2 & 4.** *How can we monitor the soft power of large ecosystem players?
How real is the threat of erosion of values? How do we determine when something becomes "too powerful" compared to competitors? Why are some monopolies (e.g., Uniswap) considered less harmful?*
- **Case 5.** *How should we think about staking protocols that have a distributed node operator set but a shared coordination and governance layer?*
- **Case 6.** *What can we do to ensure solo staking remains feasible? Do we try to limit the total amount of `ETH` staked by decreasing issuance? Does that immediately price out solo staking? Do we need to move quickly to avoid a situation where a majority of `ETH` is staked?*
- **Case 6.** *Is there a "minimal-viable" [change](https://vitalik.eth.limo/general/2023/09/30/enshrinement.html) to the protocol to encourage liquid staking competition? Are we okay with liquid staking arriving at an oligopolistic market structure? If we choose to change the penalties to encourage competition, what are the second- and third-order effects? Would there still be demand for an additional trust layer over the slashable portion of the stake?*
- **Case 6.** *Is it possible to offer "discounts" to solo stakers to ensure they remain viable despite much higher per-unit of staked `ETH` costs? Are there in-protocol mechanisms (e.g., ["validator metadata"](https://ethresear.ch/t/how-optional-non-kyc-validator-metadata-can-improve-staking-decentralization/17032)) that could protect solo stakers while pressuring large node operators to self-identify? Can restaking protocols deliver value to solo stakers without being sybil attacked by delegated operators?*
- **Case 7.** *What is the risk if a single app becomes much larger than the staker set? If almost all Ethereum users make use of that app, does that pose a systemic risk to the protocol?
How do we avoid ["overloading consensus"](https://vitalik.ca/general/2023/05/21/dont_overload.html)?*
- **Case 8.** *How do we deal with the risk associated with a restaking protocol growing to approximately the scale of the consensus stakers set? How can we ensure that the node operators on the restaked `ETH` have a healthy distribution? What is the true "economic security" of restaked `ETH` if the restaking node operator set is not well distributed?*
