# Phase 2 Proposal 1

### Introduction

This document describes a proposal for modifications to the beacon chain, and a definition of the shard state and shard state transition function, for phase 2 (state and transaction execution) of Ethereum 2.0. The general ethos of the proposal is to have a relatively minimal consensus-layer framework that still provides sufficient capabilities to develop complex frameworks on top, as a second layer, that give us all of the smart contract capabilities we need. The first part of the document describes the framework; the second part gives a basic example of implementing in-shard ETH transfers on top of the framework.

The basic idea of the approach is that contracts as a base-layer concept exist only on the beacon chain, and ETH exists only on the beacon chain (ETH can be held either by beacon chain contracts, which we'll call "execution scripts", or by validator accounts). However, shards continue to have their own execution and their own state. A transaction on a shard must specify which execution script it calls, and the in-shard state transition function executes that execution script's code using the transaction as data. The executing code can read from and write to the region of the shard's state that is reserved for that execution script (every shard has such a region), and it can issue receipts. It turns out that this provides sufficient functionality to build, as a beacon chain contract, an execution environment that supports smart contracts in shards, cross-shard communication and all of the other features that we expect.

### Out of scope for now

* EIP 1559 on eth2
* Shard rewards and penalties more generally
* Shard block data serialization into transactions (and pokes)
* Implementation of standardized SSZ Merkle proofs

### Standardized SSZ Merkle proofs

* `ShardReceiptRootProof`: proves that a given `ShardReceipt` was included as part of the `receipts` of a `ShardState` which either is in the shard's current `latest_state_roots` or is an ancestor of a state in the shard's current `latest_state_roots`
* `GlobalShardReceiptRootProof`: proves that a given `ShardReceipt` was included as part of the `receipts` of a `ShardState` which was included in a crosslink of some shard in some `BeaconState` which either is in the current `latest_state_roots` or is an ancestor of a state in the current `latest_state_roots`
* `WithdrawalReceiptRootProof`: proves that a given `WithdrawalReceipt` was included as part of a `BeaconState` which either is in the current `latest_state_roots` or is an ancestor of a state in the current `latest_state_roots`

### Beacon Chain Changes

On the beacon chain we add three new transaction types:

`NewExecutionScript` (creates a new execution script whose code lives on the beacon chain and which can hold ETH):

```python
{
    "sender": uint64,
    "slot": uint64,
    "code": bytes,
    "pubkey": BLSPubkey,
    "signature": BLSSignature
}
```

`NewValidator` (adds a new validator using ETH taken from an execution script's balance; the operation is authorized via that execution script issuing a receipt in a shard):

```python
{
    "executor": uint64,
    "receipt": ShardReceipt,
    "proof": GlobalShardReceiptRootProof
}
```

`Withdrawal` (withdrawal of a beacon chain validator, transferring its ETH to an execution script and issuing a receipt):

```python
{
    "validator_index": uint64,
    "target": uint64,
    "data": bytes,
    "pubkey": BLSPubkey,
    "signature": BLSSignature
}
```

We add to the `BeaconState` two new data structures:

`ExecutionScript`:

```python
{
    "code": bytes,
    "balance": Gwei
}
```

`WithdrawalReceipt`:

```python
{
    "receipt_index": uint64,
    "withdrawal": Withdrawal,
    "amount": Gwei
}
```

The beacon state stores a list of each object; the list of withdrawal receipts is emptied at the beginning of every slot. The beacon state also stores a counter `next_withdrawal_receipt_index`.

We also add to `DepositData` a field `min_timestamp`; we add to the `process_deposit` function a requirement that the beacon chain's observed timestamp must be at least that value and less than that value plus `MIN_VALIDATOR_PERSISTENCE_TIME` (set to 1 year). Together with the pubkey uniqueness requirement, this provides replay protection.

The new transaction types are processed as follows:

```python
def process_new_execution_script(state: BeaconState,
                                 new_execution_script: NewExecutionScript) -> None:
    # Verify there is sufficient balance
    fee = NEW_CODE_FEE + NEW_CODE_BYTE_FEE * len(new_execution_script.code)
    assert state.balances[new_execution_script.sender] >= fee
    # A new-execution-script is valid in only one slot
    assert state.slot == new_execution_script.slot
    # Sender must be either not yet eligible for activation, or withdrawable
    sender_acct = state.validator_registry[new_execution_script.sender]
    assert (
        sender_acct.activation_eligibility_epoch == FAR_FUTURE_EPOCH or
        get_current_epoch(state) >= sender_acct.withdrawable_epoch
    )
    # Verify that the pubkey is valid
    assert (
        sender_acct.withdrawal_credentials ==
        BLS_WITHDRAWAL_PREFIX_BYTE + hash(new_execution_script.pubkey)[1:]
    )
    # Verify that the signature is valid
    assert bls_verify(
        pubkey=new_execution_script.pubkey,
        message_hash=signing_root(new_execution_script),
        signature=new_execution_script.signature,
        domain=get_domain(state, DOMAIN_TRANSFER)
    )
    # Verify that the code is valid WASM and not too long
    assert (
        verify_wasm(new_execution_script.code) and
        len(new_execution_script.code) <= MAX_CODE_LEN
    )
    # Charge the fee and add the new execution script to the beacon state
    decrease_balance(state, new_execution_script.sender, fee)
    state.execution_scripts.append(ExecutionScript(
        code=new_execution_script.code,
        balance=0
    ))
```

```python
def process_new_validator(state: BeaconState,
                          new_validator: NewValidator) -> None:
    # Verify the receipt proof
    assert verify_global_receipt_root_proof(
        state,
        new_validator.receipt,
        new_validator.proof
    )
    # Receipt target 2**256-1 corresponds to new validator
    assert new_validator.receipt.target == 2**256 - 1
    # Interpret receipt data as DepositData object
    deposit_data = deserialize(new_validator.receipt.data, DepositData)
    # Check that there's enough ETH in the execution script
    assert new_validator.executor < len(state.execution_scripts)
    new_validator_acct = state.execution_scripts[new_validator.executor]
    assert new_validator_acct.balance >= deposit_data.amount
    # Equivalent to `process_deposit` except it removes the initial code
    # that verifies the parts of the Deposit outside the DepositData
    assert process_deposit_data(state, deposit_data)
    # Subtract the ETH from the execution script's balance
    new_validator_acct.balance -= deposit_data.amount
```

```python
def process_withdrawal(state: BeaconState,
                       withdrawal: Withdrawal) -> None:
    # Sender must be withdrawable
    withdrawer_acct = state.validator_registry[withdrawal.validator_index]
    withdrawer_balance = state.balances[withdrawal.validator_index]
    assert get_current_epoch(state) >= withdrawer_acct.withdrawable_epoch
    # Verify that the pubkey is valid
    assert (
        withdrawer_acct.withdrawal_credentials ==
        BLS_WITHDRAWAL_PREFIX_BYTE + hash(withdrawal.pubkey)[1:]
    )
    # Verify that the signature is valid
    assert bls_verify(
        pubkey=withdrawal.pubkey,
        message_hash=signing_root(withdrawal),
        signature=withdrawal.signature,
        domain=get_domain(state, DOMAIN_WITHDRAWAL)
    )
    # Add a withdrawal receipt, using the beacon state's receipt counter
    state.withdrawal_receipts.append(WithdrawalReceipt(
        receipt_index=state.next_withdrawal_receipt_index,
        withdrawal=withdrawal,
        amount=withdrawer_balance
    ))
    # Advance the counter so that every receipt index is unique
    state.next_withdrawal_receipt_index += 1
    # Transfer funds to the execution script
    assert withdrawal.target < len(state.execution_scripts)
    state.execution_scripts[withdrawal.target].balance += withdrawer_balance
    # Delete the validator
    state.balances[withdrawal.validator_index] = 0
    state.validator_registry[withdrawal.validator_index] = Validator()
```

### Shard processing

A `ShardTransaction` has the following format:

```python
{
    # The execution script that will be called to execute the transaction
    "executor": uint64,
    # The transaction's underlying contents
    "data": bytes,
}
```

The `ShardState` has the following format:

```python
{
    # What we think of as the actual "state"
    "objects": [[StateObject, 2**256], 2**64],
    # Receipts
    "receipts": [ShardReceipt],
    "next_receipt_index": uint64,
    # Current slot
    "slot": uint64,
    # Historical state DBMAccumulator
    "latest_state_roots": [bytes32, LATEST_STATE_ROOTS_LENGTH],
    "historical_state_roots_accum": [bytes32]
}
```

`StateObject` is defined as follows:

```python
{
    # Version number for future compatibility
    "version": uint64,
    # Contents
    "storage": bytes,
    # StateObject can be "poked" and removed if it expires (i.e. now > ttl)
    "ttl": uint64
}
```

`ShardReceipt` is defined as follows:

```python
{
    # Unique nonce
    "receipt_index": uint64,
    # Execution script that the receipt is created by
    "executor": uint64,
    # Address that it is intended for
    "target": bytes32,
    # Data
    "data": bytes
}
```

The state transition for a _transaction_ in a shard block, `apply_transaction(beacon_state: BeaconState, shard_state: ShardState, transaction: ShardTransaction)`, is simply a matter of running `exec(beacon_state.execution_scripts[transaction.executor].code, transaction.data)`, giving the VM the ability to call the following functions:

* `setStorage(key: bytes32, value: bytes)`: verifies that `len(value) < MAX_STORAGE_LENGTH`, and sets `shard_state.objects[executor][key] = StateObject(version=0, storage=value, ttl=max(shard_state.objects[executor][key].ttl + STATE_OBJECT_TTL_EXTENSION, shard_state.slot + STATE_OBJECT_BASE_TTL))`.
* `getStorageValue(key: bytes32) -> bytes`: returns `shard_state.objects[executor][key].storage`
* `saveReceipt(target: bytes32, data: bytes)`: runs `shard_state.receipts.append(ShardReceipt(shard_state.next_receipt_index, executor, target, data))` and `shard_state.next_receipt_index += 1`
* `executeCode(code: bytes, data: bytes) -> bytes`: runs `exec(code, data)` as a pure function (except for the ability to use `staticCallExecutionScript`) and returns the output
* `staticCallExecutionScript(id: uint64, data: bytes) -> bytes`: runs `exec(beacon_state.execution_scripts[id].code, data)` as a pure function (except for the ability to call lower-index execution scripts) and returns the output
* `get_recent_beacon_state_root(slot: int) -> bytes32`: returns the beacon state root at the given slot. Returns an error if the desired root is not in the `latest_state_roots` array of the `beacon_state`
* `get_recent_shard_state_root(slot: int) -> bytes32`: returns the shard state root at the given slot. Returns an error if the desired root is not in the `latest_state_roots` array of the `shard_state`
* `getShard() -> uint64`: returns the ID of the shard the code is executing on

Each shard block would have a gas limit of N gas; transactions applied in a block would consume gas from this pool. Fee payment from transaction senders to block proposers is not solved in the base consensus layer, and is left as a problem for higher layers (see the end of this document for one possibility).

A shard block can specify addresses to **poke**.
A poke is processed as follows:

```python
def process_poke(state: ShardState, executor: uint64, address: bytes32):
    # Only expired objects can be poked
    assert state.objects[executor][address].ttl < state.slot
    # Save the object's contents in a receipt so that it can be revived later
    state.receipts.append(ShardReceipt(
        receipt_index=state.next_receipt_index,
        executor=executor,
        target=address,
        data=state.objects[executor][address].storage
    ))
    state.next_receipt_index += 1
```

The shard state has a per-slot state transition function that is called before executing any transactions:

```python
def start_slot(state: ShardState):
    # Record the current state root in the ring buffer of recent roots
    latest_roots = state.latest_state_roots
    latest_roots[state.slot % LATEST_STATE_ROOTS_LENGTH] = hash_tree_root(state)
    state.slot += 1
    # Once the ring buffer wraps around, commit it to the historical accumulator
    if state.slot % LATEST_STATE_ROOTS_LENGTH == 0:
        state.historical_state_roots_accum.append(hash_tree_root(latest_roots))
    # Receipts live in the state for one slot only
    state.receipts = []
```

### Implementing in-shard ETH transfers

On top of the above base, it's possible to implement an entire fully fledged smart-contract-capable state execution framework through higher layers of software abstraction. To start off, here is how one might set up the simplest possible framework, one that simply allows users to deposit an ETH balance to a shard, move the ETH around, and then later withdraw it. In the beginning, we will make the simplifying assumption that state objects last forever, so pokes cannot happen; later we will relax this assumption.

We first define our own SSZ classes:

`EthAccount`:

```python
{
    "pubkey": BLSPubkey,
    "nonce": uint64,
    "value": uint64
}
```

`FormattedReceiptData`:

```python
{
    "shard_id": uint64,
    "pubkey": BLSPubkey
}
```

We can now define our functions (these would be in the WASM code of the execution script that we create).

First we define the function `depositToShard`, which "consumes" a withdrawal receipt and publishes the ETH into an account on the desired shard. It is intended to be called by a transaction in the desired shard, with the transaction's `data` containing the encoded function call and with the `executor` being the ID of this execution script. Here is the code:

```python
def depositToShard(state: BeaconState,
                   receipt: WithdrawalReceipt,
                   proof: WithdrawalReceiptRootProof):
    # Verify Merkle proof of the withdrawal receipt
    assert verify_withdrawal_receipt_root_proof(
        get_recent_beacon_state_root(proof.root_slot),
        receipt,
        proof
    )
    # Interpret receipt data as an object in our own format
    receipt_data = deserialize(receipt.withdrawal.data, FormattedReceiptData)
    # Check that this function is being executed on the right shard
    assert receipt_data.shard_id == getShard()
    # Check that the account does not exist yet
    assert getStorageValue(hash(receipt_data.pubkey)) == b''
    # Set its storage
    setStorage(hash(receipt_data.pubkey), serialize(EthAccount(
        pubkey=receipt_data.pubkey,
        nonce=0,
        value=receipt.amount
    )))
```

Now `transfer`, for transferring ETH between accounts:

```python
def transfer(sender: bytes32,
             nonce: uint64,
             target: bytes32,
             amount: uint64,
             signature: BLSSignature):
    # Load both accounts from storage
    sender_account = deserialize(getStorageValue(sender), EthAccount)
    target_account = deserialize(getStorageValue(target), EthAccount)
    # Replay protection and balance sufficiency
    assert nonce == sender_account.nonce
    assert sender_account.value >= amount
    # Verify the sender's signature over the transfer
    assert bls_verify(
        pubkey=sender_account.pubkey,
        message_hash=hash(nonce, target, amount),
        signature=signature
    )
    # Debit the sender and increment its nonce
    setStorage(sender, serialize(EthAccount(
        pubkey=sender_account.pubkey,
        nonce=sender_account.nonce + 1,
        value=sender_account.value - amount
    )))
    # Credit the target
    setStorage(target, serialize(EthAccount(
        pubkey=target_account.pubkey,
        nonce=target_account.nonce,
        value=target_account.value + amount
    )))
```

Now `sendToValidatorDeposit`, for sending the ETH in an account back into a validator slot:

```python
def sendToValidatorDeposit(account: bytes32,
                           nonce: uint64,
                           signature: BLSSignature,
                           deposit_data: DepositData):
    # Verify that the provided deposit data is valid
    assert verify_deposit_data(deposit_data)
    # Get the account data
    account_data = deserialize(getStorageValue(account), EthAccount)
    # Verify the nonce (replay protection) and balance sufficiency
    assert nonce == account_data.nonce
    assert account_data.value >= deposit_data.amount
    # Verify the signature
    assert bls_verify(
        message_hash=hash_tree_root({"nonce": nonce, "deposit_data": deposit_data}),
        pubkey=account_data.pubkey,
        signature=signature
    )
    # Save the reduced balance
    setStorage(account, serialize(EthAccount(
        pubkey=account_data.pubkey,
        nonce=account_data.nonce + 1,
        value=account_data.value - deposit_data.amount
    )))
    # Save a receipt; `process_new_validator` deserializes its data as DepositData
    saveReceipt(2**256 - 1, serialize(deposit_data))
```

### Dealing with Expiry

Now, what happens if state objects do not last forever? Then the replay protection given above does not work: after an account's balance is spent, one can simply wait until the account disappears, and then run `depositToShard` again to recover the originally deposited funds. We get around this as follows.

We first create a helper function `check_and_set_bitfield_bit`:

```python
def check_and_set_bitfield_bit(bitfield_id: uint64, bit: uint64):
    # Start position of this specific bitfield in the state
    bitfield_start_position = (2**64 // BITFIELD_ENTRY_BIT_LENGTH) * bitfield_id
    # Index of the specific chunk in the bitfield
    bitfield_chunk_index = bit // BITFIELD_ENTRY_BIT_LENGTH
    # Maximum chunk that has already been set
    max_set_chunk_index = bytes8_to_int(getStorageValue(2**128))
    # Generate any not-yet-generated bitfields
    while bitfield_chunk_index > max_set_chunk_index:
        setStorage(
            bitfield_start_position + max_set_chunk_index + 1,
            b'\x00' * (BITFIELD_ENTRY_BIT_LENGTH // 8)
        )
        max_set_chunk_index += 1
    setStorage(2**128, int_to_bytes8(max_set_chunk_index))
    # Verify that the chunk exists and the bit not yet filled, fill the bit
    chunk = getStorageValue(bitfield_start_position + bitfield_chunk_index)
    assert len(chunk) == BITFIELD_ENTRY_BIT_LENGTH // 8
    assert get_bitfield_bit(chunk, bit % BITFIELD_ENTRY_BIT_LENGTH) == 0
    set_bitfield_bit(chunk, bit % BITFIELD_ENTRY_BIT_LENGTH, 1)
    setStorage(bitfield_start_position + bitfield_chunk_index, chunk)
```

The function's operation depends crucially on item `2**128` of the storage never expiring. Ordinarily it should not expire, because every call to `check_and_set_bitfield_bit` extends its TTL, but to be safe it is a good idea to call this function a few thousand times when the system is initialized, to push the TTL very far into the future.

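To make the mechanics concrete, here is a minimal runnable sketch of the same check-and-set idea against an in-memory dictionary. It is an illustration under stated assumptions, not the execution script itself: the `storage` dict stands in for `getStorageValue`/`setStorage`, TTLs are not modeled, `BITFIELD_ENTRY_BIT_LENGTH = 256` is an assumed value, and, unlike the helper above, the sketch keeps one generated-chunk counter per bitfield at the hypothetical key `COUNTER_BASE + bitfield_id`.

```python
from typing import Dict

BITFIELD_ENTRY_BIT_LENGTH = 256  # assumed chunk size in bits (32-byte chunks)
CHUNKS_PER_BITFIELD = 2**64 // BITFIELD_ENTRY_BIT_LENGTH
COUNTER_BASE = 2**128  # hypothetical: counter for bitfield i lives at COUNTER_BASE + i

# Stand-in for the execution script's storage region (no TTLs in this toy model)
storage: Dict[int, bytes] = {}


def get_storage(key: int) -> bytes:
    """Return the stored value, or b'' if the key is absent (never set or expired)."""
    return storage.get(key, b"")


def check_and_set_bit(bitfield_id: int, bit: int) -> None:
    """Set `bit` in the given bitfield; fail if it is already set or its chunk is gone."""
    start = CHUNKS_PER_BITFIELD * bitfield_id
    chunk_index = bit // BITFIELD_ENTRY_BIT_LENGTH
    counter_key = COUNTER_BASE + bitfield_id
    # Number of chunks generated so far (an empty value decodes to 0)
    generated = int.from_bytes(get_storage(counter_key), "little")
    # Generate all-zero chunks up to and including the one we need
    while generated <= chunk_index:
        storage[start + generated] = bytes(BITFIELD_ENTRY_BIT_LENGTH // 8)
        generated += 1
    storage[counter_key] = generated.to_bytes(8, "little")
    # A chunk that was generated earlier but is now missing has expired
    chunk = bytearray(get_storage(start + chunk_index))
    assert len(chunk) == BITFIELD_ENTRY_BIT_LENGTH // 8, "chunk expired"
    offset = bit % BITFIELD_ENTRY_BIT_LENGTH
    byte_index, mask = offset // 8, 1 << (offset % 8)
    assert chunk[byte_index] & mask == 0, "bit already set"
    chunk[byte_index] |= mask
    storage[start + chunk_index] = bytes(chunk)


if __name__ == "__main__":
    check_and_set_bit(0, 5)  # first use of receipt index 5 succeeds
    try:
        check_and_set_bit(0, 5)  # second use must fail
    except AssertionError as err:
        print("replay rejected:", err)
```

The point of generating every chunk up to the requested index is that a missing chunk can then only mean "expired", never "not created yet", which is what lets the length check double as an expiry check.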
We now also create a function `revive`, which uses a bitfield to prevent double-spending:

```python
def revive(address: bytes32, proof: ReceiptRootProof, receipt: Receipt):
    # Verify that the receipt is correct
    assert verify_receipt_root_proof(get_recent_shard_state_roots(), receipt, proof)
    # Verify that the receipt was made by our executor
    assert receipt.executor == executor
    # Mark the receipt as consumed (fails if it was already used)
    check_and_set_bitfield_bit(0, receipt.receipt_index)
    assert getStorageValue(receipt.target) == b''
    setStorage(receipt.target, receipt.data)
```

So now if a state object expires due to its TTL running out, we can revive it. Note that if the bitfield entry needed to revive the account has itself expired, that entry can also be revived, and so on recursively.

Now we can modify the deposit logic of our "minimal OS" above:

```python
def verifyReceipt(state: BeaconState, receipt: WithdrawalReceipt, proof: ReceiptRootProof):
    # Verify Merkle proof of the withdrawal receipt
    assert verify_receipt_root_proof(get_recent_beacon_state_roots(), receipt, proof)
    # Interpret receipt data as an object in our own format
    receipt_data = deserialize(receipt.withdrawal.data, FormattedReceiptData)
    assert receipt_data.shard_id == getShard()
    # Replay protection: each withdrawal receipt can be consumed only once
    check_and_set_bitfield_bit(1, receipt.receipt_index)
    setStorage(hash(receipt_data.pubkey), serialize(EthAccount(
        pubkey=receipt_data.pubkey,
        nonce=0,
        value=receipt.amount
    )))
```

We can now safely replace `setStorage(account, b'dead')` with `setStorage(account, b'')`. But now we need to deal with a special edge case: what happens if an account expires, then receives ETH, then gets revived? There are two challenges here: (i) merging balances, (ii) replay protection. We make two changes.

First, in the `transfer` function, we replace the code:

```python
setStorage(target, EthAccount(
    pubkey=target_account.pubkey,
    nonce=target_account.nonce,
    value=target_account.value + amount
))
```

with the following code:

```python
if len(getStorageValue(target)) == 0:
    new_nonce = state.slot * 100000000
else:
    new_nonce = target_account.nonce
setStorage(target, serialize(EthAccount(
    pubkey=target_account.pubkey,
    nonce=new_nonce,
    value=target_account.value + amount
)))
```

This ensures that if an account is created anew, it is created with a nonce space that has not been used before.

Second, we modify `revive` as follows:

```python
def revive(address: bytes32, proof: ReceiptRootProof, receipt: Receipt):
    assert verify_receipt_root_proof(get_recent_shard_state_roots(), receipt, proof)
    assert receipt.executor == executor
    check_and_set_bitfield_bit(0, receipt.receipt_index)
    # If account is empty, revive from receipt
    if getStorageValue(receipt.target) == b'':
        setStorage(receipt.target, receipt.data)
    # Otherwise, verify that it's in the space of ETH accounts
    # and not bitfield entries, and combine them
    else:
        assert receipt.target > 2**128
        current_account = deserialize(getStorageValue(receipt.target), EthAccount)
        receipt_account = deserialize(receipt.data, EthAccount)
        setStorage(receipt.target, serialize(EthAccount(
            pubkey=receipt_account.pubkey,
            nonce=current_account.nonce,
            value=current_account.value + receipt_account.value
        )))
```

We also add a restriction that the `target` of a transfer must be `> 2 ** 128`. And we're done.

### From ETH transfers to complete state execution

To create a complete framework, we would need to add the following components on top of this (i.e. this is all additional executor code; no changes to the consensus layer above are required):

* Add a `crossShardMessage` function, which creates a receipt specifying a destination shard in addition to target and state, and a function for reviving with these messages
* Add a proper `Transaction` object, with the main components being `revives` (list of accounts to revive + receipts for reviving them), `gasPrice` (transaction fee per gas), `operations` (list of actions the transaction takes), `witness`. Signature verification is no longer BLS-specific; instead, we use the abstracted `assert executeCode(account.witness_verifier, (account.state, tx.witness)) == 1`.
* Add a proper transaction execution state transition function, which processes all of these parts of a transaction
* In addition to sending ETH, add another type of operation, a contract call. A contract call roughly works by starting with `stack = operations[::-1]`, then, `while len(stack) > 0`, popping the top operation off the stack and running `executeCode(target.code, (target.state, operation.calldata))`; this would be expected to return `(new_state, continuation)`; the state transition function would set the state to equal the new state, and then add the continuation to the stack if there is one.
* Add the conditional state logic from https://ethresear.ch/t/fast-cross-shard-transfers-via-optimistic-receipt-roots/5337

### Generic fee payment

To allow validators to collect transaction fees without every client needing to implement every layer-2 scheme, we can create a generic abstraction layer as follows. We create a specialized layer-2 scheme where anyone can publish a message of the form "If you create a block in shard X at slot Y, where the previous state root is Z, then I will give you N gwei", and processing this kind of conditional transaction is the only operation. Then for each user-side layer-2 there can be a separate class of users that gather transactions and publish packages that bid using this system.

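As a rough illustration of the conditional message described above, the following sketch models a bid and the check a proposer would run against a candidate block. All names here (`ConditionalBid`, `BlockHeader`, `payout`, `total_fees`) are hypothetical and chosen for this example only; signatures, funding of bids and on-chain settlement are deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConditionalBid:
    """'If you create a block in shard X at slot Y with previous root Z, I pay N gwei.'"""
    shard: int              # X
    slot: int               # Y
    prev_state_root: bytes  # Z
    amount_gwei: int        # N


@dataclass(frozen=True)
class BlockHeader:
    """The only block fields the bid condition refers to (hypothetical type)."""
    shard: int
    slot: int
    prev_state_root: bytes


def payout(bid: ConditionalBid, header: BlockHeader) -> int:
    """Return the gwei owed to the proposer of `header` under `bid` (0 if unmet)."""
    condition_met = (
        bid.shard == header.shard
        and bid.slot == header.slot
        and bid.prev_state_root == header.prev_state_root
    )
    return bid.amount_gwei if condition_met else 0


def total_fees(bids: Iterable[ConditionalBid], header: BlockHeader) -> int:
    """Sum the payouts of every bid whose condition the block satisfies."""
    return sum(payout(bid, header) for bid in bids)


if __name__ == "__main__":
    root = b"\x11" * 32
    header = BlockHeader(shard=3, slot=100, prev_state_root=root)
    bids = [
        ConditionalBid(3, 100, root, 500),         # matches the block
        ConditionalBid(3, 101, root, 900),         # wrong slot
        ConditionalBid(3, 100, b"\x22" * 32, 70),  # wrong previous state root
    ]
    print(total_fees(bids, header))
```

Because the condition only mentions the shard, the slot and the previous state root, a proposer can evaluate a bid without understanding the layer-2 scheme whose transactions are packaged behind it.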
Note that this market can even be implemented (and be useful) during phase 1, though the code would sit on the PoW chain rather than in a layer-2 scheme.

### State size and waking data complexity estimates

Suppose 10 tx/sec, where each tx touches on average 2 not-recently-touched state objects with 1 kB of new storage+code. Suppose that `STATE_OBJECT_BASE_TTL` = 1 week (~600k seconds). With 20 new state objects per second, that's on average 20 * 600k = 12 million state objects in the state, with 1 kB * 10 * 600k = 6 GB of state [note: this seems bigger than it would be in reality; check what more realistic numbers are?]

Waking an old object requires recovering its hibernation receipt, but also the hibernation receipt of the bitfield entry that contains its most recent hibernation, and so on recursively. If an object disappears at time `t`, then the minimum time at which its bitfield entry disappears would be `t` + 600k seconds. Hence, waking an old object that has been hibernated for `N` seconds would require about `N/T` Merkle branches, where `T` is the TTL (e.g. if N = 10 years and T = 600k seconds, that's ~526 Merkle branches), which in more extreme cases would need to be split across multiple transactions. Note that this is an extreme worst case; in the average case, there would be plenty of clients accessing old bitfield entries, so many of them would already be in the active state.

One could make `STATE_OBJECT_BASE_TTL` proportional to `1/(32 + storage_size)`, and make the bitfield chunks extremely small (e.g. 32 bytes), so in the case of bitfield chunks the minimum expiry time would be longer (e.g. if it's 1 week for 512-byte objects, it would be ~8 weeks for 32-byte objects, cutting the number of Merkle branches needed to resurrect a 10 year old object to ~40).
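
The estimates above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in this section; the constant and function names are made up for the example, a year is assumed to be 365 days, and the TTL scaling function is a literal reading of "proportional to `1/(32 + storage_size)`".

```python
import math

TX_PER_SEC = 10                 # assumed transaction rate from the text
OBJECTS_TOUCHED_PER_TX = 2      # not-recently-touched objects per transaction
NEW_BYTES_PER_TX = 1_000        # 1 kB of new storage+code per transaction
BASE_TTL_SECONDS = 600_000      # ~1 week
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year


def live_objects() -> int:
    """Objects created within one TTL window, i.e. objects still in the state."""
    return TX_PER_SEC * OBJECTS_TOUCHED_PER_TX * BASE_TTL_SECONDS


def state_bytes() -> int:
    """Bytes of storage written within one TTL window."""
    return TX_PER_SEC * NEW_BYTES_PER_TX * BASE_TTL_SECONDS


def merkle_branches(hibernated_seconds: int, ttl_seconds: int) -> int:
    """Worst-case number of receipts to prove: one per elapsed TTL period (N/T)."""
    return math.ceil(hibernated_seconds / ttl_seconds)


def ttl_scale(storage_size: int, reference_size: int = 512) -> float:
    """TTL multiplier relative to a reference object, with TTL ~ 1/(32 + size)."""
    return (32 + reference_size) / (32 + storage_size)


if __name__ == "__main__":
    print(live_objects())                                           # 12000000
    print(state_bytes())                                            # 6000000000
    print(merkle_branches(10 * SECONDS_PER_YEAR, BASE_TTL_SECONDS)) # 526
    print(ttl_scale(32))                                            # 8.5
```

The last value is the ratio behind the "~8 weeks for 32-byte objects" figure: a 32-byte chunk would live 8.5 times as long as a 512-byte object under this scaling.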