Phase 2 Proposal 1

Introduction

This document describes a proposal for modifications to the beacon chain, and a definition of the shard state and shard state transition function, for phase 2 (state and transaction execution) of Ethereum 2.0. The general ethos of the proposal is to have a relatively minimal consensus-layer framework that still provides sufficient capabilities to develop complex frameworks, built on top as a second layer, that give us all of the smart contract capabilities that we need. The first part of the document describes the framework; the second part gives a basic example of implementing in-shard ETH transfers on top of the framework.

The basic idea of the approach is that contracts as a base-layer concept exist only on the beacon chain, and ETH only exists on the beacon chain (ETH can be held by either beacon chain contracts [which we’ll call “execution scripts”] or by validator accounts). However, shards continue to have their own execution and their own state. A transaction on a shard must specify which contract it calls, and the in-shard state transition function executes the specific execution script code using the transaction as data, with the code execution having the ability to read from and write to a region of the state of any shard reserved for that execution script, and to issue receipts. It turns out that this provides sufficient functionality to allow an execution environment that supports smart contracts in shards, cross-shard communication and all of the other features that we expect to be built using a beacon chain contract.

Out of scope for now

Standardized SSZ Merkle proofs

Beacon Chain Changes

On the beacon chain we add three new transaction types:

NewExecutionScript (basically, creates a new execution script whose code lives on the beacon chain and which can hold ETH):

{
    "sender": uint64,
    "slot": uint64,
    "code": bytes,
    "pubkey": BLSPubkey,
    "signature": BLSSignature
}

NewValidator (adds a new validator using ETH taken from an execution script’s balance; the operation is authorized via that execution script issuing a receipt in a shard):

{
    "executor": uint64,
    "receipt": ShardReceipt,
    "proof": GlobalShardReceiptRootProof
}

Withdrawal (withdrawal of a beacon chain validator, transferring its ETH to an execution script and issuing a receipt):

{
    "validator_index": uint64,
    "target": uint64,
    "data": bytes,
    "pubkey": BLSPubkey,
    "signature": BLSSignature
}

We add to the BeaconState two new data structures:

ExecutionScript:

{
    "code": bytes,
    "balance": Gwei
}

WithdrawalReceipt:

{
    "receipt_index": uint64,
    "withdrawal": Withdrawal,
    "amount": Gwei
}

The beacon state stores a list of each object; the latter is emptied at the beginning of every slot. The beacon state also stores a counter next_withdrawal_receipt_index. We also add to DepositData a field min_timestamp; we add to the process_deposit function a requirement that the beacon chain’s observed timestamp must be at least that value and less than that value plus MIN_VALIDATOR_PERSISTENCE_TIME (set to 1 year). This plus the pubkey uniqueness requirement are used for replay protection.
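
The min_timestamp window can be sketched as a simple range check. This is a minimal illustration, not the actual process_deposit change: the function name deposit_timestamp_ok is hypothetical, and the value of MIN_VALIDATOR_PERSISTENCE_TIME in seconds is an assumption derived from the "1 year" mentioned above.

```python
# Assumption: "1 year" expressed in seconds; the exact value is not given in the text.
MIN_VALIDATOR_PERSISTENCE_TIME = 365 * 24 * 60 * 60


def deposit_timestamp_ok(observed_timestamp: int, min_timestamp: int) -> bool:
    """Return True if the deposit is inside its validity window.

    The deposit is accepted only while
    min_timestamp <= observed_timestamp < min_timestamp + MIN_VALIDATOR_PERSISTENCE_TIME.
    """
    window_end = min_timestamp + MIN_VALIDATOR_PERSISTENCE_TIME
    return min_timestamp <= observed_timestamp < window_end
```

Once the window has closed, the same DepositData can no longer be replayed; inside the window, replay protection relies on the pubkey uniqueness requirement.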

The new transaction types are processed as follows:

def process_new_execution_script(state: BeaconState,
                                 new_execution_script: NewExecutionScript) -> None:
    # Verify there is sufficient balance
    fee = NEW_CODE_FEE + NEW_CODE_BYTE_FEE * len(new_execution_script.code)
    assert state.balances[new_execution_script.sender] >= fee
    # A new-execution-script is valid in only one slot
    assert state.slot == new_execution_script.slot
    # Sender must be not yet eligible for activation or withdrawn
    sender_acct = state.validator_registry[new_execution_script.sender]
    assert (
        sender_acct.activation_eligibility_epoch == FAR_FUTURE_EPOCH or
        get_current_epoch(state) >= sender_acct.withdrawable_epoch
    )
    # Verify that the pubkey is valid
    assert (
        sender_acct.withdrawal_credentials ==
        BLS_WITHDRAWAL_PREFIX_BYTE + hash(new_execution_script.pubkey)[1:]
    )
    # Verify that the signature is valid
    assert bls_verify(
        pubkey=new_execution_script.pubkey,
        message_hash=signing_root(new_execution_script), 
        signature=new_execution_script.signature,
        domain=get_domain(state, DOMAIN_TRANSFER)
    )
    # Verify that the code is valid WASM and not too long
    assert (
        verify_wasm(new_execution_script.code) and
        len(new_execution_script.code) <= MAX_CODE_LEN
    )
    # Add the new execution script to the beacon state
    decrease_balance(state, new_execution_script.sender, fee)
    state.execution_scripts.append(ExecutionScript(
        code=new_execution_script.code,
        balance=0
    ))

def process_new_validator(state: BeaconState, new_validator: NewValidator) -> None:
    # Verify the receipt proof
    assert verify_global_receipt_root_proof(
        state,
        new_validator.receipt,
        new_validator.proof
    )
    # Receipt target 2**256-1 corresponds to new validator
    assert new_validator.receipt.target == 2**256 - 1
    # Interpret receipt data as DepositData object
    deposit_data = deserialize(new_validator.receipt.data, DepositData)
    # Check that there's enough ETH in the execution script
    assert new_validator.executor < len(state.execution_scripts)
    new_validator_acct = state.execution_scripts[new_validator.executor]
    assert new_validator_acct.balance >= deposit_data.amount
    # Equivalent to `process_deposit` except it removes the initial code 
    # that verifies the parts of the Deposit outside the DepositData
    assert process_deposit_data(state, deposit_data)
    # Subtract the ETH from the execution script's balance
    new_validator_acct.balance -= deposit_data.amount

def process_withdrawal(state: BeaconState, withdrawal: Withdrawal) -> None:
    # Sender must be withdrawable
    withdrawer_acct = state.validator_registry[withdrawal.validator_index]
    withdrawer_balance = state.balances[withdrawal.validator_index]
    assert get_current_epoch(state) >= withdrawer_acct.withdrawable_epoch
    # Verify that the pubkey is valid
    assert (
        withdrawer_acct.withdrawal_credentials ==
        BLS_WITHDRAWAL_PREFIX_BYTE + hash(withdrawal.pubkey)[1:]
    )
    # Verify that the signature is valid
    assert bls_verify(
        pubkey=withdrawal.pubkey,
        message_hash=signing_root(withdrawal),
        signature=withdrawal.signature,
        domain=get_domain(state, DOMAIN_WITHDRAWAL)
    )
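    # Add a withdrawal receipt, using (and advancing) the beacon state's counter
    state.withdrawal_receipts.append(WithdrawalReceipt(
        receipt_index=state.next_withdrawal_receipt_index,
        withdrawal=withdrawal,
        amount=withdrawer_balance
    ))
    state.next_withdrawal_receipt_index += 1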
    # Transfer funds to the execution script
    assert withdrawal.target < len(state.execution_scripts)
    state.execution_scripts[withdrawal.target].balance += withdrawer_balance
    # Delete the validator
    state.balances[withdrawal.validator_index] = 0
    state.validator_registry[withdrawal.validator_index] = Validator()
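
The pubkey check shared by process_new_execution_script and process_withdrawal ties a BLS pubkey to the validator's withdrawal_credentials. A minimal sketch of that check follows, assuming SHA-256 as the hash function and a zero prefix byte (both are assumptions made to keep the example runnable; hash32 and the two helper names are hypothetical):

```python
import hashlib

# Assumption: a single zero byte as the prefix; the real constant may differ.
BLS_WITHDRAWAL_PREFIX_BYTE = b"\x00"


def hash32(data: bytes) -> bytes:
    """Stand-in for the spec's hash function (assumed here to be SHA-256)."""
    return hashlib.sha256(data).digest()


def make_withdrawal_credentials(pubkey: bytes) -> bytes:
    """Prefix byte followed by the last 31 bytes of the pubkey hash."""
    return BLS_WITHDRAWAL_PREFIX_BYTE + hash32(pubkey)[1:]


def pubkey_matches_credentials(pubkey: bytes, credentials: bytes) -> bool:
    """True if `pubkey` is the key committed to by `credentials`."""
    return credentials == make_withdrawal_credentials(pubkey)
```

Only a holder of the committed key can then produce a signature that passes the subsequent bls_verify step.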

Shard processing

A ShardTransaction has the following format:

{
    # The execution script that will be called to execute the transaction
    "executor": uint64,
    # The transaction's underlying contents
    "data": bytes,
}

The ShardState has the following format:

{
    # What we think of as the actual "state"
    "objects": [[StateObject, 2**256], 2**64],
    # Receipts
    "receipts": [ShardReceipt],
    "next_receipt_index": uint64,
    # Current slot
    "slot": uint64,
    # Historical state DBMAccumulator
    "latest_state_roots": [bytes32, LATEST_STATE_ROOTS_LENGTH],
    "historical_state_roots_accum": [bytes32]
}

StateObject is defined as follows:

{
    # Version number for future compatibility
    "version": uint64,
    # Contents
    "storage": bytes,
    # StateObject can be "poked" and removed if it expires (ie. now > ttl)
    "ttl": uint64
}

ShardReceipt is defined as follows:

{
    # Unique nonce
    "receipt_index": uint64,
    # Execution script that the receipt is created by
    "executor": uint64,
    # Address that it is intended for
    "target": bytes32,
    # Data
    "data": bytes
}

The state transition for a transaction in a shard block, apply_transaction(beacon_state: BeaconState, shard_state: ShardState, transaction: ShardTransaction) is simply a matter of running exec(beacon_state.execution_scripts[transaction.executor].code, transaction.data), giving the VM the ability to call the following functions:

Each shard block would have a gas limit of N gas; transactions being applied in a block would consume gas from this pool. Fee payment from transaction senders to block proposers is not solved in the base consensus layer, and is left as a problem for higher layers (see the end of this article for one possibility).
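
The example scripts later in this document call host functions named getShard, getStorageValue, setStorage and saveReceipt. As a rough mental model only, here is a toy in-memory stand-in for such an interface. The class names ToyShardEnv and ToyReceipt are hypothetical, and the semantics (integer keys, missing keys reading as empty bytes, receipts numbered sequentially) are assumptions inferred from how those examples use the functions, not a specification of the VM.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyReceipt:
    receipt_index: int
    executor: int
    target: int
    data: bytes


@dataclass
class ToyShardEnv:
    """Toy environment for one execution script running on one shard."""
    shard_id: int
    executor: int
    storage: Dict[int, bytes] = field(default_factory=dict)
    receipts: List[ToyReceipt] = field(default_factory=list)
    next_receipt_index: int = 0

    def getShard(self) -> int:
        return self.shard_id

    def getStorageValue(self, key: int) -> bytes:
        # Assumption: missing keys read as empty bytes, matching the
        # `== b''` existence checks used in the example scripts.
        return self.storage.get(key, b"")

    def setStorage(self, key: int, value: bytes) -> None:
        self.storage[key] = value

    def saveReceipt(self, target: int, data: bytes) -> None:
        # Receipts are tagged with the issuing executor and a running index.
        self.receipts.append(
            ToyReceipt(self.next_receipt_index, self.executor, target, data)
        )
        self.next_receipt_index += 1
```

Each execution script would see only its own region of storage, which is why the toy keeps one storage dictionary per (shard, executor) pair.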

A shard block can specify addresses to poke. A poke is processed as follows:

def process_poke(state: ShardState, executor: uint64, address: bytes32):
    assert state.objects[executor][address].ttl < state.slot
    state.receipts.append(ShardReceipt(
        receipt_index=state.next_receipt_index,
        executor=executor,
        target=address,
        data=state.objects[executor][address].storage
    ))
    state.next_receipt_index += 1

The shard state has a per-slot state transition function that is called before executing any transactions:

def start_slot(state: ShardState):
    latest_roots = state.latest_state_roots
    latest_roots[state.slot % LATEST_STATE_ROOTS_LENGTH] = hash_tree_root(state)
    state.slot += 1
    if state.slot % LATEST_STATE_ROOTS_LENGTH == 0:
        state.historical_state_roots_accum.append(hash_tree_root(latest_roots))
    # Receipts live in the state for one slot only
    state.receipts = []
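
To see the ring buffer and accumulator in motion, here is a runnable toy version of the per-slot transition. It is a sketch under assumptions: ToyShardState, toy_root and ROOTS_BUFFER_LENGTH are hypothetical names, SHA-256 over concatenated fields stands in for hash_tree_root, and the buffer length is shrunk to 4 so that the accumulator fills quickly.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List

# Plays the role of the state-roots buffer length; tiny for illustration.
ROOTS_BUFFER_LENGTH = 4


def toy_root(*parts: bytes) -> bytes:
    """Stand-in for hash_tree_root: SHA-256 over the concatenated parts."""
    return hashlib.sha256(b"".join(parts)).digest()


@dataclass
class ToyShardState:
    slot: int = 0
    receipts: List[bytes] = field(default_factory=list)
    latest_state_roots: List[bytes] = field(
        default_factory=lambda: [b"\x00" * 32] * ROOTS_BUFFER_LENGTH
    )
    historical_state_roots_accum: List[bytes] = field(default_factory=list)


def state_root(state: ToyShardState) -> bytes:
    return toy_root(
        state.slot.to_bytes(8, "little"),
        *state.receipts,
        *state.latest_state_roots,
        *state.historical_state_roots_accum,
    )


def start_slot(state: ToyShardState) -> None:
    # Record the pre-transition root in the ring buffer
    state.latest_state_roots[state.slot % ROOTS_BUFFER_LENGTH] = state_root(state)
    state.slot += 1
    # Each time the buffer wraps around, fold it into the long-term accumulator
    if state.slot % ROOTS_BUFFER_LENGTH == 0:
        state.historical_state_roots_accum.append(toy_root(*state.latest_state_roots))
    # Receipts are kept for a single slot
    state.receipts = []
```

After eight calls to start_slot on a fresh state, the accumulator holds two entries, one per full pass over the buffer.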

Implementing in-shard ETH transfers

On top of the above base, it’s possible to implement an entire fully fledged smart-contract-capable state execution framework through higher layers of software abstraction. To start off, here is how one might set up the simplest possible framework, one that simply allows users to deposit an ETH balance to a shard, move the ETH around, and then later withdraw it. In the beginning, we will make the simplifying assumption that state objects last forever, so pokes cannot happen; later we will relax this assumption.

We first define our own SSZ classes:

EthAccount:

{
    "pubkey": BLSPubkey,
    "nonce": uint64,
    "value": uint64
}

FormattedReceiptData:

{
    "shard_id": uint64,
    "pubkey": BLSPubkey
}

We can now define our functions (these would be in the WASM code of the execution script that we create). First we define the function depositToShard, which “consumes” a withdrawal receipt and publishes the ETH into an account on the desired shard. It is intended to be called by a transaction in the desired shard, with the transaction’s data containing the encoded function call and the executor being the ID of this execution script. Here is the code:

def depositToShard(state: BeaconState,
                   receipt: WithdrawalReceipt,
                   proof: WithdrawalReceiptRootProof):
    # Verify Merkle proof of the withdrawal receipt
    assert verify_withdrawal_receipt_root_proof(
        get_recent_beacon_state_root(proof.root_slot),
        receipt,
        proof
    )
    # Interpret receipt data as an object in our own format
    receipt_data = deserialize(receipt.withdrawal.data, FormattedReceiptData)
    # Check that this function is being executed on the right shard
    assert receipt_data.shard_id == getShard()
    # Check that the account does not exist yet
    assert getStorageValue(hash(receipt_data.pubkey)) == b''
    # Set its storage
    setStorage(hash(receipt_data.pubkey), serialize(EthAccount(
        pubkey=receipt_data.pubkey,
        nonce=0,
        value=receipt.amount
    )))

Next, transfer, for transferring ETH between accounts; the logic is hopefully self-explanatory:

def transfer(sender: bytes32,
             nonce: uint64,
             target: bytes32,
             amount: uint64,
             signature: BLSSignature):
    # Reject self-transfers, which would otherwise credit `amount` without debiting it
    assert sender != target
    sender_account = deserialize(getStorageValue(sender), EthAccount)
    target_account = deserialize(getStorageValue(target), EthAccount)
    assert nonce == sender_account.nonce
    assert sender_account.value >= amount
    assert bls_verify(
        pubkey=sender_account.pubkey,
        message_hash=hash(nonce, target, amount),
        signature=signature
    )
    setStorage(sender, serialize(EthAccount(
        pubkey=sender_account.pubkey,
        nonce=sender_account.nonce + 1,
        value=sender_account.value - amount
    )))
    setStorage(target, serialize(EthAccount(
        pubkey=target_account.pubkey,
        nonce=target_account.nonce,
        value=target_account.value + amount
    )))
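
The bookkeeping in transfer can be exercised with a toy version that replaces shard storage with a dictionary and the BLS check with a caller-supplied predicate. This is a sketch under assumptions: ToyAccount, toy_transfer and the signature_ok parameter are hypothetical names, and no real signature verification takes place.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ToyAccount:
    pubkey: bytes
    nonce: int
    value: int


def toy_transfer(storage: Dict[bytes, ToyAccount],
                 sender: bytes,
                 nonce: int,
                 target: bytes,
                 amount: int,
                 signature_ok: Callable[[ToyAccount], bool]) -> None:
    # Both accounts are read before either is written, so a self-transfer
    # would credit the amount without debiting it; reject it outright.
    assert sender != target
    sender_account = storage[sender]
    target_account = storage[target]
    # The nonce must match exactly, which is what prevents replays
    assert nonce == sender_account.nonce
    assert sender_account.value >= amount
    # Stand-in for bls_verify over (nonce, target, amount)
    assert signature_ok(sender_account)
    storage[sender] = ToyAccount(
        sender_account.pubkey, sender_account.nonce + 1, sender_account.value - amount
    )
    storage[target] = ToyAccount(
        target_account.pubkey, target_account.nonce, target_account.value + amount
    )
```

Submitting the same (sender, nonce) pair a second time fails the nonce assertion, because the first call already advanced the sender's nonce.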

Now sendToValidatorDeposit, for sending the ETH in an account back into a validator slot:

def sendToValidatorDeposit(account: bytes32,
                           nonce: uint64,
                           signature: BLSSignature,
                           deposit_data: DepositData):
    # Verify that the provided deposit data is valid
    assert verify_deposit_data(deposit_data)
    # Get the account data
    account_data = deserialize(getStorageValue(account), EthAccount)
    # Verify the nonce (replay protection) and balance sufficiency
    assert nonce == account_data.nonce
    assert account_data.value >= deposit_data.amount
    # Verify the signature
    assert bls_verify(
        message_hash=hash_tree_root({"nonce": nonce, "deposit_data": deposit_data}),
        pubkey=account_data.pubkey,
        signature=signature
    )
    # Save the reduced balance
    setStorage(account, serialize(EthAccount(
        pubkey=account_data.pubkey,
        nonce=account_data.nonce + 1,
        value=account_data.value - deposit_data.amount
    )))
    # Save a receipt; target 2**256 - 1 marks it as a new-validator request
    saveReceipt(2**256 - 1, serialize(deposit_data))

Dealing with Expiry

Now, what happens if state objects do not last forever? Then the replay protection given above does not work, because after an account’s balance is spent one can simply wait until it disappears, and then run depositToShard again to recover the originally deposited funds. We get around this as follows.

We first create a helper function check_and_set_bitfield_bit:

def check_and_set_bitfield_bit(bitfield_id: uint64, bit: uint64):
    # Start position of this specific bitfield in the state
    bitfield_start_position = (2**64 // BITFIELD_ENTRY_BIT_LENGTH) * bitfield_id
    # Index of the specific chunk in the bitfield
    bitfield_chunk_index = bit // BITFIELD_ENTRY_BIT_LENGTH
    # Maximum chunk that has already been set
    max_set_chunk_index = bytes8_to_int(getStorageValue(2**128))
    # Generate any not-yet-generated bitfields
    while bitfield_chunk_index > max_set_chunk_index:
        setStorage(
            bitfield_start_position + max_set_chunk_index + 1,
            b'\x00' * (BITFIELD_ENTRY_BIT_LENGTH // 8)
        )
        max_set_chunk_index += 1
    setStorage(2**128, int_to_bytes8(max_set_chunk_index))
    # Verify that the chunk exists and the bit not yet filled, fill the bit
    chunk = getStorageValue(bitfield_start_position + bitfield_chunk_index)
    assert len(chunk) == BITFIELD_ENTRY_BIT_LENGTH // 8
    assert get_bitfield_bit(chunk, bit % BITFIELD_ENTRY_BIT_LENGTH) == 0
    set_bitfield_bit(chunk, bit % BITFIELD_ENTRY_BIT_LENGTH, 1)
    setStorage(bitfield_start_position + bitfield_chunk_index, chunk)

The function’s operation depends crucially on item 2**128 of the storage never expiring; this ordinarily should not happen because every call to check_and_set_bitfield_bit extends it, but to be safe, when the system is initialized it is a good idea to call this function a few thousand times to push the TTL very far into the future.
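
The key property of this scheme is that a chunk which has expired must be distinguishable from one that was never created, so that replay protection cannot be reset by waiting for expiry. A simplified, runnable variant illustrates this. It is a sketch under assumptions: it handles a single bitfield, stores a count of created chunks (rather than the maximum index) at key 2**128, and uses a plain dictionary as storage; check_and_set_bit, CHUNK_BITS and COUNTER_KEY are hypothetical names.

```python
from typing import Dict

CHUNK_BITS = 256        # assumed chunk size (32 bytes)
COUNTER_KEY = 2**128    # storage slot holding the number of chunks created


def check_and_set_bit(storage: Dict[int, bytes], bit: int) -> None:
    """Mark `bit` as used; raise AssertionError if it was already used."""
    chunk_index = bit // CHUNK_BITS
    # Number of chunks created so far (0 if the counter was never written)
    created = int.from_bytes(storage.get(COUNTER_KEY, b""), "little")
    # Create any chunks up to and including the one we need
    while created <= chunk_index:
        storage[created] = bytes(CHUNK_BITS // 8)
        created += 1
    storage[COUNTER_KEY] = created.to_bytes(8, "little")
    # A chunk that was created earlier but has since expired reads as empty
    # and fails here, instead of being silently recreated as all zeros.
    chunk = bytearray(storage.get(chunk_index, b""))
    assert len(chunk) == CHUNK_BITS // 8, "chunk missing; revive it first"
    offset = bit % CHUNK_BITS
    byte_pos, mask = offset // 8, 1 << (offset % 8)
    assert chunk[byte_pos] & mask == 0, "bit already set (replay)"
    chunk[byte_pos] |= mask
    storage[chunk_index] = bytes(chunk)
```

Deleting a chunk from the dictionary simulates expiry: later calls touching that chunk fail until it is restored, which is the behavior the counter exists to guarantee.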

We now also create a function revive, which uses a bitfield to prevent double-spending:

def revive(address: bytes32, proof: ReceiptRootProof, receipt: Receipt):
    # Verify that the receipt is correct
    assert verify_receipt_root_proof(get_recent_shard_state_roots(), receipt, proof)
    # Verify that the receipt was made by our executor
    assert receipt.executor == executor
    # Mark the receipt as consumed in bitfield 0 so it cannot be replayed
    check_and_set_bitfield_bit(0, receipt.receipt_index)
    assert getStorageValue(receipt.target) == b''
    setStorage(receipt.target, receipt.data)

So now if a state object expires due to its TTL running out, we can revive it. Note that if the bitfield needed to revive the account itself expires, that can also be revived, and so on recursively. Now, we can modify our above “minimal OS”:

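def verifyReceipt(state: BeaconState,
                  receipt: WithdrawalReceipt,
                  proof: ReceiptRootProof):
    assert verify_receipt_root_proof(get_recent_beacon_state_roots(), receipt, proof)
    # Interpret receipt data as an object in our own format
    receipt_data = deserialize(receipt.withdrawal.data, FormattedReceiptData)
    assert receipt_data.shard_id == getShard()
    # Use bitfield 1 to ensure each withdrawal receipt is consumed only once
    check_and_set_bitfield_bit(1, receipt.receipt_index)
    setStorage(hash(receipt_data.pubkey), serialize(EthAccount(
        pubkey=receipt_data.pubkey,
        nonce=0,
        value=receipt.amount
    )))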

We can now safely replace setStorage(account, b'dead') with setStorage(account, b''). But now we need to deal with a special edge case: what happens if an account expires, then receives ETH, then gets revived? There are two challenges here: (i) merging balances, (ii) replay protection. We make two changes. First, in the transfer function, we replace the code:

    setStorage(target, serialize(EthAccount(
        pubkey=target_account.pubkey,
        nonce=target_account.nonce,
        value=target_account.value + amount
    )))

With the following code:

    if len(getStorageValue(target)) == 0:
        new_nonce = state.slot * 100000000
    else:
        new_nonce = target_account.nonce
    setStorage(target, serialize(EthAccount(
        pubkey=target_account.pubkey,
        nonce=new_nonce,
        value=target_account.value + amount
    )))

This ensures that if an account is created anew, it’s created with a new nonce space that has not been used before. Second, we modify revive as follows:

def revive(address: bytes32, proof: ReceiptRootProof, receipt: Receipt):
    assert verify_receipt_root_proof(get_recent_shard_state_roots(), receipt, proof)
    assert receipt.executor == executor
    check_and_set_bitfield_bit(0, receipt.receipt_index)
    # If account is empty, revive from receipt
    if getStorageValue(receipt.target) == b'':
        setStorage(receipt.target, receipt.data)
    # Otherwise, verify that it's in the space of ETH accounts
    # and not bitfield entries, and combine them
    else:
        assert receipt.target > 2**128
        current_account = deserialize(getStorageValue(receipt.target), EthAccount)
        receipt_account = deserialize(receipt.data, EthAccount)
        setStorage(receipt.target, serialize(EthAccount(
            pubkey=receipt_account.pubkey,
            nonce=current_account.nonce, 
            value=current_account.value + receipt_account.value
        )))

We also add a restriction that the target of a transfer must be > 2 ** 128. And we’re done.
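
The interaction between the fresh nonce space and the merging revive can be traced with a toy model. It is a sketch under assumptions: a dictionary stands in for shard storage, a set of consumed receipt indices stands in for the bitfield, and an account recreated by an incoming transfer is stored with an empty pubkey until revive restores it; ToyAccount, credit and NONCE_STRIDE are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Set

NONCE_STRIDE = 100_000_000  # multiplier used for fresh nonce spaces in the text


@dataclass(frozen=True)
class ToyAccount:
    pubkey: bytes
    nonce: int
    value: int


def credit(storage: Dict[str, ToyAccount], target: str, amount: int, slot: int) -> None:
    """Receiving side of a transfer: recreate an expired account if needed."""
    existing = storage.get(target)
    if existing is None:
        # Fresh nonce space; pubkey unknown until the account is revived
        storage[target] = ToyAccount(b"", slot * NONCE_STRIDE, amount)
    else:
        storage[target] = ToyAccount(existing.pubkey, existing.nonce, existing.value + amount)


def revive(storage: Dict[str, ToyAccount], used: Set[int],
           receipt_index: int, target: str, hibernated: ToyAccount) -> None:
    """Restore a hibernated account, merging with any balance received since."""
    assert receipt_index not in used  # stand-in for check_and_set_bitfield_bit
    used.add(receipt_index)
    current = storage.get(target)
    if current is None:
        storage[target] = hibernated
    else:
        # Keep the newer nonce, restore the pubkey, and add the balances
        storage[target] = ToyAccount(
            hibernated.pubkey, current.nonce, current.value + hibernated.value
        )
```

Because the merged account keeps the nonce from the fresh nonce space, signatures made against the old, pre-expiry nonces cannot be replayed after revival.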

From ETH transfers to complete state execution

To create a complete framework, we would need to add the following components on top of this (ie. this is all more executor code, no changes required to the above consensus layer):

Generic fee payment

To allow validators to be able to collect transaction fees without every client needing to implement every layer-2 scheme, we can create a generic abstraction layer as follows. We create a specialized layer-2 scheme where anyone can publish a message of the form “If you create a block in shard X at slot Y, where the previous state root is Z, then I will give you N gwei”, and processing this kind of conditional transaction is the only operation. Then for each user-side layer-2 there can be a separate class of users that gather transactions and publish packages that bid using this system.
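
The conditional message described above can be modeled as a small data structure plus a matching rule. This is only an illustration of the idea: ConditionalBid, BlockInfo and payout are hypothetical names, and signatures, balances and settlement are omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalBid:
    """'If you create a block in shard X at slot Y on top of state root Z, I pay N gwei.'"""
    shard: int
    slot: int
    prev_state_root: bytes
    amount_gwei: int


@dataclass(frozen=True)
class BlockInfo:
    shard: int
    slot: int
    prev_state_root: bytes


def payout(bid: ConditionalBid, block: BlockInfo) -> int:
    """Return the gwei owed to the proposer of `block` under `bid` (0 if no match)."""
    matches = (
        bid.shard == block.shard
        and bid.slot == block.slot
        and bid.prev_state_root == block.prev_state_root
    )
    return bid.amount_gwei if matches else 0
```

A proposer only needs to understand this one message type to be paid, regardless of which layer-2 scheme the bundled transactions belong to.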

Note that this market can even be implemented (and be useful) during phase 1, though the code would sit on the PoW chain rather than a layer-2 scheme.

State size and waking data complexity estimates

Suppose 10 tx/sec, where each tx touches on average 2 not-recently-touched state objects with 1kB of new storage+code. Suppose that STATE_OBJECT_BASE_TTL = 1 week (600k seconds). With 20 new state objects per second, that’s on average 12 million state objects in the state, with 1kB * 10 * 600k = 6 GB of state [note: this seems bigger than it would be in reality; check what more realistic numbers are?]

Waking an old object requires recovering its hibernation receipt but also the hibernation receipt of the bitfield entry that contains its most recent hibernation, and so on recursively. If an object disappears at time t, then the earliest time at which its bitfield entry disappears is t + 600k seconds. Hence, waking an old object that has been hibernated for N seconds would require roughly N/T Merkle branches, where T is STATE_OBJECT_BASE_TTL (eg. if N = 10 years and T = 600k seconds, that’s ~526 Merkle branches), which in more extreme cases would need to be split across multiple transactions. Note that this is an extreme worst case; in the average case, there would be plenty of clients accessing old bitfield entries, so many of them would already be in the active state.
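
The estimates above can be reproduced with a few lines of arithmetic. The variable names are illustrative, "1kB" is taken as 1000 bytes, and a year is taken as 365 days; the inputs are the figures stated in the text.

```python
TX_PER_SEC = 10
OBJECTS_TOUCHED_PER_TX = 2
NEW_BYTES_PER_TX = 1_000           # "1kB" of new storage+code per transaction
TTL_SECONDS = 600_000              # STATE_OBJECT_BASE_TTL, roughly one week
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Objects created during one TTL period are the ones still alive in the state
objects_in_state = TX_PER_SEC * OBJECTS_TOUCHED_PER_TX * TTL_SECONDS
state_bytes = TX_PER_SEC * NEW_BYTES_PER_TX * TTL_SECONDS


def merkle_branches_to_wake(hibernated_seconds: float,
                            ttl_seconds: float = TTL_SECONDS) -> float:
    """Worst case: one Merkle branch per TTL period spent in hibernation."""
    return hibernated_seconds / ttl_seconds


print(objects_in_state)                                        # 12000000
print(state_bytes / 1e9)                                       # 6.0
print(round(merkle_branches_to_wake(10 * SECONDS_PER_YEAR)))   # 526
```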

One could make STATE_OBJECT_BASE_TTL proportional to 1/(32 + storage_size), and make the bitfield chunks extremely small (eg. 32 bytes), so in the case of bitfield chunks the minimum expiry time would be longer (eg. if it’s 1 week for 512-byte objects, it would be ~8.5 weeks for 32-byte objects, since (32 + 512) / (32 + 32) = 8.5, cutting the number of Merkle branches needed to resurrect a 10 year old object to ~60).