## SSZ & Typed Transactions
The end goal is for blocks to be SSZ from top to bottom. In this document, I'll use Python class syntax for SSZ Container schemas for simplicity.
```python
class Block:
    parent: bytes32
    transaction_root: bytes32
    receipts_root: bytes32
    uncles_root: bytes32
```
One property that we get from SSZ is the ability to substitute a hash for the actual object. In this case, let's look at the `parent`.
```python
# schema for a block that references its parent with a hash
class Block:
    parent: bytes32
    ...

# schema for a block that includes the full parent
class BlockWithParent:
    parent: Block
    ...

parent = Block(...)
parent_hash = hash_tree_root(parent, sedes=Block)

block = Block(parent=parent_hash, ...)
block_with_parent = BlockWithParent(parent=parent, ...)

block_hash = hash_tree_root(block, sedes=Block)
block_with_parent_hash = hash_tree_root(block_with_parent, sedes=BlockWithParent)

assert block_hash == block_with_parent_hash
```
This simple example is meant to demonstrate this hash substitution principle.
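The substitution can also be checked with a few lines of runnable code. The following is a minimal sketch, not a full SSZ implementation: it assumes every field is either a `bytes32` or a nested container, models a container as a plain list of fields, and uses made-up stand-in values for the roots. The helper names (`merkleize`, `field_root`, `container_root`) are chosen for illustration only.

```python
import hashlib

ZERO_CHUNK = b"\x00" * 32


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkleize(chunks: list) -> bytes:
    """Merkle root over 32-byte chunks, zero-padded to a power of two."""
    size = 1
    while size < len(chunks):
        size *= 2
    layer = list(chunks) + [ZERO_CHUNK] * (size - len(chunks))
    while len(layer) > 1:
        layer = [sha256(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def field_root(value) -> bytes:
    """A bytes32 field is its own root; a nested container is merkleized."""
    if isinstance(value, bytes):
        assert len(value) == 32
        return value
    return container_root(value)


def container_root(fields: list) -> bytes:
    """Root of a container, modelled here as a list of its field values."""
    return merkleize([field_root(f) for f in fields])


# Hypothetical stand-in values for the four fields of each block.
parent = [
    sha256(b"grandparent"),
    sha256(b"parent transactions"),
    sha256(b"parent receipts"),
    sha256(b"parent uncles"),
]
body = [sha256(b"transactions"), sha256(b"receipts"), sha256(b"uncles")]

parent_hash = container_root(parent)
block_hash = container_root([parent_hash] + body)         # Block
block_with_parent_hash = container_root([parent] + body)  # BlockWithParent

assert block_hash == block_with_parent_hash
```

The two roots agree because the leaf for the `parent` field is the same 32 bytes in both cases: either the hash supplied directly, or the root computed from the nested container.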
Let's look at a more practical way that we can take advantage of this property by exploring the `transaction_root`.
```python
class Block:
    parent: bytes32
    transaction_root: bytes32
    ...
```
The `transaction_root` in this example is the `hash_tree_root(transaction_list, sedes=TransactionList)`, where `transaction_list` is the list of transactions in the block.
The hash substitution property of SSZ means that we can define the following schema which will have the same hash as the block above.
```python
class BlockWithFullTransactionList:
    parent: bytes32
    transaction_list: TransactionList
    ...
```
In the *current* protocol, we have a distinct model that is used for the header chain, and then some ad-hoc models that we use in the DevP2P protocol to ship around block bodies and block receipts. In order to validate the block body or receipts, one must reconstruct the `transaction_trie` or other relevant tries from the received payload to verify the data against the header.
In a world where blocks are SSZ, we have the ability to define a schema like `BlockWithFullTransactionList` which would hash to the same root as the `Block` model. This enables us to have a minimal schema that is used for the header chain, and a more verbose schema that is more suitable for transmitting this data over the network for use cases like syncing the chain. Because the different schemas result in the same hash, the block hash is the merkle root against which we can create a single merkle proof for any subset of the block data. These proofs can even reach into parent blocks!
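To make the "reach into parent blocks" point concrete, here is a hedged sketch of a single merkle branch that proves the parent's `transaction_root` against the child's block hash. It assumes a four-field `Block` whose fields are all `bytes32` values, uses made-up stand-in roots, and the helper names (`build_layers`, `make_branch`, `verify_branch`) are illustrative rather than taken from any library.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_layers(leaves: list) -> list:
    """All layers of a merkle tree; len(leaves) must be a power of two."""
    layers = [list(leaves)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([sha256(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return layers


def make_branch(layers: list, index: int) -> list:
    """Sibling hashes from the leaf at `index` up to (but excluding) the root."""
    branch = []
    for layer in layers[:-1]:
        branch.append(layer[index ^ 1])
        index //= 2
    return branch


def verify_branch(leaf: bytes, branch: list, index: int, root: bytes) -> bool:
    """Recompute the root from a leaf and its sibling path."""
    node = leaf
    for sibling in branch:
        node = sha256(sibling + node) if index % 2 else sha256(node + sibling)
        index //= 2
    return node == root


# Hypothetical stand-in roots: [parent, transaction_root, receipts_root, uncles_root]
parent_leaves = [
    sha256(b"grandparent"),
    sha256(b"parent transactions"),
    sha256(b"parent receipts"),
    sha256(b"parent uncles"),
]
parent_layers = build_layers(parent_leaves)
parent_hash = parent_layers[-1][0]

child_leaves = [parent_hash, sha256(b"transactions"), sha256(b"receipts"), sha256(b"uncles")]
child_layers = build_layers(child_leaves)
block_hash = child_layers[-1][0]

# Prove the parent's transaction_root (field 1 of the parent) under the child's hash.
parent_txs_root = parent_leaves[1]
proof = make_branch(parent_layers, 1) + make_branch(child_layers, 0)
# Leaf position in the combined tree: child field 0, then parent field 1.
combined_index = 0 * len(parent_leaves) + 1

assert verify_branch(parent_txs_root, proof, combined_index, block_hash)
```

The proof is just the parent-level branch followed by the child-level branch, which works only because the `parent` leaf in the child block is the parent's own hash tree root.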
Now, let's look at how best to define `TransactionList`.
```python
class TypedTransaction:
    type: uint8
    txn_root: bytes32

TransactionList = List[TypedTransaction, max_length=...]
```
This simple top-level model for the `TransactionList` assumes that the `Transaction.hash` is an SSZ hash. Using the hash substitution property, that means that we can also define a richer schema that will result in an equivalent hash.
```python
class Transaction:
    to: bytes20
    data: List[uint8, max_length=...]
    ...

class FullTypedTransaction:
    type: uint8
    transaction: Transaction

FullTransactionList = List[FullTypedTransaction, max_length=...]
```
This allows us to have both a minimal representation containing only the transaction hashes using `TransactionList` and a verbose representation that contains the actual transaction details using `FullTransactionList`.
In order to take advantage of this, we **must** use a transaction hash that represents an SSZ hash tree root.
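The equivalence of the two list schemas can be sketched as follows. This is an illustration under stated assumptions, not the real schema: `Transaction` is reduced to just the `to` and `data` fields, and the limits `MAX_DATA_BYTES` and `MAX_TRANSACTIONS` are hypothetical values picked to keep the trees small (the source leaves `max_length` unspecified). The merkleization follows the usual SSZ rules of zero-padding to the limit and mixing in the list length.

```python
import hashlib
from dataclasses import dataclass

ZERO_CHUNK = b"\x00" * 32
MAX_DATA_BYTES = 64    # hypothetical limit for Transaction.data
MAX_TRANSACTIONS = 4   # hypothetical limit for the transaction lists


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkleize(chunks: list, limit: int) -> bytes:
    """Merkle root of chunks, zero-padded to `limit` leaves (rounded up to a power of two)."""
    assert len(chunks) <= limit
    size = 1
    while size < limit:
        size *= 2
    layer = list(chunks) + [ZERO_CHUNK] * (size - len(chunks))
    while len(layer) > 1:
        layer = [sha256(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def mix_in_length(root: bytes, length: int) -> bytes:
    return sha256(root + length.to_bytes(32, "little"))


def byte_list_root(data: bytes, max_length: int) -> bytes:
    """Root of a List[uint8, max_length]: pack bytes into chunks, then mix in the length."""
    padded = data.ljust(-(-len(data) // 32) * 32, b"\x00")
    chunks = [padded[i:i + 32] for i in range(0, len(padded), 32)]
    return mix_in_length(merkleize(chunks, (max_length + 31) // 32), len(data))


def list_root(element_roots: list, max_length: int) -> bytes:
    """Root of a list of containers: merkleize the element roots, then mix in the length."""
    return mix_in_length(merkleize(element_roots, max_length), len(element_roots))


@dataclass
class Transaction:  # reduced to two fields for this sketch
    to: bytes       # bytes20
    data: bytes

    def root(self) -> bytes:
        to_chunk = self.to.ljust(32, b"\x00")
        return merkleize([to_chunk, byte_list_root(self.data, MAX_DATA_BYTES)], 2)


@dataclass
class TypedTransaction:  # minimal form: carries only the transaction root
    type: int
    txn_root: bytes

    def root(self) -> bytes:
        return merkleize([self.type.to_bytes(32, "little"), self.txn_root], 2)


@dataclass
class FullTypedTransaction:  # verbose form: carries the full transaction
    type: int
    transaction: Transaction

    def root(self) -> bytes:
        return merkleize([self.type.to_bytes(32, "little"), self.transaction.root()], 2)


full_list = [
    FullTypedTransaction(0, Transaction(b"\x11" * 20, b"hello")),
    FullTypedTransaction(1, Transaction(b"\x22" * 20, b"")),
]
minimal_list = [TypedTransaction(t.type, t.transaction.root()) for t in full_list]

full_root = list_root([t.root() for t in full_list], MAX_TRANSACTIONS)
minimal_root = list_root([t.root() for t in minimal_list], MAX_TRANSACTIONS)

assert full_root == minimal_root
```

A node that only holds `minimal_list` can therefore verify it against the same `transaction_root` as a node that holds `full_list`.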
Now, we need to extend this to support multiple transaction types.
The ideal solution is to use a `Union` SSZ type. This ensures that we can have merkle proofs against the block root that extend down into the transaction objects themselves.
If use of a `Union` type is deemed not to be a viable way forward, we *could* define the transaction as an opaque binary payload.
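As a rough illustration of why a `Union` keeps individual fields provable, the sketch below assumes the union root is formed the way the SSZ specification describes it, by hashing the root of the selected value together with the selector. The two-field transaction and the stand-in `data_root` value are hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def mix_in_selector(value_root: bytes, selector: int) -> bytes:
    """Union root: hash of the selected value's root and the selector as a 32-byte chunk."""
    return sha256(value_root + selector.to_bytes(32, "little"))


# Hypothetical two-field transaction variant (selector 0): fields `to` and `data`.
to_chunk = (b"\x11" * 20).ljust(32, b"\x00")
data_root = sha256(b"stand-in for the root of the data list")
value_root = sha256(to_chunk + data_root)

selector = 0
union_root = mix_in_selector(value_root, selector)

# A proof for the `to` field alone: its sibling inside the transaction,
# then the selector chunk at the union level.
proof = [data_root, selector.to_bytes(32, "little")]
node = to_chunk
for sibling in proof:
    node = sha256(node + sibling)  # `to` is the left child at both levels

assert node == union_root
```

Because the selected value remains a merkleized container beneath the union root, a proof can single out one field such as `to` without revealing the rest of the transaction.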
```python
class FullTypedTransaction:
    type: uint8
    payload: List[uint8, max_length=...]
```
In this model, getting at the actual rich transaction object requires parsing the `(type, payload)` pair and then using the `type` value to determine how `payload` should be deserialized. Under this approach, it would be more difficult to make a proof against the details of an individual transaction, since a merkle proof could only assert things about the binary payload and not about the individual fields.
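The parsing step for the opaque-payload model might look like the sketch below. Everything specific here is an assumption made for illustration: the two transaction types, their byte layouts, and the names `LegacyTransaction`, `ChainBoundTransaction`, and `DECODERS` are hypothetical and do not come from any specification.

```python
from dataclasses import dataclass


@dataclass
class LegacyTransaction:  # hypothetical type 0: 20-byte `to`, then `data`
    to: bytes
    data: bytes


@dataclass
class ChainBoundTransaction:  # hypothetical type 1: 8-byte chain id, 20-byte `to`, then `data`
    chain_id: int
    to: bytes
    data: bytes


def decode_legacy(payload: bytes) -> LegacyTransaction:
    if len(payload) < 20:
        raise ValueError("payload too short for a type 0 transaction")
    return LegacyTransaction(to=payload[:20], data=payload[20:])


def decode_chain_bound(payload: bytes) -> ChainBoundTransaction:
    if len(payload) < 28:
        raise ValueError("payload too short for a type 1 transaction")
    return ChainBoundTransaction(
        chain_id=int.from_bytes(payload[:8], "little"),
        to=payload[8:28],
        data=payload[28:],
    )


# The `type` value selects how the opaque payload is deserialized.
DECODERS = {0: decode_legacy, 1: decode_chain_bound}


def decode_typed_transaction(tx_type: int, payload: bytes):
    try:
        decoder = DECODERS[tx_type]
    except KeyError:
        raise ValueError(f"unknown transaction type {tx_type}") from None
    return decoder(payload)


payload = (5).to_bytes(8, "little") + b"\x22" * 20 + b"hello"
tx = decode_typed_transaction(1, payload)
```

Note that the decoded fields exist only after parsing. The merkle tree commits to the raw chunks of `payload`, so a proof can show that certain bytes are present but cannot point at `to` or `chain_id` as tree nodes.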