# State and execution minimisation

### Background

* "Minimisation" means not using first-class infrastructure too much
* Three basic types of first-class infrastructure
    - Data availability (bandwidth)
    - Execution (computation)
    - State (storage and I/O)
* Nomenclature
    - "Enshrined", "first-class" or "default" infrastructure
    - "Alternative" or "custom" infrastructure
    - Extreme minimisation: Plasma (alternative availability, execution and state)
* Evolution of consensus abstraction
    - Bitcoin (think "ASIC")
        * Enshrined-in-consensus dApp
        * Account abstraction
    - Ethereum 1.0 (think "CPU")
        * Enshrined-in-consensus dApp engine
        * dApp abstraction
    - Ethereum 2.0 (think "FPGA")
        * Enshrined-in-consensus data availability
        * dApp engine abstraction
* Example: SNARK verification (e.g. Zcash on Ethereum)
    - General-purpose EVM too slow: ~0.33 SNARKs/sec
        * Naive execution
        * 2M gas per SNARK, 10M gas per 15-second block
    - Specialised SNARK verification engine: ~100 SNARKs/sec
        * "Optimistic non-execution" based on re-execution challenges
        * 127 bytes per SNARK, 1MB per 75-second collation
* Context
    - Sharding research led to the study of alternative execution engines
    - Main goal: fast shuffling of validators
        * Two orthogonal solutions
            - Stateless clients
                * Tool: cryptography
                * Pros: no I/O and storage (for validators)
                * Cons: bandwidth overhead, costs pushed to users
            - Delayed execution
                * Tool: cryptoeconomics
                * Pros: no I/O, storage or computation (for collators)
                * Cons: costs pushed to executors
    - New sharding consensus
        * Consensus game limited to data availability (no notion of collation validity)
        * dApp developers can define custom scalable execution games
            - More performance and lower costs than the sharded EVM
            - Custom features not available in the sharded EVM
* Model for resource usage
    - Cube (8 vertices) of alternative execution
        * Three resource axes
            - Data availability (bandwidth)
            - Execution (computation)
            - State (storage and I/O)
    - Scalability hybrids
        * Plasma and full EVM execution at diagonal extremes of the cube
        * Hybrid vertices on the cube
* Thesis:
    - Applications can make infrastructure-level tradeoffs
    - The extremes (Plasma and full EVM execution) are not the only options
    - Possible sweet spot by leveraging:
        * Fungible data availability provided by consensus
        * Custom computation and storage
* Sharded EVM tradeoffs:
    1) "Doing too much": paying for a generic execution engine that you don't need
        - Cost for users
            * Gas costs, storage rent
            * Having ETH in the first place
        - Cost for full nodes
            * Have to download full collation bodies
            * Have to know the full state (in particular, for gas payments)
        - Possibly undesirable compromises
            * Latency of producing state roots for light clients
            * Expensive general-purpose cross-shard transactions
                - Recalculation costs and finality costs across all shards
            * Parallelism limited to 8 cores
            * Subsidised functions (e.g. SHA3 opcode, precompiles)
    2) "Doing too little": being limited by what the sharded EVM can do
        - Gas limit, storage limit
        - Missing features today
            * Cross-shard transactions
            * Load balancing
            * Access lists (e.g. GPU parallelism)
            * Zones
* Plasma tradeoffs: cost of "exit" because of failures in delegated data availability
    - Design complexity
    - Monitoring, execution and storage costs pushed to users
    - Security tradeoffs around mass exits
    - Dependent on operators for clearing QoS (censorship resistance, bandwidth, latency, fees)

### Stateless clients (state minimisation)

* Stateless clients: cheaper executing nodes
    - Fast sync, fast shuffling
    - No I/O bottleneck
    - Purely cryptographic solution
* Tradeoffs
    - Bandwidth overhead
    - Costs pushed to users
        * Storage of data
        * Witness maintenance
            - Monitoring and updating costs
* Addressing bandwidth overhead
    - Batching
    - Selective offloading
        * Infrequently accessed state is best offloaded
        * Compare to offloading everything (no caching)
    - Choice of accumulator
        * "Accumulator" means a succinct cryptographic object keeping track of a set of things
        * Patricia tree: O(log n)-sized witnesses
        * RSA accumulator: O(1)-sized witnesses
            - "Fancy crypto" not suitable for the protocol layer
                * Trapdoor
                * Not quantum secure
        * SNARK accumulator: O(1)-sized witnesses
            - Compress 64 SHA256 hashes (2kB witness) into 127 bytes (5 seconds prover time, scales linearly)
            - Predicted use case for SNARKs: compression of Merkle witnesses
* Addressing witness maintenance costs pushed to users
    - Selective offloading
        * Only offload storage "owned" by a single party (similar to Plasma Cash)
        * Break ownership coupling
        * Canonical example: UTXO model (one owner per UTXO)
    - Dynamic accumulators (support deletions in addition to additions)
        * With Patricia tries every update affects all witnesses
            - Partially addressed by witness auto-update
        * Inherent monitoring cost of dynamic accumulators
            - "On the impossibility of batchable accumulators": have to update witnesses for deletions
    - Rule of thumb: if cryptography doesn't work, try cryptoeconomics
    - Break witness coupling
        * Cryptoeconomic dynamic accumulators
            - Append-only (non-dynamic) "blob" accumulator at the base layer
            - Economic layer to handle dynamism
        * Example:
            - Batched blob accumulator (think: compressed header chain)
                * Small linear cost in storage (e.g. 32 bytes per 1,000 collations)
                * Single witness extension
                * O(1) witnesses
                    - 20 * 32 bytes to the chunk root, plus 10 * 32 bytes Merkleised buffer extension
            - Fraud-proof dynamism
                * Queue of collateralised "ADD" and "DEL" operations (e.g. "MINT" and "SPEND" for UTXOs)
                * DEL operations are coupled with a previous ADD operation
                * Slashing if two DELs for a single ADD (think: double spend)

### Alternative execution engines (execution minimisation)

* The fraud proof of the cryptoeconomic dynamic accumulator is an example of "alternative execution"
    - No a priori consensus on the validity of DEL operations
* Primitive enforcement mechanisms:
    - Re-execution challenges (subjective or objective)
    - GHOST-style voting
    - Execution proofs (e.g. SNARKs, STARKs)
    - Non-interactive fraud proofs
    - Interactive verification (TrueBit)
    - Committee voting
* Composing enforcement mechanisms: Zcash on Ethereum
    - Naive EVM execution too slow: one verification per 3 seconds
    - Zcash transactions fed to a three-layer execution engine (100 verifications per second):
        * Top-level optimistic SNARK engine
        * Objective re-execution "middleware" engine
        * Bottom-level sharded EVM
* Specialisation
    - Applications make infrastructure-level tradeoffs
    - Think "FPGA": custom execution engines
* Application-specific zones
    - Goal: application-specific full node
        * Only store the application's state (as opposed to the full shard state)
        * Only download blobs relevant to the application (as opposed to full collation bodies)
    - Detach the application from ETH-based gas
        1) Become a proposer and include ETH-free transactions yourself
        2) Application-specific gas
            - ERC20 token with an internal fee mechanism denominated in the token
            - Existing proposers can opt in to this new gas mechanism
                * Application-specific "sub-proposers" bid "sub-proposals" to proposers
                    - Filtering of application-specific blobs
* Custom execution engine "init code"
    - The equivalent of abstraction for alternative execution engines
    - Enabled by the EVM's notion of "local gas" proportional to blob size
    - Init code filters blobs and issues logs
* Multi-shard execution engines (for high-bandwidth applications)
    - The main chain provides a "global clock"
    - Can do unambiguously ordered "concatenation"/"braiding" of collation bodies
    - Synchronous transactions over multiple shards, no receipts/witnesses
* Reasons why enshrined availability can be cheaper than enshrined execution and state
    - Parallelism versus sequentiality
        * Availability (bandwidth) is a mostly parallel resource
        * EVM execution (compute and I/O) in every shard is limited by Amdahl's law
    - Bandwidth vs computation improvements
        * Nielsen's law for bandwidth shows no signs of slowdown
        * Moore's law slowdown
    - Protocol constants can favour availability (vs execution and storage)
        * Collation size versus gas limit
        * Gas cost of transaction data versus gas cost of execution
        * Total local gas versus global gas limit
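The throughput figures in the SNARK-verification example follow from simple arithmetic on the numbers quoted in the Background section. A quick sanity check (assuming "1MB" means 10^6 bytes; with 2^20 bytes the engine figure comes out slightly higher, ~110/sec):

```python
# Naive EVM execution: 2M gas per SNARK verification, 10M gas per 15-second block.
gas_per_snark = 2_000_000
gas_per_block = 10_000_000
block_time = 15  # seconds

snarks_per_sec_evm = (gas_per_block / gas_per_snark) / block_time
assert abs(snarks_per_sec_evm - 1 / 3) < 1e-9  # ~0.33 SNARKs/sec

# Specialised engine: 127 bytes per SNARK, a 1MB collation every 75 seconds.
bytes_per_snark = 127
collation_size = 1_000_000  # assumed: 1MB = 10^6 bytes
collation_time = 75  # seconds

snarks_per_sec_engine = (collation_size / bytes_per_snark) / collation_time
assert snarks_per_sec_engine > 100  # ~105 SNARKs/sec
```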
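The resource-usage cube from the Background section can be enumerated directly: each of the three resources is either enshrined or custom, giving 8 vertices, with Plasma and full EVM execution at diagonally opposite corners. A minimal sketch (the vertex labels are illustrative, not protocol terminology):

```python
from itertools import product

RESOURCES = ("data availability", "execution", "state")

# Each vertex of the cube: a choice of enshrined vs custom per resource axis.
vertices = list(product(("enshrined", "custom"), repeat=len(RESOURCES)))
assert len(vertices) == 8

full_evm = ("enshrined",) * 3  # everything first-class
plasma = ("custom",) * 3       # alternative availability, execution and state

# The proposed sweet spot: enshrined availability, custom execution and state.
sweet_spot = ("enshrined", "custom", "custom")
assert sweet_spot in vertices

# Plasma and full EVM differ in every coordinate: diagonal extremes of the cube.
assert all(a != b for a, b in zip(full_evm, plasma))
```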
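The witness idea behind stateless clients can be illustrated with a binary Merkle tree (a simplification, not the hexary Patricia trie actually used): the client keeps only the root, and each transaction carries an O(log n)-sized witness proving its state against that root. A minimal sketch:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Root of a binary Merkle tree (len(leaves) assumed a power of two)."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_witness(leaves, index):
    """Sibling hashes from leaf to root: the O(log n) witness a user attaches."""
    level = [h(leaf) for leaf in leaves]
    witness = []
    while len(level) > 1:
        witness.append(level[index ^ 1])  # sibling at this level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return witness

def verify(root, leaf, index, witness):
    """A stateless client checks a leaf against the single root it stores."""
    node = h(leaf)
    for sibling in witness:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

leaves = [bytes([i]) * 32 for i in range(8)]
root = merkle_root(leaves)
w = merkle_witness(leaves, 5)
assert len(w) == 3  # log2(8) hashes: witness size is O(log n)
assert verify(root, leaves[5], 5, w)
```

This also makes the dynamic-accumulator pain point concrete: changing any leaf changes the root, so every other holder's witness must be refreshed.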
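The fraud-proof dynamism described above can be sketched as a toy queue of collateralised ADD/DEL operations: DELs must reference a prior ADD, and a second DEL against the same ADD is a provable double spend that forfeits the offender's bond. The class and bond values below are illustrative assumptions, not part of any specification:

```python
class OpQueue:
    """Toy cryptoeconomic layer over an append-only blob accumulator.

    ADDs register new items (think MINT); each DEL (think SPEND) must
    reference a prior ADD. Two DELs for one ADD slash the second submitter.
    """

    def __init__(self, bond: int = 32):
        self.bond = bond
        self.adds = {}      # add_id -> submitter
        self.deleted = {}   # add_id -> submitter of the accepted DEL
        self.balances = {}  # submitter -> collateral held

    def submit_add(self, add_id: str, submitter: str):
        # Submitter posts collateral alongside the operation.
        self.balances[submitter] = self.balances.get(submitter, 0) + self.bond
        self.adds[add_id] = submitter

    def submit_del(self, add_id: str, submitter: str):
        assert add_id in self.adds, "DEL must reference a previous ADD"
        if add_id in self.deleted:
            # Fraud proof: two DELs for a single ADD -> slash the offender.
            self.balances[submitter] = max(0, self.balances.get(submitter, 0) - self.bond)
            return "slashed"
        self.deleted[add_id] = submitter
        return "accepted"

q = OpQueue()
q.submit_add("utxo-1", "alice")                       # MINT
assert q.submit_del("utxo-1", "alice") == "accepted"  # SPEND
assert q.submit_del("utxo-1", "bob") == "slashed"     # double spend
```

Note that nothing here executes DELs a priori: validity is enforced after the fact by the slashing condition, which is exactly the "alternative execution" pattern the next section generalises.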