[DRAFT v2] EIP: Virtual Call

# [DRAFT v2] EIP: Virtual Call

Draft by @protolambda. Special thanks to Karl Floersch for help with v1.

## Simple Summary

Defines a new call opcode which enables virtualization within the EVM.

## Abstract

Add `VIRTUALCALL`, which behaves similarly to the other call opcodes like `DELEGATECALL` and `STATICCALL`. It accepts a `virtualizerAddress` and virtual-memory input and output, in addition to the 6 fields that are normally supplied to `STATICCALL`. While the call executes, all **stateful** opcodes are executed as calls to the contract located at `virtualizerAddress`.

## Specification

Create a new opcode `VIRTUALCALL` which is called with:

```
[
  PUSH32 <memoutsize>
  PUSH32 <memoutstart>
  PUSH32 <meminsize>
  PUSH32 <meminstart>
  PUSH20 <virtualizeraddr>
  PUSH32 <memvirtinsize>
  PUSH32 <memvirtinstart>
  PUSH32 <memvirtoutsize>
  PUSH32 <memvirtoutstart>
  PUSH20 <to>
  PUSH32 <gas>
]
```

This initializes a new virtual context with virtualizer memory. The virtual context is similar to the `readOnly` context which is used for `STATICCALL`. This can be represented as:

```
Context: {
    enabled: bool,
    virtualizer: address,
    virtualizerMemory: bytes
}
```

- `virtualizer`: the address of the contract to call for execution of non-pure opcodes
- `virtualizerMemory`: calldata prefix, the "memory" of the virtualizer.
  - Initialized as a slice of memory, using the `<memvirtinsize>` and `<memvirtinstart>` stack arguments
  - Read from calldata in the virtualizer
  - Written to with return-data in the virtualizer
  - Available throughout the execution of a `VIRTUALCALL`
  - Output to the memory area specified with the `<memvirtoutsize>` and `<memvirtoutstart>` stack arguments of the `VIRTUALCALL`. The first byte will contain a `1` for success, and a `0` for failed execution.

### Example

Once enabled, the access pattern of a virtualized `SSTORE` and `SLOAD` looks like:

```sequence
OriginContract->codeContract: VIRTUALCALL(...virtualargs)
codeContract->codeContract: perform stateless opcodes...
codeContract->EVM: SSTORE(key, value)
EVM->virtualizerContract: Call virtualizeraddr\nWith:\nMutable memory and stack.\nVirtualizer memory and opcode as calldata.
virtualizerContract->virtualizerContract: read opcode, modify memory and stack
virtualizerContract->EVM: return updated virtualmemory
EVM->codeContract: continue execution with updated memory and stack
codeContract->codeContract: perform stateless opcodes...
codeContract->EVM: SLOAD(key)
EVM->virtualizerContract: Call virtualizeraddr\nWith:\nMutable memory and stack.\nVirtualizer memory and opcode as calldata.
virtualizerContract->virtualizerContract: read opcode, modify memory and stack
virtualizerContract->EVM: return updated virtualmemory
EVM->codeContract: continue execution with updated memory and stack
codeContract->codeContract: perform stateless opcodes...
codeContract->OriginContract: return RETURNDATA
```

### Virtualizer context

The virtualizer is called with a context that maintains the address metadata of the original `VIRTUALCALL`: `msg.sender` and `tx.origin`. That is, the sandbox does not escape the sandbox privileges. It is up to the virtualizer to track the `msg.sender` and `tx.origin` of the sandbox.

### Calldata

The structure of call data, in relaxed BNF:

```
virtualizerMemory := <bytes>
opcode := <byte>
calldata := <virtualizerMemory><opcode>
```

The `virtualizerMemory` is prefixed, so the `VIRTUALCALL` caller can select a method, and prepare arguments, for the first non-pure opcode occurrence. The `<opcode>` is appended, to specify what the sandbox attempted to run, for the virtualizer to implement or deny.

### Memory, returndata and sandboxed reverts

Once the virtualizer completes the substitute execution of a non-pure opcode from the sandbox, it is expected to:

- Leave the stack in the exact shape the sandbox will continue from
- Assert that the sandbox memory after the free-memory pointer is fully zeroed, so that the virtualizer does not accidentally overwrite any of it.
  - On failure, the virtual call as a whole ends, with the result byte set to failure.
- Leave the first part of memory in the exact shape the sandbox will continue from
- Make the free-memory pointer at `0x40` point to what *would* be the end of the sandbox memory
- Append the (optionally modified) virtualizer memory after the memory that is in use by the sandbox.

TODO: can we clean up the virtualizer return-data situation? (Memory is shared between virtualizer and sandbox; it's possible, but clunky.) This goes deep into the memory model as well, and needs some review and iteration.

### Reverting from within the virtualizer

If the virtualizer considers an intercepted opcode illegal, or the behavior is otherwise invalid, the virtualizer can `REVERT`. This then stops the sandboxed execution of the `VIRTUALCALL`, with a `0` result byte to indicate failure.

### List of non-intercepted opcodes

By default, an opcode (including opcodes in future forks) will be intercepted. A strict set of pure opcodes is never intercepted:

- Program counter (PC)
- Push and pop opcodes
- Dup opcodes
- Bitwise operations
- Arithmetic operations
- Memory operations
- Hash functions
- Jumps and jump-destinations
- Swaps
- CALLCODE
- RETURNDATACOPY and RETURNDATASIZE
- RETURN

### Notes about special opcodes

#### `LOG`

Note that log operations are considered non-pure: although they could be allowed, the logs from the sandbox must not be confused with logs outside of the context. To handle this, the virtualizer can instead catch the log opcode, and log the sandbox event (with optional modifications) outside of the sandbox, attached to the virtualizer address.

#### `ADDRESS` and `ORIGIN`

The purity of the address and origin opcodes is debatable, but they are considered non-pure for the purposes of this EIP. This way the virtualizer can implement its own account-abstraction logic.
It can track the message sender and origin in its virtualizer memory, and change them to whatever is desired, affecting only the sandbox execution.

#### `STOP` and `REVERT`

The virtualizer may not intercept these opcodes; instead, the `VIRTUALCALL` ends without reverting outer execution, and the virtualizer memory is moved into the specified memory output location. The virtual execution is marked as failed, with the `0` byte at the start of the specified memory output location. It is up to the `VIRTUALCALL` caller to implement remaining processing, and to detect the failed execution from the first byte of output.

#### `RETURN`

`RETURN` is not intercepted, but the outermost return in the virtual execution will end the `VIRTUALCALL` with the result byte set to `1`.

#### Calls to pre-compiles

Pre-compiles are intercepted, since they cannot be distinguished from other calls easily and consistently. It is up to the virtualizer to choose which pre-compiles to support by proxy. Additionally, a virtualizer can emulate additional pre-compiles, just like calling any other contract.

#### Other types of calls

With the exception of `CALLCODE`, which runs code within the same sandbox, the calls are all intercepted, since they access storage for contract code.

### Useful links to code in Geth

- Add virtual context to [EVMInterpreter](https://github.com/ethereum/go-ethereum/blob/0fda25e471aa0e061396050c3d5e59fbaaf1e7b0/core/vm/interpreter.go#L89)
- Check for virtual context in next opcode [here](https://github.com/ethereum/go-ethereum/blob/0fda25e471aa0e061396050c3d5e59fbaaf1e7b0/core/vm/interpreter.go#L230).

## Rationale

### Virtualization

The problem many Layer 2 solutions face, especially fraud-proof mechanisms, is to execute bytecode with non-pure opcodes consistently. This fundamentally does not work without control of the non-pure execution, or without complex and limiting work-arounds.
To enable applications to use arbitrary opcodes in a pure context, the EVM needs the ability to virtualize a call. By acting as a host, a contract can implement substitute functionality for the non-pure behavior.

### Optional statelessness

With `VIRTUALCALL`, stateless execution on Ethereum is just a matter of a single transaction that calls `VIRTUALCALL`:

- Initialize the virtual-memory with witness data
- Set the virtualizer-address to a community-enshrined virtualizer contract without storage use. The contract can be deployed and assumed available, or might be a precompile
- Compute an updated state-root with the virtual-memory output
- Log the state-root

### Sandboxed account abstraction

Changing the `ADDRESS` or `ORIGIN` opcode behavior can be quite dangerous if the contract authors of this authentication make a mistake. Instead, by sandboxing the opcodes, account-abstraction comes for free, in an environment with well-defined boundaries. Dapps can choose to allow arbitrary access from specific virtualizer contracts that they trust to handle account abstraction in the ways they were designed and audited for.

### Virtualized memory

The rationale here is that the virtualizer can avoid storage usage during the `VIRTUALCALL`. The virtualizer has memory that is utilized a lot like in functional programming: the state is provided as input, and written as output. This enables:

- `VIRTUALCALL` to be used within a `STATICCALL`, making virtualized non-pure opcodes pure to the outside world, if the virtualizer is pure!
- No temporary storage usage. This is important if the gas refund for storage-zeroing is removed, since virtualizer memory then serves as an alternative.

### Virtualized memory as calldata

Instead of introducing a whole new memory API, opcodes, Solidity syntax, etc., this EIP proposes to use a short-cut: the virtualizer memory is read from calldata, and written in return data.
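
As a rough illustration of this short-cut, the sketch below models how the calldata for a virtualizer call could be assembled and taken apart again, following the `calldata := <virtualizerMemory><opcode>` layout from the Specification. The helper names `build_virtualizer_calldata` and `split_virtualizer_calldata`, the selector bytes and the single 32-byte argument are hypothetical and exist only for this example; `0x55` and `0x54` are the existing EVM opcode values for `SSTORE` and `SLOAD`.

```python
from typing import Tuple

SSTORE = 0x55  # existing EVM opcode value
SLOAD = 0x54   # existing EVM opcode value


def build_virtualizer_calldata(virtualizer_memory: bytes, opcode: int) -> bytes:
    """Append the intercepted opcode byte to the virtualizer memory."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in a single byte")
    return virtualizer_memory + bytes([opcode])


def split_virtualizer_calldata(calldata: bytes) -> Tuple[bytes, bytes, int]:
    """Split calldata into (4-byte selector, remaining memory, opcode)."""
    if len(calldata) < 5:
        raise ValueError("need at least a 4-byte selector and an opcode byte")
    return calldata[:4], calldata[4:-1], calldata[-1]


# Hypothetical virtualizer memory: a made-up selector plus one 32-byte word.
memory = bytes.fromhex("aabbccdd") + (42).to_bytes(32, "big")
calldata = build_virtualizer_calldata(memory, SSTORE)
selector, args, opcode = split_virtualizer_calldata(calldata)
```

Keeping the opcode as a trailing byte leaves the prefix free for the selector and arguments chosen by the `VIRTUALCALL` caller.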
This short-cut has two consequences:

- The first 4 bytes of virtual-memory specify the method of the virtualizer contract to call, and the virtualizer can change this during virtual execution. E.g. the virtualizer can act like a state-machine, and choose to continue later virtualized-call handling in a different contract method after the execution has paid gas, or select a specific fork, etc.
- After the first 4 bytes, the memory is used for calldata arguments:
  - The method parameters ABI of the function can be used to structure virtualizer memory
  - The return parameters ABI can be used to change the state structure (or not, if the method does not change)

With this approach, the virtualizer contract could give a typed structure to its virtualizer memory:

```solidity
contract ExampleVirtualizer {
    // MyDataType is a placeholder for an application-defined struct.
    // The updated virtualizer memory is returned in the same layout as the inputs.
    function Hook(address trackedSender, MyDataType memory someData, uint8 opcode)
        public
        returns (address newTrackedSender, MyDataType memory newSomeData)
    {
        ...
    }
}
```

### Exposing memory and stack in virtualizer

To **efficiently** read and write both the stack and memory of the sandbox, the virtualizer contract needs access to them. A previous iteration would put the complete memory and stack in the calldata, but this was not viable. So in v2, the stack and memory from the sandbox context are simply retained at the start of the virtualizer contract call, and written back at the end of the virtualizer call.

### Difference from [EIP 726](https://github.com/ethereum/EIPs/issues/726)

`VIRTUALCALL` is intended to just be a minimal opcode, and to completely leave the use-case to the application.

- It does not enforce a state format
- It does not require state roots
- It enables the application to choose virtualized opcode behavior. This is good for upgradeability, storage decisions, and access to the non-pure ecosystem where appropriate.
- It does not prescribe encoded transactions or take any opinion on virtualized actions.

## Backwards Compatibility

Strictly a superset of current opcodes and therefore should be fully backwards compatible.
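
One way to picture why existing code is unaffected: interception is only expected to apply while a virtual context is enabled. The sketch below models that decision under stated assumptions: `Context` is a hypothetical record mirroring the one in the Specification, `should_intercept` is an illustrative name that is not part of the proposal, and the mnemonic set is a small, non-exhaustive sample drawn from the list of non-intercepted opcodes.

```python
from dataclasses import dataclass

# Non-exhaustive sample of the pure opcodes that are never intercepted.
NEVER_INTERCEPTED = frozenset({
    "PC", "PUSH1", "POP", "DUP1", "SWAP1", "AND", "ADD",
    "MLOAD", "MSTORE", "KECCAK256", "JUMP", "JUMPDEST",
    "CALLCODE", "RETURNDATACOPY", "RETURNDATASIZE", "RETURN",
})

# Per the note on `STOP` and `REVERT`: these end the VIRTUALCALL
# instead of being passed to the virtualizer.
ENDS_VIRTUALCALL = frozenset({"STOP", "REVERT"})


@dataclass
class Context:
    """Hypothetical model of the virtual context."""
    enabled: bool = False
    virtualizer: bytes = b""
    virtualizer_memory: bytes = b""


def should_intercept(ctx: Context, mnemonic: str) -> bool:
    """Return True if the opcode would be routed to the virtualizer."""
    if not ctx.enabled:
        # Outside a VIRTUALCALL, opcodes behave exactly as before.
        return False
    if mnemonic in ENDS_VIRTUALCALL:
        return False
    # Default-intercept: unknown or future opcodes are intercepted too.
    return mnemonic not in NEVER_INTERCEPTED


sandbox = Context(enabled=True, virtualizer=b"\x11" * 20)
```

With the default `Context()`, `should_intercept` returns `False` for every opcode, which is the sense in which the change is additive in this model.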
## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

----

TODO:
