# Thoughts on EVM encapsulation format
In-reply-to https://notes.ethereum.org/@axic/evm-object-format
### Adding to the motivation:
- provide a gentle way to make forwards-compatible changes to the EVM
- allow better scalability through on-chain verification, as done by Optimism
We would like to make it obvious which sections of the bytecode are data, so that the code does not have to be analyzed for conformity to a certain pattern, like the one required by the OVM containerized execution.
Without this, it would be impossible to add arbitrary data like constructor arguments or lookup tables (large constants).
Next to this main goal is the optional goal of removing jumpdest analysis.
If jumpdest analysis is removed, it allows the introduction of multi-byte opcodes without breaking existing contracts.
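For reference, here is a minimal sketch of the jumpdest analysis in question: scan the code once, skip the immediate bytes of `PUSH1`..`PUSH32`, and record every remaining `0x5b` byte as a valid jump target. The function name `valid_jumpdests` is only illustrative and not taken from any client.

```python
JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F


def valid_jumpdests(code: bytes) -> set:
    """Return offsets of JUMPDEST bytes that are not inside PUSH immediate data."""
    dests = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            dests.add(pc)
        elif PUSH1 <= op <= PUSH32:
            # PUSHn carries n immediate bytes; they are data, not opcodes.
            pc += op - PUSH1 + 1
        pc += 1
    return dests


# PUSH1 0x5b; JUMPDEST -> only the byte at offset 2 is a real jump target.
print(valid_jumpdests(bytes([0x60, 0x5B, 0x5B])))
```

The scan only knows about the immediates of the existing PUSH opcodes. A newly introduced multi-byte opcode would change which bytes count as data and therefore which `0x5b` bytes are valid targets in contracts that are already deployed.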
### Initial byte sequence
The proposal currently cites EIP-154, which uses `0xfe 0x65 0x76 0x6d` as an initial sequence to "activate" the encapsulation format on a per-account basis.
I'm not sure the magic is really needed. Magic byte sequences are mainly used to identify file types in a filesystem, a problem that does not usually arise in Ethereum. Furthermore, it "wastes" 3-4 bytes.
Essentially it uses the invalid opcode plus a specific sequence of bytes following it to activate the format.
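As a small illustration of what the magic-based activation amounts to, here is a sketch of the prefix check. The byte values come from the sequence quoted above; the names `MAGIC` and `has_magic` are made up for this example.

```python
# INVALID opcode (0xfe) followed by the ASCII bytes for "evm".
MAGIC = bytes([0xFE, 0x65, 0x76, 0x6D])


def has_magic(code: bytes) -> bool:
    """Return True if the account code starts with the 4-byte magic sequence."""
    return code.startswith(MAGIC)


print(has_magic(bytes.fromhex("fe65766d00")))  # code using the format
print(has_magic(bytes.fromhex("6001")))        # legacy code
```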
Alternatively, we could use a new opcode that consumes items on the stack and is only valid when executed as the first non-push opcode. Following that, we could have non-executable section lengths and other structural data.
The advantage of this approach is that it makes very clear that we are adding a new feature to the EVM by means of a new opcode, instead of adding a second purpose to the invalid opcode.
Initially, we could just use `0x60 0x01 0xe5`, where `0xe5` is the new opcode that consumes a single item on the stack that (for now) encodes a version number. The opcode `0xe5` is only valid as the second opcode and has to be preceded by a PUSH1 opcode, otherwise execution fails. It also fails if its argument is anything except 1, with the expectation that it might succeed in the future for different arguments.
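A rough sketch of how a client could recognize this header follows. The names `FORMAT_OP`, `SUPPORTED_VERSIONS` and `format_version` are hypothetical. Treating code that does not start with the `PUSH1 <version> 0xe5` pattern as legacy code (returning `None`) is my assumption about how the check would be used, not something the rule above spells out.

```python
from typing import Optional

PUSH1 = 0x60
FORMAT_OP = 0xE5            # the proposed new opcode (hypothetical assignment)
SUPPORTED_VERSIONS = {1}    # only version 1 is accepted for now


def format_version(code: bytes) -> Optional[int]:
    """Return the format version if the code starts with PUSH1 <v> 0xe5.

    Returns None when the header is absent (assumed to mean legacy code) and
    raises ValueError when the header is present but the version is unknown.
    """
    if len(code) < 3 or code[0] != PUSH1 or code[2] != FORMAT_OP:
        return None
    version = code[1]
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported format version {version}")
    return version


print(format_version(bytes([0x60, 0x01, 0xE5, 0x00])))  # header with version 1
print(format_version(bytes([0x60, 0x01, 0x00])))        # no header
```

Supporting a future version would then only mean extending `SUPPORTED_VERSIONS`, which matches the expectation that `0xe5` may succeed for other arguments later.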
Following these three bytes, we have the section structure or something like that as proposed in https://notes.ethereum.org/@axic/evm-object-format
### Sections
Until it is decided which encoding of a jump table is useful for clients, I think we should make it more optional than is currently expressed.