# Memory Copying in Contracts Deployed on Ethereum
Adoption of bulk memory copying as a native feature of the EVM has previously been [explored](https://github.com/ewasm/EIPs/blob/mcopy/EIPS/eip-draft-mcopy.md) in a proposal to add a new opcode, `MCOPY`.
To estimate how many contracts currently perform memory copying, and what savings they might see if rewritten to use `MCOPY`, we analyzed EVM execution traces for 1200 historical blocks starting from block 10537502.
Copies of single words were identified by examining values loaded from memory via `MLOAD`, tracking their existence as they are shuffled around the stack during execution of other opcodes and noting if these values are stored back to memory via `MSTORE`. When these tracked values are consumed (used as parameters to other opcodes) or duplicated via `DUP`, they are no longer considered.
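The tracking rule described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the tracer used for the analysis: the trace format (`(opcode, immediate)` pairs), the `Item` class and the function name are all hypothetical, only a handful of opcodes are modelled, and memory is assumed to be accessed at word-aligned offsets.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    value: int                 # concrete stack value
    src: Optional[int] = None  # memory offset the word was loaded from, while tracked


def find_single_word_copies(trace):
    """Return (src_offset, dst_offset) pairs for tracked MLOAD -> MSTORE words.

    `trace` is a hypothetical simplified format: (opcode, immediate) pairs.
    """
    stack, memory, copies = [], {}, []
    for op, arg in trace:
        if op == "PUSH":
            stack.append(Item(arg))
        elif op == "MLOAD":
            offset = stack.pop().value
            # Start tracking the loaded word, remembering where it came from.
            stack.append(Item(memory.get(offset, 0), src=offset))
        elif op == "MSTORE":
            offset = stack.pop().value  # top of stack is the destination offset
            word = stack.pop()          # next item is the value to store
            if word.src is not None:
                copies.append((word.src, offset))
            memory[offset] = word.value
        elif op == "SWAP":
            # Shuffling keeps the tracking tag attached to the value.
            stack[-1], stack[-1 - arg] = stack[-1 - arg], stack[-1]
        elif op == "DUP":
            original = stack[-arg]
            original.src = None         # duplicated values are no longer considered
            stack.append(Item(original.value))
        elif op == "POP":
            stack.pop()
        elif op == "ADD":
            # Consuming a value as an operand drops its tag: the result is untracked.
            a, b = stack.pop(), stack.pop()
            stack.append(Item((a.value + b.value) % 2**256))
        else:
            raise ValueError(f"opcode not modelled in this sketch: {op}")
    return copies


example_trace = [
    ("PUSH", 0xAA), ("PUSH", 0x00), ("MSTORE", None),   # plain store, not a copy
    ("PUSH", 0x00), ("MLOAD", None),
    ("PUSH", 0x40), ("MSTORE", None),                   # copy from 0x00 to 0x40
    ("PUSH", 0x00), ("MLOAD", None), ("DUP", 1),
    ("PUSH", 0x60), ("MSTORE", None), ("POP", None),    # DUP untracks the word
]
print(find_single_word_copies(example_trace))
```

In the example trace only the second store counts as a copy: the first stores a pushed constant, and the third stores a value that went through `DUP`.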
The analysis also identified occurrences of copies larger than a single word (where one invocation of `MCOPY` could replace multiple single-word copies). This is done by looking for multiple single-word copies which operate on source and destination offsets along consecutive 32-byte boundaries.
In addition, Solidity was [augmented](https://github.com/ewasm/solidity/tree/mcopy) to use `MCOPY` for multi-word memory copying helper functions in place of the currently used `MLOAD`/`MSTORE` copy loops. Recompiling the Solidity contract test suite with the modified compiler did not identify any cases of memory copying.
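As a rough illustration of the grouping step, the sketch below merges single-word copies into multi-word runs. It assumes the copies arrive in execution order as `(src, dst)` byte offsets and that a run only grows when both offsets advance by exactly 32 bytes; the function name and the ascending-order assumption are ours, and the actual analysis scripts may handle more cases.

```python
WORD = 32  # bytes per EVM word


def merge_copies(copies):
    """Group consecutive single-word copies into (src, dst, n_words) runs."""
    runs = []
    for src, dst in copies:
        if runs:
            run_src, run_dst, n = runs[-1]
            # Extend the current run if this copy continues it on both sides.
            if src == run_src + n * WORD and dst == run_dst + n * WORD:
                runs[-1] = (run_src, run_dst, n + 1)
                continue
        runs.append((src, dst, 1))
    return runs


# Three adjacent words copied from 0x00.. to 0x80.., then an unrelated copy.
print(merge_copies([(0, 128), (32, 160), (64, 192), (256, 0)]))
```

Each resulting run with `n_words > 1` is a candidate for replacement by a single `MCOPY`.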
### Cost Model
A linear cost model for `MCOPY` was calculated by fitting a line through go-ethereum client [benchmarks](https://github.com/jwasinger/go-ethereum/blob/op_memcopy/core/vm/instructions_test.go#L684-L691) measuring copy sizes from 32 bytes to 10 kB on an i5-6600K processor (32 KB L1 cache).
The resulting model is `gas_mcopy(n) = 2 + 2.56(n-1)`, where `n` is the number of EVM words being copied. It assumes a target gas rate of 20 ns/gas, which corresponds to 0.3 seconds of execution for a gas limit of 15,000,000.
In general, the model somewhat overcharges because the benchmarks include the overhead of stack manipulation from Geth's EVM (2 x `PUSH`, 1 x stack popping in `MCOPY` per benchmark iteration). Spikes in the 5000-6000 byte range can be discounted as noise.
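For concreteness, the model and the gas-rate arithmetic can be written out as a small Python sketch. The function and constant names are ours; the fractional result is left unrounded because the text does not say how a final gas schedule would round it.

```python
GAS_LIMIT = 15_000_000   # block gas limit used in the text
TARGET_NS_PER_GAS = 20   # target gas rate used in the text


def gas_mcopy(n_words: int) -> float:
    """Linear MCOPY cost model from the text: 2 + 2.56 * (n - 1)."""
    if n_words < 1:
        raise ValueError("the model is defined for copies of at least one word")
    return 2 + 2.56 * (n_words - 1)


def block_execution_seconds(gas_limit: int = GAS_LIMIT,
                            ns_per_gas: int = TARGET_NS_PER_GAS) -> float:
    """Execution time of a full block at the target gas rate."""
    return gas_limit * ns_per_gas / 1e9


for n in (1, 2, 8, 42):
    print(f"{n:>2} words -> {gas_mcopy(n):.2f} gas")
print(f"full block -> {block_execution_seconds():.1f} s")
```

Under this model a single-word copy costs 2 gas and each additional word adds 2.56 gas.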
### Tracing Results
Multi-word copies are rare:

| Copy size (EVM words) | Number of occurrences during the period traced |
| --- | --- |
| 1 | 115673 |
| 2 | 4806 |
| 3 | 6132 |
| 4 | 185 |
| 5 | 14 |
| 6 | 283 |
| 7 | 142 |
| 8 | 1398 |
| 9 | 350 |
| 10 | 12 |
| 11 | 30 |
| 12 | 218 |
| 13 | 83 |
| 14 | 25 |
| 15 | 3 |
| 16 | 1 |
| 20 | 3 |
| 34 | 1 |
| 37 | 1 |
| 40 | 1 |
| 42 | 3 |
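
To put the table in perspective, the short tally below sums the counts and computes the share of single-word copies. The numbers are taken directly from the table; the variable names are ours.

```python
# Copy size in EVM words -> number of occurrences, as listed in the table.
occurrences = {
    1: 115673, 2: 4806, 3: 6132, 4: 185, 5: 14, 6: 283, 7: 142,
    8: 1398, 9: 350, 10: 12, 11: 30, 12: 218, 13: 83, 14: 25,
    15: 3, 16: 1, 20: 3, 34: 1, 37: 1, 40: 1, 42: 3,
}

total_copies = sum(occurrences.values())
multi_word_copies = total_copies - occurrences[1]
single_word_share = occurrences[1] / total_copies

print(f"total copies:      {total_copies}")
print(f"multi-word copies: {multi_word_copies}")
print(f"single-word share: {single_word_share:.1%}")
```

Of the 129364 copies detected, 13691 span more than one word, so roughly 89% of all detected copies move a single word.
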
Adoption of `MCOPY` may be warranted if proposals similar to EVM384 (where copying has been identified as a [significant source of overhead](https://notes.ethereum.org/@poemm/evm384-update5#Memory-Manipulation-Cost) for EC pairing operations) are adopted. Until such use-cases become a reality or other sources of cost-reductions afforded by cheap copying are discovered, adoption of `MCOPY` is not necessary for the EVM. We are interested in collecting other potential use-cases where `MCOPY` can be applied.
**Note:** Scripts and utilities to reproduce the tracing and analysis are at https://github.com/jwasinger/evm-mem-copy-tracer