# EIP Draft: EVM384 and Friends THIS DRAFT IS STILL BEING UPDATED. --- eip: TBD title: EVM384 and Friends (Modular Arithmetic Extensions) authors: TBD discussions-to: https://ethereum-magicians.org/t/evm384-feedback-and-discussion/4533 type: Standards Track category: Core status: Draft created: TBD --- ## Simple Summary Towards scaling and privacy, we propose four new opcodes for modular arithmetic which allow implementing a large class of cryptography in pure EVM. ## Abstract We propose three modular arithmetic opcodes `ADDMONT`, `SUBMONT`, and `MULMONT`. The suffix `MONT` refers to using Montgomery's trick to perform the modular multiplication without division. We also add a fourth opcode `SETMOD` to (i) simplify the spec, (ii) allow arbitrary algorithms (including subquadratic runtime), (iii) precompute things to speed up runtime, (iv) allow arbitrary bitlength (including 256-bit, 384-bit and 768-bit), (v) have fewer runtime checks, (vi) allow even moduli, and (vii) have smaller bytecode size. These opcodes are engineered for low gas cost by avoiding runtime checks and stack overhead. ## Motivation The proposed opcodes would allow users to deploy the functionality of the following EIPs: [BW6-761](https://eips.ethereum.org/EIPS/eip-3026), [BN256 HashToCurve](https://eips.ethereum.org/EIPS/eip-3068), [BLS12-377](https://eips.ethereum.org/EIPS/eip-2539), [BLS12-381](https://eips.ethereum.org/EIPS/eip-2537), [MNT](https://eips.ethereum.org/EIPS/eip-1895), [Ed25519](https://eips.ethereum.org/EIPS/eip-665), [secp256k1](https://github.com/ethereum/EIPs/issues/603), and [linear combinations of curve points](https://eips.ethereum.org/EIPS/eip-1829). And any other cryptography which is bottlenecked by modular arithmetic. Another motivation is deprecating old precompiles by replacing them with EVM bytecode. ## Specification Recall Yellowpaper notation: - 𝛂 is the number of stack items to push, - 𝛅 is the number of stack items to pop, - 𝛍 is the machine state, - 𝛍.pc is the program counter, - 𝛍.m is the memory, - 𝛍.i is the memory length, - 𝛍.m[x..y] is memory from byte offset x to y (inclusive), and - `M(current_memoryLength, start_byte_offset, bytelength_accessed)` is the β€œmemory expansion function” from the yellowpaper H.1. --- | value | Mnemonic | 𝛂 | 𝛅 | Description| Gas cost | | -------- | -------- | --- | --- | --- | --- | | 0xc0 | `ADDMONT` | 0 | 0 | (below) | 2 gas plus 1 gas after each 8 limbs | | 0xc1 | `SUBMONT` | 0 | 0 | (below) | 2 gas plus 1 gas after each 8 limbs | | 0xc2 | `MULMONT` | 0 | 0 | (below) | 2 gas plus floor((n^2)/8) gas for n limbs | | 0xc3 | `SETMOD` | 0 | 2 | (below) | todo | **Table 1.** Data for Yellowpaper H.2, plus gas cost formulas for H.1. --- We also give some costs for quick reference. --- | num_64bit_limbs: | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 32 | 64 | 128 | 256 | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | | `ADDMONT` | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 5 | 9 | 17 | 33 | | `SUBMONT` | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 5 | 9 | 17 | 33 | | `MULMONT` | 2 | 2 | 3 | 4 | 5 | 6 | 8 | 10 | 12 | 14 | 17 | 20 | 23 | 26 | 30 | 34 | 130 | 514 | 2050 | 8194 | | `SETMOD` | todo | | | | | | | | | | | | | | | | | | | | **Table 2.** Gas costs for some specific numbers of 64-bit limbs. --- Finally, the opcode descriptions. We define a new machine state entry, 𝛍 .mc, the modulus context, with entries 𝛍.mc.modulus and 𝛍.mc.length, where the length counts the number of 64-bit chunks ("limbs") of the modulus. For each transaction, the modulus context is initialized to length 0, but it is not zero'd for each `*CALL*` or `CREATE*` for that transaction. We define some values which can be adjusted. `MAX_MODULUS_BYTELENGTH = 2^14` (This may be increased in a later hardfork.) `ALLOW_EVEN_MODULUS_FLAG = FALSE` (This may be switched to TRUE in a later hard-fork.) `ENDIANNESS = little-endian` ### `SETMOD` description: ``` offset = 𝛍.s[0] i.e. top stack item length = 𝛍.s[1] i.e. second stack item offset and length are interpreted as big-endian integers. If length>MAX_MODULUS_BYTELENGTH, only pop stack items and go to the next opcode. length = ceil((max_{0<=i<length} 𝛍.m[offset:offset+i]!=0x00)/8), i.e. the most-significant byte index's 64-bit limb index modulus = 𝛍.m[offset..offset+8*length] The modulus is interpreted as an `ENDIANNESS` integer. if the modulus is odd: 𝛍.mc.length = length 𝛍.mc.modulus = modulus if the modulus is even: if ALLOW_EVEN_MODULUS_FLAG = FALSE: only pop stack items and go to the next opcode if ALLOW_EVEN_MODULUS_FLAG = TRUE: (todo) Implementer's note: It may be wise to precompute some things for fast `MULMONT`. Using notation in Montgomery's original 1985 paper, define N=modulus and R=2^(𝛍.mc.length*64). BΓ©zout's identity is R*Rinv-N*N'=1, where Rinv and N' exist because R and N are coprime, since R is a power of two and N is odd. For fast `MULMONT`, it is wise to precompute N' in `SETMOD`. N' is needed for fast `MULMONT` algorithms. N' can be computed using the extended Euclidean algorithm, or alternative algorithms (TODO: details). Note: clients will likely use a multi-precision montgomery multiplication algorithm, so it may be sufficient to compute just the least-significant 64-bit limb of N'. This N' may be stored in optional field 𝛍.mc.Nprime. ``` ### `ADDMONT` description: ``` immediate = c(𝛍.pc+1,𝛍.pc+2,...,𝛍.pc+7) where c() (defined with PUSH*) handles overflow as zeros. Assert immediate[0] == 0x65 (i.e. the leftmost byte is PUSH6), otherwise halt exceptionally for invalid opcode out_offset = immediate[1..2] x_offset = immediate[3..4] y_offset = immediate[5..6] Each offset is interpreted as an `ENDIANNESS` integer. x = memory[x_offset..x_offset+𝛍.mc.length*8] y = memory[y_offset..y_offset+𝛍.mc.length*8] x and y are interpreted as `ENDIANNESS` integers. memory[out_offset..out_offset+mu.mc.length*8] = [ x + y if x+y < 𝛍.mc.modulus [ x + y - 𝛍.mc.modulus otherwise The result is stored in memory in `ENDIANNESS`, up to 𝛍.mc.length*8 bytes. Memory length is checked and possibly grown. memoryLength = M( memoryLength, max(x_offset,y_offset,out_offset)*mu.mc.length*8 ) ``` ### `SUBMONT` description: ``` immediate = c(𝛍.pc+1,𝛍.pc+2,...,𝛍.pc+7) where c() (defined with PUSH*) handles overflow as zeros. Assert immediate[0] == 0x65 (i.e. the leftmost byte is PUSH6), otherwise halt exceptionally for invalid opcode out_offset = immediate[1..2] x_offset = immediate[3..4] y_offset = immediate[5..6] Each offset is interpreted as an `ENDIANNESS` integer. x = memory[x_offset..x_offset+𝛍.mc.length*8] y = memory[y_offset..y_offset+𝛍.mc.length*8] x and y are interpreted as `ENDIANNESS` integers. memory[out_offset..out_offset+n*8] = [ x - y if x-y >= 0 [ x - y + 𝛍.mc.modulus otherwise The result is stored in memory in `ENDIANNESS`, up to 𝛍.mc.length*8 bytes. Memory length is checked and possibly grown. memoryLength = M( memoryLength, max(x_offset,y_offset,out_offset)*mu.mc.length*8 ) ``` ### `MULMONT` description: ``` immediate = c(𝛍.pc+1,𝛍.pc+2,...,𝛍.pc+7) where c() (defined with PUSH*) handles overflow as zeros. Assert immediate[0] == 0x65 (i.e. the leftmost byte is PUSH6), otherwise halt exceptionally for invalid opcode out_offset = immediate[1..2] x_offset = immediate[3..4] y_offset = immediate[5..6] Each offset is interpreted as an `ENDIANNESS` integer. x = memory[x_offset..x_offset+𝛍.mc.length*8] y = memory[y_offset..y_offset+𝛍.mc.length*8] x and y are interpreted as `ENDIANNESS` integers. If mu.mc.modulus is odd: 𝛍.m[out_offset..out_offset+𝛍.mc.length*8] = (x*y*Rinv) % 𝛍.mc.modulus Where the intermediate multiplication operation may have more bits. Where Rinv is the value which satisfies RRinv-NN'=1, bute we may never have to compute it if we use Montgomery's algorithm which uses N'. If 𝛍.mc.modulus is even: TODO The intermediate multiplication operation result may have double the bits. The result is stored in memory in `ENDIANNESS`, up to 𝛍.mc.length*8 bytes. Memory length is checked and possibly grown. memoryLength = M( memoryLength, max(x_offset,y_offset,out_offset)*mu.mc.length*8 ) ``` #### Jumpdest analysis Finally, we modify Yellowpaper section 9.4.3 function `N(i,w)` to jump forward from bytecode offset i to i+7 if opcode w is `ADDMONT`, `SUBMONT`, or `MULMONT`. Note that this means that the `PUSH9` opcode in the immediate data right after `*MONT` opcodes is not actually executed, it is just there to not break backwards compatibility with immediates clobbering a `JUMPDEST`. ## Rationale We choose opcodes which are primitive enough to be widely useful, but expensive enough to amortize the interpreter loop cost. We surveyed these three opcodes to be "complete" (i.e. cover bottleneckes) for a large class of cryptography, including elliptic curve operations and STARK polynomial evaluations. Montgomery arithmetic is generic and widely used in cryptographic libraries. In special cases, alternative algorithms are slightly faster. Barrett reduction is faster when there is only one multiplication, but we are more concerned about cryptosystems which do long calculations with many multiplications. Some cryptosystems choose primes which allow faster reduction, but these are not generic. Finally, there are some prime-specific speedups to Montgomery's algorithm, which we can support with precomputation in `SETMOD`, but we don't require to keep gas costs simple. Many crypto bugs come from these field optimizations -- we use the standard well-studied montgomery algorithm. Gas costs are quadratic for now. In the future, they may be reduced for larger bigints, since there are subquadratic algorithms for our operations. This complexity is left for future optimizations, when the need arises. Our three opcodes add small complexity, we counted under 200 lines of C code for the four operations. There can be non-standard inputs. For even modulus, inputs greater than the modulus, or neg_mod_inv not corresponding to the modulus, we must have garbage input -> canonical garbage output. Algorithms must be audited to make sure they match the spec. We foresee that such primitive opcodes will allow innovation. Details follow. - There may be tricks which are only possible with primitive opcodes that are not possible with large one-size-fits-all precompiles. - We shouldn't just package off-chain algorithms for on-chain use, but evaluate whether on-chain computation relaxes some constraints of off-chain computation, and rethink how we design on-chain algorithms. See discussion [here](https://ethresear.ch/t/surveying-precomputation-methods-in-cryptography-request-for-help/8606). - Non-constant-time algorithms may be much faster on average than in the worst-case, and we can meter such average runtime with opcodes. - Breakthroughs in cryptography can quickly be implemented with these opcodes. - More abstractly, maybe our current problems may soon disappear once people reframe the problems to use primitive opcodes on EVM. We choose base 2^64 to match modern machine word size. We could also choose base 2, base 2^32, base 2^256, etc. But 2^64 seems the least awkward and best for performance. There are many `MULMONT` algorithms, some with specific optimizations. Special care must be taken to choose algorithms which match the spec. ## Backwards Compatibility Some bytes which used to be invalid opcodes now become valid, which is standard for adding new opcodes. We also have forwards-compatibility by defining immediates with a different `PUSHn` where `n!=6`. ## Test Cases Test case format is `<opcodename> <numlimbs> <out> <x> <y> <modulus> <inv for mulmodmont>`. A test-case generator is [generate_tests.py](https://github.com/poemm/bigint_experiments/tree/master/test_and_bench). ``` addmodmont 4 0x00000000000000000000000000000000000000000000000000000000000002 0x00000000000000000000000000000000000000000000000000000000000002 0x00000000000000000000000000000000000000000000000000000000000003 0x00000000000000000000000000000000000000000000000000000000000001 mulmodmont 4 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff 0x00000000000000000000000000000000000000000000000000000000000001 0x00000000000000000000000000000000000000000000000000000000000000 mulmodmont 4 0x00000000000000000000000000000000000000000000000000000000000000 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff 0x00000000000000000000000000000000000000000000000000000000000001 0x00000000000000000000000000000000000000000000000000000000000000 mulmodmont 6 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff 0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001 0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 mulmodmont 6 0x52c42e58e8a7ed668e0ab8e5a65b41f54b75ea255559704e9ea6bf11ef9498917b4bc04151a10fba72e4850987d4743 0x12f0fbbc2af26c450106fe4b9d3ab9ad1f19b425bd4d41f12c2f5939efed8c91c9a42bcb76e9c00c7b755c45b083afb 0xb598e3af07f7199d9795ec57c07e2077eeb9e5a170a87a79afa62d8e415982340c0a7552eaa9fdf84fc033eee34cd9b 0xa18e24fc1f39aecd88ce25f5e5ed35aafcefc1e2664084e520b5e8789480f7acdf3d6420d80f4c744da1f039b41f2f6d 0x5b783814c809f72a32aea787a62a588930c439647ebbbd9787d49af12513693f45997e5b52cc10ea28be3af20f2ea4f mulmodmont 6 0x45d2bdefe8b56bb29ab16df0ba97a2d887baf24bd4ffe435f4e26f85a553fa02513c38d4c7fcf702b09547ca40a30727 0x564575ea86c074b19d0746366bd5c9308ddeaeb0ba19d8398a30ae23ecebcee2880a5c57795e8092bb72f5171ad048b0 0x75023098e81cc4c5a8bd6802a4dfa3eb9708a29ca08a474e4e2c4903656e310cab27ed9377a9da3852a4e7683a09cd37 0x209176f94479dd6c8fb4fb607811f3083d1431021d871d32cbb766dec4cc6ca67d9d4474c64bc3bc7845a0dcae0a8779 0x38ac865218c595634d27461d27e7e7b777c16cb1814f6329d01cde0a6d289f381bdf2482a15a72e96741e5673ef122df mulmodmont 6 0x95bdc67419b453feeee418750e997d1b9f060d7e2d5ebfee465717871f08d78e258fee7ff896372aea34a829c177373 0x2ae770aa27a9d033b75db9d3b5882b6c2d5ed49ef193a0e67e3d470a2217b5409b319ae950461a817acdfdd74bd8697b 0x3a5fc7fd834df6fcaf26906d9e804b07bb5d0276db3d2142d8adc38b5b433b5f6b4bf11a2ac6792bee1c74ac25e01d45 0x46065f8c4ecc9532baf6468e4e2beb93dc609ab29bd0c5ec054d3ea7f0983b3bb04c71e4ac58016a05ffe28a0b741273 0x3a1f51b20e003ff7dc4b5a5ec809792a233ff5882b31b68bee0697df41f9c1f94deac4394f60aa7aa1d05944bad62c5c mulmodmont 6 0x5844f932e09ed9228b9c0533f2dac838e6903f4e8cdafae4b926779c1388e4d0353c24e01b217d7a6ef7bf7a945f1049 0x34bb3f521222e8de266980566d7c16d0ffde9ba9ce3b4e56a335f12e7f188847e29502433336eafac4e7a3e0728eaa14 0x9138627b591b9247204e604a454ffb75cf9793ef059bcf965048ccaefc1d5df3617789950fe95df74b2cf517a9363629 0x71f48cd9c10323da14143baeff170315382b40be02892e19498ba1557ac2ac01a9242d7ff4c8cd2991755ec5b29639e7 0x643d9419529f5963dc9c7ec782827a4912f7cb45bae91f3cf8c02f5314d2783cb8b4df60ef0ba3f810b3a8eb57e96f0f mulmodmont 6 0x8326e27e1870ae0d60620aa74ae0d43e90352872955821e3b8d898286d263f08aeff10a5c035596dced4c561df16919e 0x1e426c6defa9057ff9789abfc34c39b8306ff029116a22412298e17b135d94c68f896d25830768dcf7c37a71d15b0cb1 0xa46d9981e1ae140eb6a1aae87470178b8dcf6b9bfe70647722cfc816cce850526e12683d8a4f3bc46efc3882081d199f 0x8f1e011e250ae2e7f2fa64c951565d327c1557160b08f8e772fefc7d1b52f73d5a773f2a3d7275ec44ac99f9b2683da1 0x5d78d434e91bad5d5a08177066e103ceb04478a2f7626f085fe894be3eb6c659cebe7cf05de789e529f3527768f76035 mulmodmont 6 0x832585bb4f5284725b08dc158fa84371bb85b01a76bb2e1e13fcfa66116bfa538342fa873aab9579e7fccd501a642d7 0x9e3181618d212e88263932448dd0079d552db4263df03e734cf3c9ab420031d0a3eab88a1a2d58116941dfe716eca6e 0xfb2372a3c4b872a7486f82412563dc921e4b5edbb39d8282b39ce13bf6bf1a9eeffc237719440c076eecd62cc6141e5 0x66dec6a526c72c728b871da66bb0e7d53b18f0f57da8595565468011a2555951e1898c83da9df641798e89946f55ec13 0x350a997fe6bed496c05ecf77e11d6bd1ef789f1a4a4d396bd36ba9b194f5ffe8a12e4b287b675e46051b80397f490a8 mulmodmont 6 0x40e8a133b5f684c49c9c601f1910b8bff4b63ee02e1d5ce518f85640104e02535cb0913c73f727d0e1dd6929627ffb0b 0x45d4945f3311213c4948e89baf63f43fba409c7609b564f0dd3f6e87c5b71d6f87bafaf912e298763e0f4d53277d44c3 0x4a4a1febc2f79554c36c8755b7a21e97d63b1529421d8805b8b22309144a0da2be99b189ecab1603e7eb2302a08062ff 0xae510375b1f9c15e82e8efd7d106a4e64db6f5e8e4e1507c7c6921f89891d40f4035e21ea815ba9f6ef15b4501c96301 0x93e63b39f25cd4401263af60067c4c6629faa249e08dc6bb7a339018ed6a75a08596c59f572a729aac7403fc9d98dde mulmodmont 6 0x76e24fb33ad59f53d5f68f7a71248784a726f27d1649ec7e3b7ccfaf14235257d64c15388abb2e62c762fcc25ebb2e55 0x51c58d1f8ad988986be65b60e8929606aba9f76ddfd28073370d7c2841aebe723f891cc834ff52ba676e1e9d45338b85 0xa92e90f9968fb7cc7645407963ff53e9d690027f3a819963309b0c6ec6696d489a755912317960558e1a07e5e214d103 0xfcb37aa43d8705161bd62d656f6ca6b0b2fca595fd357740790b324c8b300f123755568f60419ec7bb4b9d706ff2de55 0x1dfa21ed65ff96c10be48cfab796345d0bdefbd7d43623049d04d0cc7a249d61fe387b629f2e73617406846852801f29 mulmodmont 6 0x277e599db1c0bcbebf924aa2abc20a7d77d2867060f8af4c740fcaa216b4cc7b9d6f9ff8445bf362abd19fb37efae80d 0x38040d3d21a4698711b7dd546dfadc8e1642ed89e5b229eff8c8b32e08fd4c43a33ae8936dff4e94d9c7c12176d2a5b9 0x4857f436881cd7ded524bee5d3a47f58613406efd1e31924ce0611cfa57903c72591aba620f0f7c72474ff01625f1c37 0xfc7b53e69eedeaba65acb2ec9e00d698cfe7542d17266973972374ea3a5e956cd131d544f58ced1d181c2c88f140a679 0x1585609d599513c6a415aa352e6db85748c402d110fc2ec2a564fba2f894ff09bd8f94ff42973ad9db87b81677e0a4d3 mulmodmont 6 0x19edf86ced41b1718b069ebcdf550a5934febacf5abc0e5fe2d54b8202dc1e6d1c5321b86ba227285746124341d41518 0x56b1734819b207b0137828096ed32b6179b48eb5c834fc91eb1aa84bcfcd3647e42a2e4b6c0781fde52188ed5e535f4 0x551467287dfecb955b5a0ade98703f6c9a801374f63f911990cd86ed485df0509a4821cef293f7b10e41275987a5bf0b 0x5797775de1a05297bfbad892c8a335e02b4508f3fb62d1bfc1f9efab68742cc6e4ce457ff928ceccf1b24fec4bd06b5d 0x373f2a02554eb2d7266bb9535a29b00fc6adfcb8cec0ee7689e1edc302ff3db0738974a125dfeab1c22ac7987f8ff715 mulmodmont 6 0x3dbcacd84b15077c92e22cd2fd2b65a97f2c8856ac90fe8099dc48c3282b10a5188401ceb98589470cf984a8ea700075 0x1df20b8f74508000fafcf24f62d1ef375fce2e615c6198726cea7aa0675aa7f41b6741156ee1f7b30db196c475581533 0x89903f7cbd80599c7031786e0107d9c7b679d804a313045e34a2de4777ed4ba390015021be400c1bb698049443574c37 0x4b375d87900eade668991eeecb2e3b714a673896ec3e4ff3d6b199f3dccf8f57fdd5c1776f2eab438ad4a02680b1d679 0x2f18e89f291f1602f3830f997dfb8a2071ec6ecc5475292f578cbed3c9c48ebf414377e940bf5ea1440de7d4bbedf468 ``` ## Reference Implementation Montgomery arithmetic operations are very common in cryptography libraries. C and Python: https://github.com/poemm/bigint_experiments, includes various algorithms for `addmodmont`, `submodmont`, `mulmodmont`, and `compute_Nprime`. ## Security Considerations These opcodes are parameterized, and touch variable lengths of memory. This is nothing new, since we already have `SHA3`. The new security concern is that we touch four arbitrary memory offsets. This warranted a [security analysis](https://notes.ethereum.org/@poemm/EVM384SecurityAnalysis) which influenced our decision to limit our addressable memory offsets to 2^16 bytes. We need a security analysis of different algorithms for montgomery multiplication, and whether they agree for "strange" inputs. Gas cost rules are formulas. This is nothing new, since we already have opcodes like `EXP`. A security analysis is needed for gas costs. Gas costs for `MULMONT` are aggressive. Clients are expected to use hardware instructions to multiply 64-bits times 64-bits to 128-bits. Most modern architectures (ARMv8, x86_64) support such multiplications. A decision needs to be made whether to restrict the maximum modulus size.