We use notation from the yellowpaper.
value | Mnemonic | delta | alpha | Description |
---|---|---|---|---|
0xc0 | ADDMOD384 | (below) | (below) | (below) |
0xc1 | SUBMOD384 | (below) | (below) | (below) |
0xc2 | MULMODMONT384 | (below) | (below) | (below) |
Recall from the yellowpaper that delta is number of elements popped from stack and alpha is number of elements pushed to stack. In the descriptions below, we replace yellowpaper symbols πs, πm, and πi with stack
, memory
, and memoryLength
, respectively. In particular, stack[0]
refers to the first item popped from the stack, stack[1]
is the second item popped from the stack, and so on. stack'[0]
and stack'[1]
are the top and second-to-top item pushed to the stack by the operation. memory[i..i+48]
refers to the memory bytes from byte i
to i+48
. memoryLength
is the current number of 32-byte chunks in memory. Finally, function M(current_memoryLength, start_byte_offset, bytelength_accessed)
is the βmemory expansion for range functionβ from the yellowpaper H.1.
Intuitively, version 0 operates on values in the stack and doesnβt touch memory.
x = (stack[0] | stack[1])[0..48]
y = (stack[2] | stack[3])[0..48]
mod = (stack[4] | stack[5])[0..48]
stack'[0] | stack'[1] = = [ x + y if x+y < mod
[ x + y - mod otherwise
Where | is concatenation.
Where x, y, and mod are in little-endian and the result is little-endian.
x = (stack[0] | stack[1])[0..48]
y = (stack[2] | stack[3])[0..48]
mod = (stack[4] | stack[5])[0..48]
stack'[0] | stack'[1] = = [ x - y if x-y >= 0
[ x - y + mod otherwise
Where | is concatenation.
Where x, y, and mod are in little-endian and the result is little-endian.
x = (stack[0] | stack[1])[0..48]
y = (stack[2] | stack[3])[0..48]
mod[0:48] | r_inv[0:8] = ((stack[4] | stack[5])[0..58]
stack'[0] | stack'[1] = (x * y * r_inv) % mod
Where | is concatenation.
Where the multiplication operation is montgomery multiplication, even if modulus is even.
Where x, y, mod, and r_inv are in little-endian and the result is little-endian.
Intuitively, version 1 stack items are offsets to memory memory, and the output clobbers the first input value.
x = memory[stack[0]..(stack[0]+48)]
y = memory[stack[1]..(stack[1]+48)]
mod = memory[stack[2]..(stack[2]+48)]
memory[stack[0]..(stack[0]+48)] = [ x + y if x+y < mod
[ x + y - mod otherwise
Where x, y, and mod are in little-endian and the result is little-endian.
memoryLength = M( M( M( memoryLength, stack[0], 48 ), stack[1], 48 ), stack[2], 48 )
x = memory[stack[0]..(stack[0]+48)]
y = memory[stack[1]..(stack[1]+48)]
mod = memory[stack[2]..(stack[2]+48)]
memory[stack[0]..(stack[0]+48)] = [ x - y if x-y >= 0
[ x - y + mod otherwise
Where x, y, and mod are in little-endian and the result is little-endian.
memoryLength = M( M( M( memoryLength, stack[0], 48 ), stack[1], 48 ), stack[2], 48 )
x = memory[stack[0]..(stack[0]+48)]
y = memory[stack[1]..(stack[1]+48)]
mod = memory[stack[2]..(stack[2]+48)]
r_inv = stack[3]
memory[stack[0]..(stack[0]+48)] = (x * y * r_inv) % mod
Where the multiplication operation is montgomery multiplication, even if modulus is even.
Where x, y, mod, and r_inv are in little-endian, and the result is little-endian.
memoryLength = M( M( M( memoryLength, stack[0], 48 ), stack[1], 48 ), stack[2], 48 )
Intuivively, the stack items are offsets to values in memory. Note: Interface 3 is interface 2, but, unsafely, omits memoryLength checks and updates.
x = memory[stack[1]..(stack[1]+48)]
y = memory[stack[2]..(stack[2]+48)]
mod = memory[stack[3]..(stack[3]+48)]
memory[stack[0]..(stack[0]+48)] = [ x + y if x+y < mod
[ x + y - mod otherwise
Where x, y, and mod are in little-endian and the result is little-endian.
memoryLength = M( M( M( memoryLength, stack[0], 48 ), stack[1], 48 ), stack[2], 48 )
x = memory[stack[1]..(stack[1]+48)]
y = memory[stack[2]..(stack[2]+48)]
mod = memory[stack[3]..(stack[3]+48)]
memory[stack[0]..(stack[0]+48)] = [ x - y if x-y >= 0
[ x - y + mod otherwise
Where x, y, and mod are in little-endian and the result is little-endian.
memoryLength = M( M( M( memoryLength, stack[0], 48 ), stack[1], 48 ), stack[2], 48 )
x = memory[stack[1]..(stack[1]+48)]
y = memory[stack[2]..(stack[2]+48)]
mod = memory[stack[3]..(stack[3]+48)]
r_inv = stack[4]
memory[stack[0]..(stack[0]+48)] = (x * y * r_inv) % mod
Where the multiplication operation is montgomery multiplication, even if modulus is even.
Where x, y, mod, and r_inv are in little-endian and the result is little-endian.
memoryLength = M( M( M( memoryLength, stack[0], 48 ), stack[1], 48 ), stack[2], 48 )
Intuitively, this is version 2 and 3 but no stack items for mod
and inv
, they are somehow hardcoded. Version 5 is version 4, but, unsafely, omits memoryLength checks and updates.
x = memory[stack[1]..(stack[1]+48)]
y = memory[stack[2]..(stack[2]+48)]
mod = hard-coded value for modulus
memory[stack[0]..(stack[0]+48)] = [ x + y if x+y < mod
[ x + y - mod otherwise
Where x, y, and mod are in little-endian and the result is little-endian.
memoryLength = M( M( M( memoryLength, stack[0], 48 ), stack[1], 48 ), stack[2], 48 )
x = memory[stack[1]..(stack[1]+48)]
y = memory[stack[2]..(stack[2]+48)]
mod = hard-coded value for modulus
memory[stack[0]..(stack[0]+48)] = [ x - y if x-y >= 0
[ x - y + mod otherwise
Where x, y, and mod are in little-endian and the result is little-endian.
memoryLength = M( M( M( memoryLength, stack[0], 48 ), stack[1], 48 ), stack[2], 48 )
x = memory[stack[1]..(stack[1]+48)]
y = memory[stack[2]..(stack[2]+48)]
mod = mod = hard-coded value for modulus
r_inv = hard-coded value for r_inv
memory[stack[0]..(stack[0]+48)] = (x * y * r_inv) % mod
Where the multiplication operation is montgomery multiplication, even if modulus is even.
Where x, y, mod, and r_inv are in little-endian and the result is little-endian.
memoryLength = M( M( M( memoryLength, stack[0], 48 ), stack[1], 48 ), stack[2], 48 )
Intuitively, the beginning of memory has 256 virtual registers, each 48 bytes long. For example, memory[0..48]
is the 0th register and memory[12240..12288]
is the 255th register. The 48-byte modulus must be in memory[12288..12336]
and the 8-byte inverse must be in memory[12336..12344]
. Each bigint opcode takes a single stack item whose least-significant bytes correspond to virtual register indicies of inputs and outputs.
out_idx = stack[0][29] i.e. the third least-significant byte in the top stack item in big-endian
x_idx = stack[0][30] i.e. the second least-significant byte in the top stack item in big-endian
y_idx = stack[0][31] i.e. the least-significant byte in the top stack item in big-endian
x = memory[x_idx*48..x_idx*48+48]
y = memory[y_idx*48..y_idx*48+48]
mod = memory[12288..12336]
memory[out_idx*48..out_idx*48+48] = [ x + y if x+y < mod
[ x + y - mod otherwise
Where x, y, and mod are in little-endian and the result is little-endian.
Must also check memory length and possibly grow the memory.
memoryLength = M( memoryLength, 12288, 48 )
out_idx = stack[0][29]
x_idx = stack[0][30]
y_idx = stack[0][31]
x = memory[x_idx*48..x_idx*48+48]
y = memory[y_idx*48..y_idx*48+48]
mod = memory[12288..12336]
memory[out_idx*48..out_idx*48+48] = [ x - y if x-y >= 0
[ x - y + mod otherwise
Where x, y, and mod are in little-endian and the result is little-endian.
Must also check memory length and possibly grow the memory.
memoryLength = M( memoryLength, 12288, 48 )
out_idx = stack[0][29]
x_idx = stack[0][30]
y_idx = stack[0][31]
x = memory[x_idx*48..x_idx*48+48]
y = memory[y_idx*48..y_idx*48+48]
mod = memory[12288..12336]
r_inv = memory[12336..12344]
memory[out_idx*48..out_idx*48+48] = (x * y * r_inv) % mod
Where the multiplication operations together are montgomery multiplication, even if modulus is even.
Where x, y, r_inv, and mod are in little-endian and the result is little-endian.
Must also check memory length and possibly grow the memory.
memoryLength = M( memoryLength, 12288, 56 )
Intuitively, this is 6 with packed 4-byte offsets instead of register indices, including an offset to mod and inv
out_offset = stack[0][16:20]
x_offset = stack[0][20:24]
y_offset = stack[0][24:28]
mod_offset = stack[0][28:32] i.e. least-significant bytes of the big-endian stack item.
Where each offset is a big-endian unsigned 32-bit integer.
x = memory[x_offset:x_offset+48]
y = memory[y_offset:y_offset+48]
mod = memory[mod_offset:mod_offset+48]
memory[out_offset:out_offset+48] = [ x + y if x+y < mod
[ x + y - mod otherwise
Where x, y, mod, and result are integers of at-most 384-bits stored in memory as little-endian.
Must also check memory length and possibly grow the memory.
memoryLength = M( M( M( M( memoryLength, x_offset, 48 ), y_offset, 48 ), out_offset, 48 ), mod_offset, 48 )
out_offset = stack[0][16:20]
x_offset = stack[0][20:24]
y_offset = stack[0][24:28]
mod_offset = stack[0][28:32]
Where each offset is a big-endian unsigned 32-bit integer.
x = memory[x_offset:x_offset+48]
y = memory[y_offset:y_offset+48]
mod = memory[mod_offset:mod_offset+48]
memory[out_offset:out_offset+48] = [ x - y if x-y >= 0
[ x - y + mod otherwise
Where x, y, mod, and result are integers of at-most 384-bits stored in memory as little-endian.
Must also check memory length and possibly grow the memory.
memoryLength = M( M( M( M( memoryLength, x_offset, 48 ), y_offset, 48 ), out_offset, 48 ), mod_offset, 48 )
out_offset = stack[0][16:20]
x_offset = stack[0][20:24]
y_offset = stack[0][24:28]
mod_and_inv_offset = stack[0][28:32]
Where each offset is a big-endian unsigned 32-bit integer.
x = memory[x_offset:x_offset+48]
y = memory[y_offset:y_offset+48]
mod = memory[mod_rinv_offset:mod_rinv_offset+48]
r_inv = memory[mod_rinv_offset+48:mod_rinv_offset+56]
memory[out_offset:out_offset+48] = (x * y * r_inv) % mod
Where the multiplication operation is montgomery multiplication, even if modulus is even.
Where x, y, mod, and result are integers of at-most 384-bits stored in memory as little-endian. Similarly, mod_r is an integer of at-most 64-bits stored in memory as little-endian.
Must also check memory length and possibly grow the memory.
memoryLength = M( M( M( M( memoryLength, x_offset, 48 ), y_offset, 48 ), out_offset, 48 ), mod_and_inv_offset, 56 )
Like version 7, but using big-endian encoding for 384-bit values in memory.
Like version 7, but takes an immediate instead of a stack item.
input = c(πpc+1,πpc+2,...,πpc+16)
where c() is defined in the yellowpaper when defining PUSH*.
out_offset = input[16:20]
x_offset = input[20:24]
y_offset = input[24:28]
mod_offset = input[28:32] i.e. least-significant bytes of the big-endian stack item.
Where each offset is a big-endian unsigned 32-bit integer.
x = memory[x_offset:x_offset+48]
y = memory[y_offset:y_offset+48]
mod = memory[mod_offset:mod_offset+48]
memory[out_offset:out_offset+48] = [ x + y if x+y < mod
[ x + y - mod otherwise
Where x, y, mod, and result are integers of at-most 384-bits stored in memory as little-endian.
Must also check memory length and possibly grow the memory.
memoryLength = M( M( M( M( memoryLength, x_offset, 48 ), y_offset, 48 ), out_offset, 48 ), mod_offset, 48 )
input = c(πpc+1,πpc+2,...,πpc+16)
where c() is defined in the yellowpaper when defining PUSH*.
out_offset = input[16:20]
x_offset = input[20:24]
y_offset = input[24:28]
mod_offset = input[28:32]
Where each offset is a big-endian unsigned 32-bit integer.
x = memory[x_offset:x_offset+48]
y = memory[y_offset:y_offset+48]
mod = memory[mod_offset:mod_offset+48]
memory[out_offset:out_offset+48] = [ x - y if x-y >= 0
[ x - y + mod otherwise
Where x, y, mod, and result are integers of at-most 384-bits stored in memory as little-endian.
Must also check memory length and possibly grow the memory.
memoryLength = M( M( M( M( memoryLength, x_offset, 48 ), y_offset, 48 ), out_offset, 48 ), mod_offset, 48 )
input = c(πpc+1,πpc+2,...,πpc+16)
where c() is defined in the yellowpaper when defining PUSH*.
out_offset = input[16:20]
x_offset = input[20:24]
y_offset = input[24:28]
mod_and_inv_offset = stack[0][28:32]
Where each offset is a big-endian unsigned 32-bit integer.
x = memory[x_offset:x_offset+48]
y = memory[y_offset:y_offset+48]
mod = memory[mod_rinv_offset:mod_rinv_offset+48]
r_inv = memory[mod_rinv_offset+48:mod_rinv_offset+56]
memory[out_offset:out_offset+48] = (x * y * r_inv) % mod
Where the multiplication operation is montgomery multiplication, even if modulus is even.
Where x, y, mod, and result are integers of at-most 384-bits stored in memory as little-endian. Similarly, mod_r is an integer of at-most 64-bits stored in memory as little-endian.
Must also check memory length and possibly grow the memory.
memoryLength = M( M( M( M( memoryLength, x_offset, 48 ), y_offset, 48 ), out_offset, 48 ), mod_and_inv_offset, 56 )
Interface 10 is like interface 7, but with two bytes per offset, instead of four. This is to reduce the address space, to reduce the possibility of DoS attacks investigated here.
Recall Yellowpaper notation:
M(current_memoryLength, start_byte_offset, bytelength_accessed)
is the βmemory expansion functionβ from the yellowpaper H.1.value | Mnemonic | π | π | Description | Gas cost |
---|---|---|---|---|---|
0xc0 | ADDMONT |
0 | 0 | (below) | 2 gas plus 1 gas after each 8 limbs |
0xc1 | SUBMONT |
0 | 0 | (below) | 2 gas plus 1 gas after each 8 limbs |
0xc2 | MULMONT |
0 | 0 | (below) | 2 gas plus floor((n^2)/8) gas for n limbs |
Table 1. Data for Yelowpaper H.2, plus gas cost formulas.
The above gas cost formulas will go in Yellowpaper appendix H.1 function C()
, but we also give some explicit costs for quick reference.
num_64bit_limbs: | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 32 | 64 | 128 | 256 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ADDMONT |
2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 5 | 9 | 17 | 33 |
SUBMONT |
2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 5 | 9 | 17 | 33 |
MULMONT |
2 | 2 | 3 | 4 | 5 | 6 | 8 | 10 | 12 | 14 | 17 | 20 | 23 | 26 | 30 | 34 | 130 | 514 | 2050 | 8194 |
Table 2. Gas costs for certain numbers of 64-bit limbs. |
Finally, the opcode descriptions.
ADDMONT
description:immediate = c(π.pc+1,π.pc+2,...,π.pc+10)
where c() (defined with PUSH*) handles overflow as zeros.
Assert immediate[0] == 0x68 (i.e. the leftmost byte is PUSH9), otherwise halt with exception
n = 1 + immediate[1]
n is interpreted as the number of 8-byte words for our big integers.
out_offset = immediate[2..3]
x_offset = immediate[4..5]
y_offset = immediate[6..7]
mod_offset = immediate[8..9]
Each offset is interpreted as a little-endian integer.
x = memory[x_offset..x_offset+n*8]
y = memory[y_offset..y_offset+n*8]
mod = memory[mod_offset..mod_offset+n*8]
x, y, and mod are interpreted as little-endian integers.
memory[out_offset..out_offset+n*8] = [ x + y if x+y < mod
[ x + y - mod otherwise
The result is stored in memory in little-endian, up to n*8 bytes.
Memory length is checked and possibly grown.
memoryLength = M( memoryLength, max(x_offset,y_offset,mod_offset,out_offset), n*8 )
SUBMONT
description:immediate = c(π.pc+1,π.pc+2,...,π.pc+10)
where c() (defined with PUSH*) handles overflow as zeros.
Assert immediate[0] == 0x68 (i.e. the leftmost byte is PUSH9), otherwise halt with exception
n = 1 + immediate[1]
n is interpreted as the number of 8-byte words for our big integers.
out_offset = immediate[2..3]
x_offset = immediate[4..5]
y_offset = immediate[6..7]
mod_offset = immediate[8..9]
Each offset is interpreted as a little-endian integer.
x = memory[x_offset..x_offset+n*8]
y = memory[y_offset..y_offset+n*8]
mod = memory[mod_offset..mod_offset+n*8]
x, y, and mod are interpreted as little-endian integers.
memory[out_offset..out_offset+n*8] = [ x - y if x-y >= 0
[ x - y + mod otherwise
The result is stored in memory in little-endian, up to n*8 bytes.
Memory length is checked and possibly grown.
memoryLength = M( memoryLength, max(x_offset,y_offset,mod_offset,out_offset), n*8 )
MULMONT
description:immediate = c(π.pc+1,π.pc+2,...,π.pc+10)
where c() (defined with PUSH*) handles overflow as zeros.
Assert immediate[0] == 0x68 (i.e. the leftmost byte is PUSH9), otherwise halt with exception
n = 1 + immediate[1]
n is interpreted as the number of 8-byte words for our big integers.
out_offset = immediate[2..3]
x_offset = immediate[4..5]
y_offset = immediate[6..7]
mod_offset = immediate[8..9]
Each offset is interpreted as a little-endian integer.
x = memory[x_offset..x_offset+n*8]
y = memory[y_offset..y_offset+n*8]
mod = memory[mod_offset..mod_offset+n*8]
neg_mod_inv = memory[mod_offset+n*8..mod_offset+n*8+8]
x, y, mod, and neg_mod_inv are interpreted as a little-endian integers.
memory[out_offset..out_offset+n*8] = REDC_multiprecision(x*y,mod,neg_mod_inv,n)
Where the intermediate multiplication operation result may have double the bits.
# Python code:
def REDC_multiprecision(T,N,Nprime,n):
# following the original [Montgomery85, section 2] notation and algorithm
# T is the bigint value to be reduced
# N is the bigint modulus
# Nprime is the 64-bit montgomery parameter
# n is the number of limbs of the modulus
b = 2**64 # the base
# convert inputs to limbs in base b, least-significant limb first
T_=[0]*(2*n+1) # with extra limb for carries
for i in range(2*n): T_[i] = T%b; T = T//b
N_=[0]*n
Nbigint = N # must not destroy N since need it later
for i in range(n): N_[i] = Nbigint%b; Nbigint = Nbigint//b
# main loop
c = 0
for i in range(n):
# from [Montgomery85]: (d T_{i+n-1} ... T_i)_b <- (0 T_{i+n-1} ... T_i)_b + N*(T_i*N' mod R)
TixNprime = (T_[i]*Nprime) % b
d = 0
for j in range(n):
temp = T_[i+j] + N_[j]*TixNprime + d
T_[i+j] = temp % b
d = temp // b
# from [Montgomery85]: (c T_{i+n})_b <- c + d + T_{i+n}
temp = c + d + T_[i+n]
T_[i+n] = temp % b
c = temp // b
T_[2*n] += c
# convert result T_ back to bigint, shifting (ignoring) lowest n limbs
assert T==0
for i in range(n+1): T += T_[n+i] * b**i
# finally, subtract the modulus if we exceed it
if T>=N:
return T-N
else:
return T
The result is stored in memory in little-endian, up to n*8 bytes.
Memory length is checked and possibly grown.
memoryLength = M( memoryLength, max(x_offset,y_offset,mod_offset+8,out_offset), n*8 )
Finally, we modify Yellowpaper section 9.4.3 function N(i,w)
to jump forward from bytecode offset i to i+11 if opcode w is ADDMONT
, SUBMONT
, or MULMONT
.
Note that this means that the PUSH9
opcode in the immediate data right after *MONT
opcodes is not actually executed, it is just there to not break backwards compatibility with immediates clobbering a JUMPDEST
.
Like interface 11, except we have an extra opcode, SETMOD
, which stores a new machine state entry, π .mc, the modulus context, with entries π.mc.modulus and π.mc.length, where the length counts number of 64-bit limbs. At each transaction, the modulus context is initialized to 0 length, but it is not reset at each contract call.
value | Mnemonic | π | π | Description | Gas cost |
---|---|---|---|---|---|
0xc3 | SETMOD |
2 | 0 | (below) | 2 gas plus n*ceil(log2(n)) for n 64-bit limbs |
Table 1. Data for Yelowpaper H.2, plus a gas cost formula.
The above gas cost formulas will go in Yellowpaper appendix H.1 function C()
, but we also give some explicit costs for quick reference.
num_64bit_limbs: | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 32 | 64 | 128 | 256 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SETMOD |
3 | 6 | 8 | 14 | 17 | 20 | 23 | 34 | 38 | 42 | 46 | 50 | 54 | 58 | 62 | 82 | 194 | 450 | 1026 | 2306 |
Table 2. Gas costs for certain numbers of 64-bit limbs. |
SETMOD
description:offset = π.s[0] i.e. top stack item
length = π.s[1] i.e. second stack item
offset and length are interpreted as big-endian integers.
If length>2^14 (this may be increased later), only pop stack items and go to the next opcode.
π.mc.length = ceil((max_{0<=i<length} π.m[offset:offset+i]!=0x00)/8), i.e. the most-significant byte index's 64-bit limb index
π.mc.modulus = π.m[offset..offset+8*mu.mc.length]
The modulus is interpreted as a little-endian integer.
Implementer's note:
For odd modulus, it may be wise to also precompute things for fast `MULMONT`.
Using notation in [Montgomery85], N=modulus and R=2^(bytelength), the value to precompute is N' which satisfies R*Rinv-N*N'=1.
(Note, Rinv doesn't need to be computed.) This value N' can be computed using the extended Euclidean algorithm (TODO: details).
Note that N' exists because R and N are coprime.
Note that when using a multi-precision montgomery multiplication algorithm, it may be sufficient to compute just the least-significant 64-bit limb of N'.
Store this N' in optional field mu.mc.N'.
ADD/SUB/MULMONT
These are similar, but they read 7 immediate bytes starting with 0x66 (i.e. push7), they donβt read mod from memory, instead using the modulus π.mc.modulus. Another important difference is that the number of 64-bit limbs for inputs and outputs equals the number of limbs for π.mc.modulus.
MULMONT
is much easier to specify:
If mu.mc.modulus is odd:
π.m[out_offset..out_offset+π.mc.length*8] = (x*y*Rinv) % mod
Where the intermediate multiplication operation may have more bits.
Where Rinv is the value which satisfies RRinv-NN'=1, bute we may never have to compute it.
If mu.mc.modulus is even: don't write anything to memory. (This may be changed.)
Finally, the jumpdest jumps forward 7 instead of 11 bytes for interface 11.
This is interface 12 plus covering the case for even modulus.
This is a WIP. Most recent draft is: https://notes.ethereum.org/@poemm/eip_draft_evm_modular_arithmetic_extensions