# ZKVM Guest Program Memory Layout

This document presents the different approaches to guest program memory layout implemented by ZKVM projects and aims to start a discussion about a common [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format)-level memory layout that all ZKVMs could use. Today, memory layouts differ across ZKVM implementations, so a guest program has to be recompiled for each ZKVM with a ZKVM-specific memory layout defined by a custom linker script or linker flags.

## Before We Start

All described ZKVM guest program binaries are in [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) format, and the compilation target is always [RISC-V](https://en.wikipedia.org/wiki/RISC-V) with different extension sets, each some combination of the I, M, and A extensions. The detailed compilation target for each case is given below.

Before getting into details, we need to define the memory section types that a guest program binary consists of. After loading, each program occupies some memory space, which is split into several section types depending on their role. We define them as:

- **Program I/O** section is used by the ZKVM to pass program input data and receive program output.
- **ZKVM data** section contains ZKVM implementation-specific data. It can be, for example, offsets in memory to some specific data, runtime flags, or RISC-V register values.
- **Code** section consists of any ELF section which occupies memory during execution. These can be:
  - `.text`, `.init` - executable code sections, read-only
  - `.rodata` - read-only arbitrary data
  - `.data`, `.bss` - arbitrary data
  - Others depending on ZKVM implementation.
- **Stack** - memory section reserved for program stack.
- **Heap** - memory section reserved for heap memory.

Each memory layout is described in a table with the virtual memory start address and size of each section. `.` means "end of last section".
`ALIGN(k)` means that the section start is aligned on a `k` byte boundary in memory. Memory access modes are:

- `ro` - read only
- `wo` - write only
- `rw` - read or write
- `na` - no access

It is worth noting that different ZKVMs implement program input/output, panic handling, and program termination in different ways. This document also describes these methods because they are related to guest program memory layouts.

## Jolt

Sources:

- https://jolt.a16zcrypto.com/how/architecture/ram.html#address-remapping
- https://github.com/a16z/jolt/blob/v0.2.1-alpha/common/src/constants.rs
- https://github.com/a16z/jolt/blob/v0.2.1-alpha/jolt-core/src/host/program.rs
- https://github.com/a16z/jolt/blob/v0.2.1-alpha/jolt-core/src/host/mod.rs
- https://github.com/a16z/jolt/blob/v0.2.1-alpha/common/src/jolt_device.rs

Compilation target: `riscv32im`

| Section | Start | Size | Access |
| ------------ | ---------------- | --------------------------- | --------- |
| **Program I/O** | | | |
| `input` | `0x80000000 - io_size` | `max input size` | `ro` |
| `output` | `.` | `max output size` | `wo` |
| **ZKVM data** | | | |
| `panic flag` | `.` | `4 bytes` | `rw` |
| `termination flag` | `.` | `4 bytes` | `wo` |
| `zero padding?` | `.` | up to `0x7fffffff` | |
| **Code** | | `code size` (max 128MB) | |
| `.text.boot` | `0x80000000` | `section size` | `ro` |
| `.text` | `.` | `section size` | `ro` |
| `.data` | `.` | `section size` | `rw` |
| `.bss` | `.` | `section size` | `rw` |
| **Stack** | `. ALIGN(8)` | `STACK_SIZE (4KB default)` | `rw` |
| **Heap** | `. ALIGN(8)` | `heap size` | `rw` |

### Comments

- Program input and output occupy a dedicated memory space at the beginning of the loaded binary virtual memory space.
- `panic` and `termination` flags are used to terminate the guest program. The values at these memory addresses are checked instantly by the VM, and if any of them is set to `1`, the program is terminated.
- `code size` is limited to `128MB` (`EMULATOR_MEMORY_CAPACITY`).
- `STACK_SIZE + Heap size` is limited to `32MB` by default. It can be customized with the `jolt::provable` macro, for example `jolt::provable(stack_size = 10000, memory_size = 10000000)`.
- The maximum size of the Program I/O sections and the stack size can be customized with the `jolt::provable` macro.
- `io_size` is the combined size of the input, the output, and the panic and termination flags.

## Nexus

Sources:

- https://docs.nexus.xyz/zkvm/specifications/arch
- https://github.com/nexus-xyz/nexus-zkvm/blob/v0.3.4/sdk/src/compile/linker-scripts/default.x
- https://github.com/nexus-xyz/nexus-zkvm/blob/v0.3.4/runtime/src/runtime.rs#L61-L109
- https://github.com/nexus-xyz/nexus-zkvm/blob/v0.3.4/runtime/src/alloc.rs#L20-L76
- https://github.com/nexus-xyz/nexus-zkvm/blob/v0.3.4/vm/src/emulator/layout.rs#L39

Compilation target: `riscv32i`

| Section | Start | Size | Access |
| ------------ | ---------------- | --------------------------- | ------ |
| **ZKVM data** | | | |
| `rv32 registers` | `0x00` | `128 bytes` | `rw` |
| `public input offset` | `0x80` | `4 bytes` | `ro` |
| `public output offset` | `0x84` | `4 bytes` | `wo` |
| **Code** | | | |
| `.text` | `0x88` | `.text size` | `ro` |
| `.srodata` | `.` | `.srodata size` | `ro` |
| `.rodata` | `.` | `.rodata size` | `ro` |
| `.sdata` | `.` | `.sdata size` | `rw` |
| `.data` | `.` | `.data size` | `rw` |
| `.sbss` | `.` | `.sbss size` | `rw` |
| `.bss` | `.` | `.bss size` | `rw` |
| **Program I/O (input)** | | | |
| `input` | `.` | `input size` | `ro` |
| `exit code` | `.` | `4 bytes` | `wo` |
| `output` | `.` | `output size` | `wo` |
| **Heap** | `. ALIGN(4) (heap grows up)` | `heap size` | `rw` |
| **Stack** | `0x80400000 (stack grows down)` | `stack size` | `rw` |
| **ZKVM data** | | | |
| Associated data (AD) | `0x80400000` | `AD size` | `na` |

### Comments

- Program I/O occupies a dedicated memory space right after the actual program code.
  The location of the data depends on program size, so offset values had to be introduced.
- Program termination is implemented via system `ecall`. The exit code is taken from a dedicated memory location.
- AD contains contextual information about execution or the context of proving. This can be, for example, a hash of the program file in high-level language.
- The whole binary cannot use more than `0x80400000` bytes (2GB + 4MB).
- The program is executed in two separate passes to limit memory usage: ["the machine architecture operates on a two-pass tracing model: the program is first executed in a (mostly) traditional Harvard architecture, and statistics are kept as to the resultant memory usage. The guest program is then executed again using the same inputs in a modified Harvard architecture with a fixed-memory organization determined from the statistics of the first execution, which is more conducive to proving."](https://docs.nexus.xyz/zkvm/specifications/arch#execution-model)

## OpenVM

Sources:

- https://docs.openvm.dev/specs/reference/rust-frontend#guest-runtime
- https://docs.openvm.dev/specs/openvm/isa#virtual-machine-state
- https://github.com/openvm-org/openvm/blob/v1.4.0/crates/toolchain/build/src/lib.rs#L291
- https://github.com/openvm-org/openvm/blob/v1.4.0/crates/toolchain/platform/src/memory.rs

Compilation target: `riscv32im`

| Section | Start | Size | Access |
| ------------ | ---------------- | --------------------------- | ------ |
| **Stack** | `0x00200400 (stack grows down)` | | `rw` |
| **Code** | `0x00200800` | `code size` | |
| `.text` | `0x00200800` | `.text size` | `ro` |
| `.rodata` | `.` | `.rodata size` | `ro` |
| `.eh_frame` | `.` | `.eh_frame size` | `ro` |
| `.bss` | `.` | `.bss size` | `rw` |
| **Heap** (contains program I/O) | `. ALIGN(4)` | `heap size` | `rw` |

### Comments

- All guest program memory cannot exceed `0x20000000` bytes (512MB).
- `GUEST_MIN_MEM=0x00000400`. Below this address there is probably some reserved space; this is not yet clear from the documentation.
- Program input and output are stored in a heap memory section according to program logic.
- Program termination is implemented via a custom RISC-V instruction.
- The RISC-V ELF binary is transpiled into an OpenVM executable. Register and memory accesses are translated into accesses to different [OpenVM memory address spaces](https://docs.openvm.dev/specs/openvm/isa#address-spaces). For example, register values are mapped into address space `1` and user memory accesses into address space `2`.

## Pico

Compilation target: `riscv32im`

Sources:

- https://github.com/brevis-network/pico/blob/v1.1.7/sdk/sdk/src/riscv_ecalls/memory.rs
- https://github.com/brevis-network/pico/blob/v1.1.7/vm/src/emulator/riscv/emulator/mod.rs#L1294

| Section | Start | Size | Access |
| ------------ | ---------------- | --------------------------- | ------ |
| **Stack** | `0x00200400 (stack grows down)` | TODO | `rw` |
| **Code** | `0x00200800` | `code size` | |
| `.text` | `0x00200800` | `.text size` | `ro` |
| `.rodata` | `.` | `.rodata size` | `ro` |
| `.eh_frame` | `.` | `.eh_frame size` | `ro` |
| `.bss` | `.` | `.bss size` | `rw` |
| **Heap** (contains program I/O) | `. ALIGN(4)` | `heap size` | `rw` |

### Comments

- All guest program memory cannot exceed `0x78000000` bytes (1920MB).
- Similar to OpenVM except for maximum memory size.
- Program input and output are stored in a heap memory section according to program logic.
- Program termination is implemented via system `ecall`.
- Register values are stored at the beginning of the memory space.
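The `.` and `ALIGN(k)` notation used in the tables can be resolved into concrete addresses once section sizes are known. The following is a minimal sketch of that resolution, not code from any ZKVM: the helper names `align_up` and `resolve` are hypothetical, and the section sizes are made-up values chosen only to show the arithmetic. The start address `0x00200800` and the `ALIGN(4)` on the heap come from the table above.

```python
def align_up(addr, k):
    """Round addr up to the next multiple of k (k must be a power of two)."""
    assert k > 0 and k & (k - 1) == 0, "alignment must be a power of two"
    return (addr + k - 1) & ~(k - 1)


def resolve(sections):
    """Resolve a list of (name, start, size, align) rows into address ranges.

    start=None plays the role of "." in the tables: the section begins
    where the previous one ended, rounded up to `align` bytes.
    Returns {name: (start, end)} with `end` exclusive.
    """
    cursor = 0
    resolved = {}
    for name, start, size, align in sections:
        begin = align_up(cursor if start is None else start, align)
        resolved[name] = (begin, begin + size)
        cursor = begin + size
    return resolved


# Hypothetical section sizes; only the start address and ALIGN(4) are from the table.
layout = resolve([
    (".text", 0x00200800, 0x1234, 1),
    (".rodata", None, 0x101, 1),
    (".eh_frame", None, 0x40, 1),
    (".bss", None, 0x22, 1),
    ("heap", None, 0x1000, 4),
])

for name, (begin, end) in layout.items():
    print(f"{name:10s} {begin:#010x} .. {end:#010x}")
```

With these sizes `.bss` ends at an address that is not a multiple of four, so the heap start is pushed up to the next 4-byte boundary.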
## Risc0

Compilation target: `riscv32im`

Sources:

- https://dev.risczero.com/api/zkvm/zkvm-specification#zkvm-memory-layout
- https://github.com/risc0/risc0/blob/v3.0.3/risc0/zkvm/platform/src/memory.rs
- https://github.com/risc0/risc0/blob/v3.0.3/risc0/zkvm/platform/src/rust_rt.rs
- https://github.com/risc0/risc0/blob/v3.0.3/risc0/build/src/lib.rs
- https://github.com/risc0/risc0/blob/v3.0.3/risc0/circuit/rv32im/src/execute/platform.rs#L27-L53

| Section | Start | Size | Access |
| ------------ | ---------------- | --------------------------- | ------ |
| Invalid page | `0x00000000` | `0x10000 (64KiB)` | `na` |
| **Stack** | `0x00200400 (stack grows down)` | | `rw` |
| **Code** | `0x00200800` | `code size` | |
| `.text` | `0x00200800` | `.text size` | `ro` |
| `.rodata` | `.` | `.rodata size` | `ro` |
| `.eh_frame` | `.` | `.eh_frame size` | `ro` |
| `.bss` | `.` | `.bss size` | `rw` |
| **Heap** (contains program I/O) | `. ALIGN(4)` | `heap size` | `rw` |
| **ZKVM data** | | | |
| `kernel memory` (contains kernel code and data such as ecall/trap dispatch, register contents) | `0xc0000000` | `0x3F000000 (~1GiB)` | Depends on the purpose of the specific memory region. |

### Comments

- `GUEST_MIN_MEM=0x00004000`. Below this address there is probably some reserved space; this is not yet clear from the documentation. The value was changed from `0x400` to `0x4000` together with growing guest memory (https://github.com/risc0/risc0/pull/2866). The stack top remained unchanged.
- All guest program memory cannot exceed `0xC0000000` bytes (3GB) minus `GUEST_MIN_MEM`.
- Similar to OpenVM except for maximum memory size.
- Program input and output are stored in a heap memory section according to program logic.
- Program termination is implemented via system `ecall`.
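The regions in the Risc0 table can be read as a simple address classifier. The sketch below is illustrative only and is not taken from the Risc0 sources: the function name `classify` and the region labels are hypothetical. It assumes the invalid page covers the 64KiB range `0x0000` through `0xFFFF`, that guest memory runs from there up to `0xc0000000`, and that kernel memory ends at `0xc0000000 + 0x3F000000`. Addresses above that are not described in the table, so they are labelled as such.

```python
# Boundaries derived from the table above (end addresses are exclusive).
ZERO_PAGE_END = 0x0001_0000              # invalid page: 0x0000 .. 0xFFFF
KERNEL_START = 0xC000_0000               # start of kernel memory
KERNEL_END = KERNEL_START + 0x3F00_0000  # start + size from the table


def classify(addr):
    """Return a label for the region a 32-bit address falls into."""
    if not 0 <= addr <= 0xFFFF_FFFF:
        raise ValueError("not a 32-bit address")
    if addr < ZERO_PAGE_END:
        return "invalid page"
    if addr < KERNEL_START:
        return "guest"
    if addr < KERNEL_END:
        return "kernel"
    return "undescribed"  # the table says nothing about this range


for addr in (0x0, 0x0020_0400, 0x0020_0800, 0xC000_0000, 0xFF00_0000):
    print(f"{addr:#010x} -> {classify(addr)}")
```

Both the stack top (`0x00200400`) and the code start (`0x00200800`) fall into the guest region under these assumptions.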
## Sp1

Compilation target: `riscv32im`

Sources:

- https://docs.succinct.xyz/docs/sp1/security/rv32im-implementation#reserved-memory-regions
- https://github.com/succinctlabs/sp1/blob/main/crates/zkvm/entrypoint/src/syscalls/memory.rs
- https://github.com/succinctlabs/sp1/blob/main/crates/build/src/command/utils.rs#L53-L81

| Section | Start | Size | Access |
| ------------ | ---------------- | --------------------------- | ------ |
| **ZKVM data** | | | |
| `registers` | `0x00` | `0x80 (128 bytes)` | `rw` |
| **Stack** | `0x00200400 (stack grows down)` | | `rw` |
| **Code** | | | |
| `.text` | `0x00200800` | `.text size` | `ro` |
| `.rodata` | `.` | `.rodata size` | `ro` |
| `.eh_frame` | `.` | `.eh_frame size` | `ro` |
| `.bss` | `.` | `.bss size` | `rw` |
| **Heap** (contains program I/O) | `. ALIGN(4)` | `heap size` | `rw` |

### Comments

- All guest program memory cannot exceed `0x78000000` bytes (1920MB).
- Similar to OpenVM except for maximum memory size.
- Program input and output are stored in a heap memory section according to program logic.
- Program termination is implemented via system `ecall`.
## Zisk

Sources:

- https://github.com/0xPolygonHermez/zisk/blob/main/core/src/mem.rs#L3
- https://github.com/0xPolygonHermez/rust/blob/zisk/compiler/rustc_target/src/spec/targets/riscv64ima_zisk_zkvm_elf_linker_script.ld

Compilation target: `riscv64ima`

| Section | Start | Size | Access |
| ------------ | ---------------- | --------------------------- | ------ |
| **Code** (BIOS) | | | |
| `basic setup ROM_ENTRY` | `0x1000` | `4 bytes` | `ro` |
| `last instruction ROM_EXIT` | `0x1004` | `4 bytes` | `ro` |
| `init and finalize code` | `0x1008` | `init/fin code size` | `ro` |
| **Code** (.text, .rodata) | `0x80000000` | `0x08000000 (128MB)` | `ro` |
| **Program I/O** input | `0x90000000` | `0x08000000 (128MB)` | `ro` |
| **ZKVM data** | | | |
| `registers` | `0xa0000000` | `256 bytes` | `rw` |
| `UART_ADDR` | `0xa0000200` | `1 byte` | `wo` |
| **Program I/O** output | `0xa0010000` | `0x00010000 (64KB)` | `wo` |
| **Code** (.data, .bss) | `0xa0020000` | `.data + .bss size` | `rw` |
| **Stack** | `. ALIGN(8)` | `0x100000 (1MB)` | `rw` |
| **Heap** | `.` | `heap size` | `rw` |

### Comments

- Execution always starts at `ROM_ENTRY` and finishes at `ROM_EXIT`. Before the main program, a setup step is performed.
- Any single byte written to `UART_ADDR` goes to standard output.
- Setup and finalization instructions are located in low addresses (BIOS), while the actual program instructions are located in high addresses.
- After the actual program execution, a finalization step is performed.

## Summary

- All ZKVMs reserve memory space for ZKVM implementation-specific data like registers, flags, etc.
- Stack and heap space are usually at the end of memory space.
- Most ZKVMs implement program termination as one of:
  - an environment call (`ecall`),
  - a custom RISC-V instruction,
  - a dedicated flag stored in memory.
- There are two different approaches to handling program I/O:
  - via `ecall`s,
  - using dedicated memory space for program input and output.
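As an illustration of the dedicated-memory approach to program I/O, the Jolt table can be turned into address arithmetic. This is a rough sketch under stated assumptions, not Jolt's implementation: the function name `jolt_io_layout` is hypothetical, the input and output sizes are example values, and `io_size` is assumed to be exactly the sum of the input, output, and two 4-byte flags, so the `zero padding?` row is empty and any alignment Jolt may apply is ignored.

```python
CODE_START = 0x8000_0000  # start of `.text.boot` in the Jolt table
FLAG_SIZE = 4             # size of the panic and termination flags


def jolt_io_layout(max_input_size, max_output_size):
    """Compute start addresses of the I/O region that precedes the code.

    Assumes io_size is the plain sum of its parts (no padding, no alignment).
    """
    io_size = max_input_size + max_output_size + 2 * FLAG_SIZE
    input_start = CODE_START - io_size
    output_start = input_start + max_input_size
    panic_flag = output_start + max_output_size
    termination_flag = panic_flag + FLAG_SIZE
    return {
        "input": input_start,
        "output": output_start,
        "panic flag": panic_flag,
        "termination flag": termination_flag,
    }


# Example sizes only; real limits are set through the `jolt::provable` macro.
io = jolt_io_layout(max_input_size=4096, max_output_size=4096)
for name, addr in io.items():
    print(f"{name:17s} {addr:#010x}")
```

Under these assumptions the termination flag occupies the last four bytes before `0x80000000`, which is why the host can locate every I/O field from the two configured sizes alone.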