# ZKVM Guest Program Memory Layout
The purpose of this document is to present different approaches to guest program memory layout implemented by ZKVM projects and to start a discussion about a common [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) level memory layout used by all ZKVMs.
Currently, each ZKVM implementation defines its own memory layout through custom linker scripts or linker flags, so a guest program must be recompiled for every ZKVM it targets.
## Before We Start
All described ZKVM guest program binaries are in [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) format and the compilation target is always [RISC-V](https://en.wikipedia.org/wiki/RISC-V) with different extension sets. They are all some combination of I, M, and A extensions. The detailed compilation target for each case is defined below.
Before getting into details, we need to define the memory section types that a guest binary program consists of. Each program after loading occupies some memory space which is split into several different section types depending on their role. We can define them as:
- **Program I/O** section is used by the ZKVM to pass program input data and receive program output.
- **ZKVM data** section contains ZKVM implementation-specific data. It can be, for example, offsets in memory to some specific data, runtime flags, or RISC-V register values.
- **Code** section consists of any ELF section which occupies memory during execution. They can be:
  - `.text`, `.init` - executable code section, read-only
  - `.rodata` - read-only arbitrary data
  - `.data`, `.bss` - arbitrary data
  - Others depending on ZKVM implementation.
- **Stack** - memory section reserved for program stack.
- **Heap** - memory section reserved for heap memory.
Each memory layout is described in a table with virtual memory start address and size of the section. `.` means "end of last section".
`ALIGN(k)` means that the section start is aligned on a `k` byte boundary in memory.
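The alignment rule can be illustrated with a short sketch. The helper name `align_up` is hypothetical and assumes `k` is a power of two, which is the usual case for linker alignment:

```python
def align_up(addr: int, k: int) -> int:
    """Round addr up to the next multiple of k (k must be a power of two)."""
    if k <= 0 or k & (k - 1):
        raise ValueError("k must be a positive power of two")
    return (addr + k - 1) & ~(k - 1)


# An unaligned address is moved up to the next 8-byte boundary.
print(hex(align_up(0x80001235, 8)))  # 0x80001238
# An address that is already aligned is left unchanged.
print(hex(align_up(0x80001238, 8)))  # 0x80001238
```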
Memory access modes are:
- `ro` - read only
- `wo` - write only
- `rw` - read and write
- `na` - no access
It is worth noting that different ZKVMs implement program input/output, panic handling, and program termination in different ways. This document also describes these methods because they are related to guest program memory layouts.
## Jolt
Sources:
- https://jolt.a16zcrypto.com/how/architecture/ram.html#address-remapping
- https://github.com/a16z/jolt/blob/v0.2.1-alpha/common/src/constants.rs
- https://github.com/a16z/jolt/blob/v0.2.1-alpha/jolt-core/src/host/program.rs
- https://github.com/a16z/jolt/blob/v0.2.1-alpha/jolt-core/src/host/mod.rs
- https://github.com/a16z/jolt/blob/v0.2.1-alpha/common/src/jolt_device.rs
Compilation target: `riscv32im`
| Section | Start | Size | Access |
| ------------ |---------------- | --------------------------- |---------|
| **Program I/O** | | | |
| `input` | `0x80000000 - io_size` | `max input size` | `ro` |
| `output` | `.` | `max output size` | `wo` |
| **ZKVM data** | | | |
| `panic flag` | `.` | `4 bytes` | `rw` |
| `termination flag`| `.` | `4 bytes` | `wo` |
| `zero padding?` | `.` | up to `0x7fffffff` | |
| **Code** | | `code size` (max 128MB) | |
| `.text.boot` | `0x80000000` | `section size` | `ro` |
| `.text` | `.` | `section size` | `ro` |
| `.data` | `.` | `section size` | `rw` |
| `.bss` | `.` | `section size` | `rw` |
| **Stack** | `. ALIGN(8)` | `STACK_SIZE (4KB default)` | `rw` |
| **Heap** | `. ALIGN(8)` | `heap size` | `rw` |
### Comments
- Program input and output occupy a dedicated memory space at the beginning of the loaded binary virtual memory space.
- `panic` and `termination` flags are used to terminate the guest program. The values at these memory addresses are checked instantly by the VM, and if any of them is set to `1`, the program is terminated.
- `code size` must not exceed `128MB` (`EMULATOR_MEMORY_CAPACITY`).
- `STACK_SIZE + heap size` must not exceed `32MB` by default. The limit can be customized with the `jolt::provable` macro, for example `jolt::provable(stack_size = 10000, memory_size = 10000000)`.
- The maximum size of the Program I/O sections and the stack size can be customized with the `jolt::provable` macro.
- `io_size` is the combined size of the input, output, panic flag, and termination flag regions.
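As a rough illustration of how the Program I/O addresses follow from the table, here is a minimal sketch. It assumes `io_size` is exactly the sum of the four regions, as stated above; the actual Jolt implementation may round or pad these sizes. The function name `jolt_io_layout` and the constants are hypothetical.

```python
RAM_START = 0x8000_0000  # start of the Code section in the table
FLAG_SIZE = 4            # size of the panic flag and of the termination flag


def jolt_io_layout(max_input_size: int, max_output_size: int) -> dict[str, int]:
    """Derive start addresses of the I/O regions placed just below RAM_START.

    Assumption: io_size is the plain sum of the four regions, with no padding.
    """
    io_size = max_input_size + max_output_size + 2 * FLAG_SIZE
    input_start = RAM_START - io_size
    output_start = input_start + max_input_size
    panic_flag = output_start + max_output_size
    termination_flag = panic_flag + FLAG_SIZE
    return {
        "input": input_start,
        "output": output_start,
        "panic_flag": panic_flag,
        "termination_flag": termination_flag,
    }


layout = jolt_io_layout(max_input_size=4096, max_output_size=4096)
for name, addr in layout.items():
    print(f"{name:<17} {addr:#010x}")
```

With these inputs the termination flag occupies the last four bytes below `0x80000000`, so the `zero padding?` row would be empty under this assumption.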
## Nexus
Sources:
- https://docs.nexus.xyz/zkvm/specifications/arch
- https://github.com/nexus-xyz/nexus-zkvm/blob/v0.3.4/sdk/src/compile/linker-scripts/default.x
- https://github.com/nexus-xyz/nexus-zkvm/blob/v0.3.4/runtime/src/runtime.rs#L61-L109
- https://github.com/nexus-xyz/nexus-zkvm/blob/v0.3.4/runtime/src/alloc.rs#L20-L76
- https://github.com/nexus-xyz/nexus-zkvm/blob/v0.3.4/vm/src/emulator/layout.rs#L39
Compilation target: `riscv32i`
| Section | Start | Size | Access |
| ------------ | ----------------| --------------------------- | ------ |
| **ZKVM data** | | | |
| `rv32 registers` | `0x00` | `128 bytes` | `rw` |
| `public input offset` | `0x80` | `4 bytes` | `ro` |
| `public output offset`| `0x84` | `4 bytes` | `wo` |
| **Code** | | | |
| `.text` | `0x88` | `.text size` | `ro` |
| `.srodata` | `.` | `.srodata size` | `ro` |
| `.rodata` | `.` | `.rodata size` | `ro` |
| `.sdata` | `.` | `.sdata size` | `rw` |
| `.data` | `.` | `.data size` | `rw` |
| `.sbss` | `.` | `.sbss size` | `rw` |
| `.bss` | `.` | `.bss size` | `rw` |
| **Program I/O** | | | |
| `input` | `.` | `input size` | `ro` |
| `exit code` | `.` | `4 bytes` | `wo` |
| `output` | `.` | `output size` | `wo` |
| **Heap** | `. ALIGN(4) (heap grows up)`| `heap size` | `rw` |
| **Stack** | `0x80400000 (stack grows down)`| `stack size` | `rw` |
| **ZKVM data** | | | |
| Associated data (AD) | `0x80400000` | `AD size` | `na` |
### Comments
- Program I/O occupies a dedicated memory space right after the actual program code. The location of the data depends on program size, so offset values had to be introduced.
- Program termination is implemented via system `ecall`. The exit code is taken from a dedicated memory location.
- AD contains contextual information about execution or the context of proving. This can be, for example, a hash of the program file in high-level language.
- The whole binary cannot use more than `0x80400000` bytes (2GB + 4MB).
- Important information that clarifies how the program is executed in two separate passes to limit memory usage: ["the machine architecture operates on a two-pass tracing model: the program is first executed in a (mostly) traditional Harvard architecture, and statistics are kept as to the resultant memory usage. The guest program is then executed again using the same inputs in a modified Harvard architecture with a fixed-memory organization determined from the statistics of the first execution, which is more conducive to proving."](https://docs.nexus.xyz/zkvm/specifications/arch#execution-model)
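To show why the offset values are needed, the following sketch derives the Program I/O and heap addresses from the sizes of the preceding regions, following the order of the table rows. It is an illustration of the table only, not the Nexus implementation; the function name and the combined `code_size` parameter are assumptions.

```python
CODE_START = 0x88   # first byte after the registers and the two offset words
EXIT_CODE_SIZE = 4


def nexus_io_offsets(code_size: int, input_size: int, output_size: int) -> dict[str, int]:
    """Compute region start addresses, assuming the regions are laid out
    back to back in the order listed in the table."""
    input_start = CODE_START + code_size
    exit_code = input_start + input_size
    output_start = exit_code + EXIT_CODE_SIZE
    # The heap starts at the next 4-byte boundary after the output region.
    heap_start = (output_start + output_size + 3) & ~3
    return {
        "input": input_start,
        "exit_code": exit_code,
        "output": output_start,
        "heap": heap_start,
    }


offsets = nexus_io_offsets(code_size=0x1000, input_size=10, output_size=7)
for name, addr in offsets.items():
    print(f"{name:<10} {addr:#x}")
```

Because every address depends on `code_size`, the guest cannot hard-code the I/O locations, which is why the layout stores them at `0x80` and `0x84`.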
## OpenVM
Sources:
- https://docs.openvm.dev/specs/reference/rust-frontend#guest-runtime
- https://docs.openvm.dev/specs/openvm/isa#virtual-machine-state
- https://github.com/openvm-org/openvm/blob/v1.4.0/crates/toolchain/build/src/lib.rs#L291
- https://github.com/openvm-org/openvm/blob/v1.4.0/crates/toolchain/platform/src/memory.rs
Compilation target: `riscv32im`
| Section | Start | Size | Access |
| ------------ | ----------------| --------------------------- | ------ |
| **Stack** | `0x00200400 (stack grows down)`| | `rw` |
| **Code** | `0x00200800` | `code size` | |
| `.text` | `0x00200800` | `.text size` | `ro` |
| `.rodata` | `.` | `.rodata size` | `ro` |
| `.eh_frame` | `.` | `.eh_frame size` | `ro` |
| `.bss` | `.` | `.bss size` | `rw` |
| **Heap** (contains program I/O) | `. ALIGN(4)` | `heap size` | `rw` |
### Comments
- All guest program memory cannot exceed `0x20000000` bytes (512MB).
- `GUEST_MIN_MEM=0x00000400` Below this address there is probably some reserved space. This is not yet clear from the documentation.
- Program input and output are stored in a heap memory section according to program logic.
- Program termination is implemented via custom RISC-V instruction.
- The RISC-V ELF binary is transpiled into an OpenVM executable. Register and memory accesses are translated into accesses to different [OpenVM memory address spaces](https://docs.openvm.dev/specs/openvm/isa#address-spaces). For example, register values are mapped into address space `1` and user memory accesses into address space `2`.
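The address space mapping can be pictured with a small sketch. The address space numbers and the memory limit come from the comments above; the 4-byte stride per register and both function names are assumptions made for illustration, not taken from the OpenVM sources.

```python
REGISTER_AS = 1                 # address space holding register values
MEMORY_AS = 2                   # address space holding user memory
MAX_GUEST_MEMORY = 0x2000_0000  # guest memory limit stated above


def register_location(index: int) -> tuple[int, int]:
    """Map RISC-V register x<index> to (address space, pointer).

    Assumption: each 32-bit register occupies 4 consecutive bytes.
    """
    if not 0 <= index < 32:
        raise ValueError("RV32 has registers x0..x31")
    return (REGISTER_AS, 4 * index)


def memory_location(addr: int) -> tuple[int, int]:
    """Map a guest memory address to (address space, pointer)."""
    if not 0 <= addr < MAX_GUEST_MEMORY:
        raise ValueError("address outside guest memory")
    return (MEMORY_AS, addr)


print(register_location(2))              # stack pointer x2
print(memory_location(0x0020_0800))      # first byte of .text
print(MAX_GUEST_MEMORY // (1024 * 1024)) # guest memory limit in MiB
```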
## Pico
Compilation target: `riscv32im`
Sources:
- https://github.com/brevis-network/pico/blob/v1.1.7/sdk/sdk/src/riscv_ecalls/memory.rs
- https://github.com/brevis-network/pico/blob/v1.1.7/vm/src/emulator/riscv/emulator/mod.rs#L1294
| Section | Start | Size | Access |
| ------------ | ----------------| --------------------------- | ------ |
| **Stack** | `0x00200400 (stack grows down)`| TODO | `rw` |
| **Code** | `0x00200800` | `code size` | |
| `.text` | `0x00200800` | `.text size` | `ro` |
| `.rodata` | `.` | `.rodata size` | `ro` |
| `.eh_frame` | `.` | `.eh_frame size` | `ro` |
| `.bss` | `.` | `.bss size` | `rw` |
| **Heap** (contains program I/O) | `. ALIGN(4)` | `heap size` | `rw` |
### Comments
- All guest program memory cannot exceed `0x78000000` bytes (1920MB).
- Similar to OpenVM except for maximum memory size.
- Program input and output are stored in a heap memory section according to program logic.
- Program termination is implemented via system `ecall`.
- Register values are stored at the beginning of the memory space.
## Risc0
Compilation target: `riscv32im`
Sources:
- https://dev.risczero.com/api/zkvm/zkvm-specification#zkvm-memory-layout
- https://github.com/risc0/risc0/blob/v3.0.3/risc0/zkvm/platform/src/memory.rs
- https://github.com/risc0/risc0/blob/v3.0.3/risc0/zkvm/platform/src/rust_rt.rs
- https://github.com/risc0/risc0/blob/v3.0.3/risc0/build/src/lib.rs
- https://github.com/risc0/risc0/blob/v3.0.3/risc0/circuit/rv32im/src/execute/platform.rs#L27-L53
| Section | Start | Size | Access |
| ------------ | ----------------| --------------------------- | ------ |
| Invalid page | `0x00000000` | `0x10000 (64KiB)`| `na` |
| **Stack** | `0x00200400 (stack grows down)`| | `rw` |
| **Code** | `0x00200800` | `code size` | |
| `.text` | `0x00200800` | `.text size` | `ro` |
| `.rodata` | `.` | `.rodata size` | `ro` |
| `.eh_frame` | `.` | `.eh_frame size` | `ro` |
| `.bss` | `.` | `.bss size` | `rw` |
| **Heap** (contains program I/O) | `. ALIGN(4)` | `heap size` | `rw` |
| **ZKVM data** | | | |
| `kernel memory` (contains kernel code and data such as ecall/trap dispatch and register contents) | `0xc0000000` | `0x3F000000 (~1GiB)` | Depends on the purpose of the specific memory region. |
### Comments
- `GUEST_MIN_MEM=0x00004000` Below this address there is probably some reserved space. This is not yet clear from the documentation. Changed from `0x400` to `0x4000` together with growing guest memory (https://github.com/risc0/risc0/pull/2866). The stack top remained unchanged.
- All guest program memory cannot exceed `0xC0000000` bytes (3GB) minus `GUEST_MIN_MEM`.
- Similar to OpenVM except for maximum memory size.
- Program input and output are stored in a heap memory section according to program logic.
- Program termination is implemented via system `ecall`.
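The guest memory bound described above can be checked with a short calculation. `GUEST_MIN_MEM` and its value come from the comments; the name `GUEST_MAX_MEM` for the `0xC0000000` upper bound and the helper `is_guest_address` are assumptions used only in this sketch.

```python
GUEST_MIN_MEM = 0x0000_4000  # lowest address usable by the guest
GUEST_MAX_MEM = 0xC000_0000  # start of kernel memory (assumed name)


def is_guest_address(addr: int) -> bool:
    """Return True if addr falls inside the guest-usable range."""
    return GUEST_MIN_MEM <= addr < GUEST_MAX_MEM


usable = GUEST_MAX_MEM - GUEST_MIN_MEM
print(hex(usable))                    # 0xbfffc000
print(is_guest_address(0x0020_0800))  # True: start of .text
print(is_guest_address(0xC000_0000))  # False: kernel memory
```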
## Sp1
Compilation target: `riscv32im`
Sources:
- https://docs.succinct.xyz/docs/sp1/security/rv32im-implementation#reserved-memory-regions
- https://github.com/succinctlabs/sp1/blob/main/crates/zkvm/entrypoint/src/syscalls/memory.rs
- https://github.com/succinctlabs/sp1/blob/main/crates/build/src/command/utils.rs#L53-L81
| Section | Start | Size | Access |
| ------------ | ----------------| --------------------------- | ------ |
| **ZKVM data** | | | |
| `registers` | `0x00` | `0x80 (128 bytes)` | `rw` |
| **Stack** | `0x00200400 (stack grows down)`| | `rw` |
| **Code** | | | |
| `.text` | `0x00200800` | `.text size` | `ro` |
| `.rodata` | `.` | `.rodata size` | `ro` |
| `.eh_frame` | `.` | `.eh_frame size` | `ro` |
| `.bss` | `.` | `.bss size` | `rw` |
| **Heap** (contains program I/O) | `. ALIGN(4)` | `heap size` | `rw` |
### Comments
- All guest program memory cannot exceed `0x78000000` bytes (1920MB).
- Similar to OpenVM except for maximum memory size.
- Program input and output are stored in a heap memory section according to program logic.
- Program termination is implemented via system `ecall`.
## Zisk
Sources:
- https://github.com/0xPolygonHermez/zisk/blob/main/core/src/mem.rs#L3
- https://github.com/0xPolygonHermez/rust/blob/zisk/compiler/rustc_target/src/spec/targets/riscv64ima_zisk_zkvm_elf_linker_script.ld
Compilation target: `riscv64ima`
| Section | Start | Size | Access |
| ------------ | ----------------| --------------------------- | ------ |
| **Code** (BIOS) | | | |
| `basic setup ROM_ENTRY` | `0x1000` | `4 bytes` | `ro` |
| `last instruction ROM_EXIT` | `0x1004` | `4 bytes` | `ro` |
| `init and finalize code` | `0x1008` | `init/fin code size` | `ro` |
| **Code** (.text, .rodata)| `0x80000000` | `0x08000000 (128MB)` | `ro` |
| **Program I/O** input | `0x90000000` | `0x08000000 (128MB)` | `ro` |
| **ZKVM data** | | | |
| `registers` | `0xa0000000` | `256 bytes` | `rw` |
| `UART_ADDR` |`0xa0000200`|`1 byte` | `wo` |
| **Program I/O** output |`0xa0010000` |`0x00010000 (64KB)` | `wo` |
| **Code** (.data, .bss) | `0xa0020000` | `.data + .bss size` | `rw` |
| **Stack** | `. ALIGN(8)` | `0x100000 (1MB)` | `rw` |
| **Heap** | `.` | `heap size` | `rw` |
### Comments
- Execution always starts at `ROM_ENTRY` and finishes at `ROM_EXIT`. Before the main program, a setup step is performed.
- Any single byte written to `UART_ADDR` goes to standard output.
- Setup and finalization instructions are located in low addresses (BIOS), while the actual program instructions are located in high addresses.
- After the actual program execution, a finalization step is performed.
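Because most Zisk regions sit at fixed addresses, an address can be classified with a simple table lookup. The sketch below covers only the rows of the table that have both a fixed start and a fixed size; the `Region` type and `find_region` helper are hypothetical and not part of Zisk.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    name: str
    start: int
    size: int
    access: str


# Fixed-size regions copied from the table; variable-size regions
# (BIOS init/finalize code, .data/.bss, stack, heap) are omitted.
ZISK_FIXED_REGIONS = [
    Region("code", 0x8000_0000, 0x0800_0000, "ro"),
    Region("input", 0x9000_0000, 0x0800_0000, "ro"),
    Region("registers", 0xA000_0000, 256, "rw"),
    Region("UART_ADDR", 0xA000_0200, 1, "wo"),
    Region("output", 0xA001_0000, 0x0001_0000, "wo"),
]


def find_region(addr: int) -> Optional[Region]:
    """Return the fixed region containing addr, or None if there is none."""
    for region in ZISK_FIXED_REGIONS:
        if region.start <= addr < region.start + region.size:
            return region
    return None


for addr in (0x8000_0000, 0x9000_0010, 0xA000_0200, 0xA001_FFFF, 0x1000):
    region = find_region(addr)
    print(f"{addr:#010x} -> {region.name if region else 'not in a fixed region'}")
```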
## Summary
- All ZKVMs reserve memory space for ZKVM implementation-specific data like registers, flags, etc.
- Stack and heap space are usually at the end of memory space.
- ZKVMs implement program termination in one of three ways:
  - an environment call (`ecall`)
  - a custom RISC-V instruction
  - a dedicated flag stored in memory
- There are two different approaches to handling program I/O:
  - via `ecall`s
  - using a dedicated memory space for program input and output