# Rust compiler changes by zkvm projects
This document summarizes the changes zkvm teams have made to [the Rust compiler](https://github.com/rust-lang/rust/) to support zkvm bytecode generation.
The changes focus on defining a new compilation target and reimplementing a few important system calls.
## [risc0/rust](https://github.com/risc0/rust/)
Rust compiler version: [`1.88.0`](https://github.com/risc0/rust/blob/r0.1.88.0)
[Diff](https://github.com/rust-lang/rust/compare/1.88.0...risc0:rust:r0.1.88.0) to the original rust compiler
**Comment**:
All risc0 changes have been **upstreamed** to the `rust-lang` repo.
### Definition of the new compilation target
https://github.com/risc0/rust/blob/r0.1.88.0/compiler/rustc_target/src/spec/targets/riscv32im_risc0_zkvm_elf.rs
### System calls related changes
#### System panic handling (`__rust_start_panic`)
Usage https://github.com/risc0/rust/blob/r0.1.88.0/library/panic_abort/src/lib.rs#L37
Implementation https://github.com/risc0/rust/blob/r0.1.88.0/library/panic_abort/src/zkvm.rs#L6
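The hook above has to turn a panic payload into something the host can display. The following is a minimal, host-runnable sketch of that idea only; it is not the code from `zkvm.rs`. The names `payload_message` and `report_panic` are hypothetical, and the returned `String` stands in for an ABI call that would never return on a real zkvm target.

```rust
use std::any::Any;

/// Extracts a printable message from a panic payload, if it has one.
/// Panic payloads are usually either `&'static str` or `String`.
fn payload_message(payload: &(dyn Any + Send)) -> Option<&str> {
    if let Some(s) = payload.downcast_ref::<&'static str>() {
        Some(*s)
    } else if let Some(s) = payload.downcast_ref::<String>() {
        Some(s.as_str())
    } else {
        None
    }
}

/// Hypothetical stand-in for the host call that reports the panic.
/// Falls back to a generic text when the payload carries no message.
fn report_panic(payload: &(dyn Any + Send)) -> String {
    match payload_message(payload) {
        Some(msg) if !msg.is_empty() => format!("guest panicked: {msg}"),
        _ => String::from("guest panicked"),
    }
}
```

A target without panic message support would correspond to always taking the fallback branch.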
#### Memory allocation (`alloc` `dealloc`)
Usage https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/alloc/mod.rs#L93
Implementation https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/alloc/zkvm.rs
#### Command line arguments (`Args`)
Usage https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/args/mod.rs#L38
Implementation https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/args/zkvm.rs
#### Reading system env variables (`getenv`)
Usage https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/env/mod.rs#L42-L41
Implementation https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/env/zkvm.rs
#### Random bytes generator (`fill_bytes`)
Usage https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/random/mod.rs#L75-L77
Implementation https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/random/zkvm.rs#L3
#### Standard io (`Stdin` `Stdout` `Stderr`)
Usage https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/stdio/mod.rs#L34-L37
Implementation https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/stdio/zkvm.rs
#### Unimplemented or unsupported
https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/pal/zkvm/os.rs
Unimplemented:
`errno` - returns `0`
`error_string` - returns `operation successful`
Unsupported by `zkvm` risc0 platform:
`getcwd`
`chdir`
`split_paths`
`current_exe`
`temp_dir`
`home_dir` - returns `None`
`exit` - aborts
`getpid`
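The pattern behind this list can be sketched on a host as follows. This is an illustration under assumptions, not a copy of `os.rs`: the helper name `unsupported` and the exact error text are made up here, and only three of the listed functions are shown.

```rust
use std::io;
use std::path::PathBuf;

/// Builds the error returned by every operation the platform cannot provide.
fn unsupported<T>() -> io::Result<T> {
    Err(io::Error::new(
        io::ErrorKind::Unsupported,
        "operation not supported on this platform",
    ))
}

/// Stub for `getcwd`: a guest without a file system has no working directory.
fn getcwd() -> io::Result<PathBuf> {
    unsupported()
}

/// Stub for `home_dir`: reports that no home directory exists.
fn home_dir() -> Option<PathBuf> {
    None
}

/// "Unimplemented" `errno`: always reports success.
fn errno() -> i32 {
    0
}
```

Callers that already handle `io::Error` keep compiling and simply see an `Unsupported` error at run time.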
### Summary
The risc0 changes to the Rust compiler are very limited and focus on supporting basic system calls. Fewer than 10 system calls had to be customized for this zkvm. The remaining system calls, mainly related to file system interaction, are not needed.
## [succinctlabs/rust](https://github.com/succinctlabs/rust)
### version 1.81.0
The latest sp1 release (https://github.com/succinctlabs/rust/tree/v1.81.0) uses a slightly modified version of the risc0 Rust compiler `r0.1.81.0`.
[Full diff](https://gist.github.com/rodiazet/663fbb5e57efde7def4026182ea37b1c).
Comment:
It is not a regular fork of the Rust compiler, so it cannot be compared via GitHub.
#### `hashmap_random_keys` modification
sp1 slightly [modifies](https://gist.github.com/rodiazet/663fbb5e57efde7def4026182ea37b1c#file-r0-1-81-0-sp1-v1-81-0-diff-L584-L617) the implementation of the `hashmap_random_keys` function.
### Version in branch [`succinct-1.88.0`](https://github.com/succinctlabs/rust/blob/succinct-1.88.0)
A couple of the system call implementations differ from the `1.88.0` version of risc0.
[Full diff](https://gist.github.com/rodiazet/f71bb516b531b2e3dcc05362a8ef95f5).
### Smaller zkvm ABI set
Compared to [risc0](https://github.com/risc0/rust/blob/r0.1.88.0/library/std/src/sys/pal/zkvm/abi.rs), the [sp1 set of ABI functions](https://github.com/succinctlabs/rust/blob/succinct-1.88.0/library/std/src/sys/args/zkvm.rs) is much smaller.
Removed functions:
`sys_halt`
`sys_output`
`sys_sha_compress`
`sys_sha_buffer`
`sys_log`
`sys_cycle_count`
`sys_read`
`sys_argc`
`sys_argv`
**TODO**: The reason for and intent of these changes are not yet known.
#### Unsupported command line arguments (`Args`)
Reading [command line](https://github.com/succinctlabs/rust/blob/succinct-1.88.0/library/std/src/sys/args/zkvm.rs) arguments is unsupported, as a result of removing the ABI functions listed above.
#### Unsupported standard input reading (`Stdin`)
[Reading `Stdin`](https://github.com/succinctlabs/rust/blob/succinct-1.88.0/library/std/src/sys/args/zkvm.rs) is unsupported as a result of the zkvm ABI changes above.
### Summary
The latest release of the sp1 Rust fork (1.81.0) is almost the same as the risc0 fork for the same version. For this version the zkvm ABI looks exactly the same, and the sp1 Rust fork can probably be replaced easily by the risc0 fork or the original rust-lang implementation.
The 1.88.0 version is probably work in progress on changing the ABI, but this is not clear at the moment.
## [jolt/rust](https://github.com/a16z/rust/)
This is also not a regular fork, but the newest version, in the `jolt/1.86.0` branch, is closest to [rust 1.86.0](https://github.com/rust-lang/rust/tree/1.86.0).
[Full diff](https://gist.github.com/rodiazet/890ecf9f7a7dc09ac92fd4f449c258da)
### New compilation target definition
It looks the same as for risc0.
https://github.com/a16z/rust/blob/jolt/1.86.0/compiler/rustc_target/src/spec/targets/riscv32im_jolt_zkvm_elf.rs
### System calls related changes
jolt removes most of the zkvm syscalls from the [ABI definition](https://github.com/a16z/rust/blob/jolt/1.86.0/library/std/src/sys/pal/zkvm/abi.rs). It uses only two of them:
`jolt_panic` and `sys_alloc`.
#### System panic handling (`__rust_start_panic`)
Unlike risc0, jolt uses a no-argument `jolt_panic` function. The panic message is not supported.
https://github.com/a16z/rust/blob/jolt/1.86.0/library/panic_abort/src/zkvm.rs
#### Memory allocation (`alloc` `dealloc`)
Looks the same as for risc0, with minimal changes.
https://github.com/a16z/rust/blob/jolt/1.86.0/library/panic_abort/src/zkvm.rs
## [zisk/rust](https://github.com/0xPolygonHermez/rust)
Differences between [zisk](https://github.com/0xPolygonHermez/rust/tree/zisk-0.3.0) 0.3.0 (1.85.1) and the [rust compiler](https://github.com/rust-lang/rust/tree/1.85.1) v1.85.1
[Full diff](https://gist.github.com/rodiazet/7fb15a30ec29d3671e87dfcef05c076a)
### Definition of the new compilation target
Very similar to the risc0 compilation target, with minor modifications.
https://github.com/0xPolygonHermez/rust/blob/zisk-0.3.0/compiler/rustc_target/src/spec/targets/riscv32ima_zisk_zkvm_elf.rs
Zisk also defines a 64-bit target.
https://github.com/0xPolygonHermez/rust/blob/zisk-0.3.0/compiler/rustc_target/src/spec/targets/riscv64ima_zisk_zkvm_elf.rs
Important comment:
Moreover, they extend the instruction set with the A extension (Standard Extension for Atomic Instructions), which adds 11 instructions.
### System calls related changes
All changes are the same as or very similar to the risc0 version, but moved to `zisk.rs` instead of `zkvm.rs`.
#### System panic handling (`__rust_start_panic`)
Zisk just aborts on panic, without panic message support.
https://github.com/0xPolygonHermez/rust/blob/zisk-0.3.0/library/panic_abort/src/lib.rs#L40
#### Random bytes generator (`fill_bytes`)
zisk defines a slightly different type for the first argument of the `sys_rand` function. They use `recv_buf: *mut u8` where risc0 uses `recv_buf: *mut u32`, which probably makes more sense here, as this function is supposed to return an array of random bytes.
This results in a simpler [implementation](https://github.com/0xPolygonHermez/rust/blob/zisk-0.3.0/library/std/src/sys/random/zisk.rs) of the standard library's `fill_bytes` function than its [risc0 counterpart](https://github.com/0xPolygonHermez/rust/blob/zisk-0.3.0/library/std/src/sys/random/zkvm.rs).
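To make the difference concrete, here is a host-runnable sketch of why a word-oriented call tends to need more glue code than a byte-oriented one. It is an assumption-laden illustration, not either project's code: `mock_rand_bytes` and `mock_rand_words` are hypothetical stand-ins for `sys_rand` that write a fixed pattern instead of random data so the result can be checked.

```rust
/// Hypothetical byte-oriented host call (`*mut u8` style).
/// A fixed pattern stands in for real randomness.
fn mock_rand_bytes(buf: &mut [u8]) {
    buf.fill(0xA5);
}

/// Hypothetical word-oriented host call (`*mut u32` style).
fn mock_rand_words(buf: &mut [u32]) {
    buf.fill(0xA5A5_A5A5);
}

/// With a byte-oriented call the whole buffer is filled in one step.
fn fill_bytes_via_bytes(bytes: &mut [u8]) {
    mock_rand_bytes(bytes);
}

/// Fills a byte range that cannot be viewed as whole aligned words,
/// drawing one word per chunk of up to four bytes.
fn fill_partial(part: &mut [u8]) {
    for chunk in part.chunks_mut(4) {
        let mut word = [0u32; 1];
        mock_rand_words(&mut word);
        let n = chunk.len();
        chunk.copy_from_slice(&word[0].to_ne_bytes()[..n]);
    }
}

/// With a word-oriented call the buffer is split into an unaligned head,
/// an aligned run of whole words, and an unaligned tail.
fn fill_bytes_via_words(bytes: &mut [u8]) {
    // SAFETY: every bit pattern is a valid `u32`, so viewing the aligned
    // middle of a byte buffer as words is sound.
    let (head, aligned, tail) = unsafe { bytes.align_to_mut::<u32>() };
    mock_rand_words(aligned);
    fill_partial(head);
    fill_partial(tail);
}
```

Both functions fill the same bytes; the word-based variant just needs the extra alignment handling.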
## [openvm](https://github.com/openvm-org/openvm)
They use the same [rust toolchain](https://github.com/openvm-org/openvm/tree/main?tab=readme-ov-file#acknowledgements) as risc0, which is already merged into the Rust compiler.
## [pico](https://github.com/brevis-network/pico)
Same as openvm.
TODO: Verify this.
# Bytecode comparison for `zkvm` and plain `riscv32im` targets
`riscv32im_unknown_none_elf` definition:
``` rust
Target {
    data_layout: "e-m:e-p:32:32-i64:64-n32-S128".into(),
    llvm_target: "riscv32".into(),
    metadata: TargetMetadata {
        description: None,
        tier: Some(2),
        host_tools: Some(false),
        std: Some(false),
    },
    pointer_width: 32,
    arch: "riscv32".into(),
    options: TargetOptions {
        linker_flavor: LinkerFlavor::Gnu(Cc::No, Lld::Yes),
        linker: Some("rust-lld".into()),
        cpu: "generic-rv32".into(),
        max_atomic_width: Some(32),
        atomic_cas: false,
        features: "+m,+forced-atomics".into(),
        llvm_abiname: "ilp32".into(),
        panic_strategy: PanicStrategy::Abort,
        relocation_model: RelocModel::Static,
        emit_debug_gdb_scripts: false,
        eh_frame_header: false,
        ..Default::default()
    },
}
```
## risc0
Target definition:
```rust
Target {
    data_layout: "e-m:e-p:32:32-i64:64-n32-S128".into(),
    llvm_target: "riscv32".into(),
    metadata: TargetMetadata {
        description: Some("RISC Zero's zero-knowledge Virtual Machine (RV32IM ISA)".into()),
        tier: Some(3),
        host_tools: Some(false),
        std: None, // ?
    },
    pointer_width: 32,
    arch: "riscv32".into(),
    options: TargetOptions {
        os: "zkvm".into(),
        vendor: "risc0".into(),
        linker_flavor: LinkerFlavor::Gnu(Cc::No, Lld::Yes),
        linker: Some("rust-lld".into()),
        cpu: "generic-rv32".into(),
        // Some crates (*cough* crossbeam) assume you have 64 bit
        // atomics if the target name is not in a hardcoded list.
        // Since zkvm is singlethreaded and all operations are
        // atomic, I guess we can just say we support 64-bit
        // atomics.
        max_atomic_width: Some(64),
        atomic_cas: true,
        features: "+m".into(),
        llvm_abiname: "ilp32".into(),
        executables: true,
        panic_strategy: PanicStrategy::Abort,
        relocation_model: RelocModel::Static,
        emit_debug_gdb_scripts: false,
        eh_frame_header: false,
        singlethread: true,
        ..Default::default()
    },
}
```
Important differences:
| feature | riscv32im_unknown_none_elf | riscv32im_risc0_zkvm_elf |
| -------- | -------- | -------- |
| atomics | `forced-atomics`* | `lower-atomic`** |
| atomic CAS support| false | true |
| max_atomic_width | 32 | 64 |
| std | false | None //? |
| singlethread | false(default) | true |
| executables | false(default) | true |
| Tier | 2 | 3 |
\* Ref: https://llvm.org/docs/Atomics.html#libcalls-sync
\** Ref: https://llvm.org/docs/Passes.html#lower-atomic-lower-atomic-intrinsics-to-non-atomic-form
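The practical effect of `max_atomic_width` and `atomic_cas` on guest code can be sketched with conditional compilation. To my understanding these target options drive the `target_has_atomic` cfg values, so a crate that gates on that cfg picks a different code path per target. The `counter` module and its `Counter` type below are hypothetical and only illustrate that gating; they are not taken from any of the forks.

```rust
/// Variant selected when the target advertises 64-bit atomics with CAS.
#[cfg(target_has_atomic = "64")]
mod counter {
    use std::sync::atomic::{AtomicU64, Ordering};

    pub struct Counter(AtomicU64);

    impl Counter {
        pub const USES_NATIVE_ATOMIC: bool = true;

        pub const fn new() -> Self {
            Counter(AtomicU64::new(0))
        }

        /// Increments the counter and returns the new value.
        pub fn bump(&self) -> u64 {
            self.0.fetch_add(1, Ordering::Relaxed) + 1
        }
    }
}

/// Fallback selected when 64-bit atomics are not advertised.
#[cfg(not(target_has_atomic = "64"))]
mod counter {
    use std::sync::Mutex;

    pub struct Counter(Mutex<u64>);

    impl Counter {
        pub const USES_NATIVE_ATOMIC: bool = false;

        pub const fn new() -> Self {
            Counter(Mutex::new(0))
        }

        /// Increments the counter and returns the new value.
        pub fn bump(&self) -> u64 {
            let mut guard = self.0.lock().unwrap();
            *guard += 1;
            *guard
        }
    }
}
```

On a single-threaded target, the atomic variant would be expected to lower to plain loads and stores, which is why advertising 64-bit atomics is harmless there.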
The standard library cannot be built for `riscv32im_unknown_none_elf`, so the resulting bytecodes cannot be compared. The `hello-world` example does not use `std`, but unfortunately one of its dependencies (`downcast-rs`) does, so building std is necessary.
# Guest program entry point
In Rust you either use the standard library to get a proper program entry point, or you have to define a `#[start]` function yourself.
## risc0
To avoid the std library dependency, risc0 introduced its own entry point definition with [`risc0_zkvm::guest::entry!`](https://github.com/risc0/risc0/blob/v2.2.0/risc0/zkvm/src/guest/mod.rs#L113-L147). It is needed only when the `#![no_std]` and `#![no_main]` attributes are used. Without them, the regular `main` function can be used as the program entry point and the `start` function is linked automatically.
Some examples:
[Keccak example](https://github.com/risc0/risc0/blob/v2.2.0/examples/keccak/methods/guest/src/bin/keccak.rs#L19) uses std library to define the program entry point.
[Hello world](https://github.com/risc0/risc0/blob/v2.2.0/examples/hello-world/methods/guest/src/main.rs#L15-L22) does not use std library and defines entry point by the `risc0_zkvm::guest::entry!` macro.
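What an entry macro of this kind does can be sketched on a host. The following is a simplified illustration, not the `risc0_zkvm::guest::entry!` source: the macro body, the `start` wrapper, and the `runtime_init` / `runtime_finalize` hooks are all hypothetical names, and a shared log replaces real runtime work so the call order can be checked.

```rust
use std::sync::Mutex;

/// Records the order of startup steps so the sketch can be checked on a host.
static STEPS: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());

/// Hypothetical runtime hook that runs before the guest's `main`.
fn runtime_init() {
    STEPS.lock().unwrap().push("init");
}

/// Hypothetical runtime hook that runs after the guest's `main`.
fn runtime_finalize() {
    STEPS.lock().unwrap().push("finalize");
}

/// Simplified entry macro: type-checks the user's function and generates
/// the wrapper that a startup routine would call.
macro_rules! entry {
    ($path:path) => {
        // Fails to compile unless `$path` is a plain `fn()`.
        const GUEST_ENTRY: fn() = $path;

        fn start() {
            runtime_init();
            GUEST_ENTRY();
            runtime_finalize();
        }
    };
}

/// User-defined guest entry point.
fn guest_main() {
    STEPS.lock().unwrap().push("main");
}

entry!(guest_main);
```

Calling `start()` runs the three steps in order; on a real `#![no_main]` target the wrapper would be reached from the linker-visible start symbol rather than called by hand.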
To compare the bytecode differences, we prepared [three very simple example programs](https://github.com/rodiazet/risc0/tree/add-simplest-guest-programs) for three different scenarios.
1. [No risc0 entry point](https://gist.github.com/rodiazet/da6338acf3a678739857e417a57086bf) and none of the initial steps (stack pointer initialization, heap start and position initialization, and the higher-level `env::init` and `env::finalize` calls around `main`).
2. Usage of [risc0 `entry`](https://gist.github.com/rodiazet/76557927efa1ad35b531c54d4dee2f63) macro but without std library.
3. Risc0 example [with std library](https://gist.github.com/rodiazet/a8105002340cf12ebb86d8899349f374) usage.
We also compiled scenario 1 for [`riscv32im-unknown-none-elf`](https://gist.github.com/rodiazet/5557050d1333850df5abfa3602459866). The resulting bytecode looks very similar to the `riscv32im-risc0-zkvm-elf` result. The only difference is the location of the program bytecode in memory.
The bytecode for the second case is much more complicated. The program starts as usual from the entry point defined by the [`_start`](https://gist.github.com/rodiazet/76557927efa1ad35b531c54d4dee2f63#file-simples-risc0-rv32im-risc0-zkvm-elf-dis-L370) function. Then some additional steps are performed before calling the user-defined `main` function.
1. [Initialize the stack pointer](https://gist.github.com/rodiazet/76557927efa1ad35b531c54d4dee2f63#file-simples-risc0-rv32im-risc0-zkvm-elf-dis-L376) in the `sp` register to [`STACK_TOP`](https://github.com/risc0/risc0/blob/v2.2.0/risc0/zkvm/platform/src/memory.rs#L19).
2. [Call](https://gist.github.com/rodiazet/76557927efa1ad35b531c54d4dee2f63#file-simples-risc0-rv32im-risc0-zkvm-elf-dis-L379) the additional [`__start`](https://gist.github.com/rodiazet/76557927efa1ad35b531c54d4dee2f63#file-simples-risc0-rv32im-risc0-zkvm-elf-dis-L880) function, which wraps the `main` call.
3. [Initialize](https://gist.github.com/rodiazet/76557927efa1ad35b531c54d4dee2f63#file-simples-risc0-rv32im-risc0-zkvm-elf-dis-L886) `HEAP_START`, which points to the first byte after the ELF section. [Do the same for `HEAP_POS`](https://gist.github.com/rodiazet/76557927efa1ad35b531c54d4dee2f63#file-simples-risc0-rv32im-risc0-zkvm-elf-dis-L888). There are two different allocator implementations. We use the [`bump` allocator](https://github.com/risc0/risc0/blob/v2.2.0/risc0/zkvm/platform/src/heap/bump.rs) in this case.
4. [Call env::init](https://gist.github.com/rodiazet/76557927efa1ad35b531c54d4dee2f63#file-simples-risc0-rv32im-risc0-zkvm-elf-dis-L890) defined [here](https://github.com/risc0/risc0/blob/v2.2.0/risc0/zkvm/src/guest/env/mod.rs#L142-L153)
5. Call the user-defined [`main` function](https://gist.github.com/rodiazet/76557927efa1ad35b531c54d4dee2f63#file-simples-risc0-rv32im-risc0-zkvm-elf-dis-L892).
6. [Call `env::finalize`](https://gist.github.com/rodiazet/76557927efa1ad35b531c54d4dee2f63#file-simples-risc0-rv32im-risc0-zkvm-elf-dis-L896) defined [here](https://github.com/risc0/risc0/blob/v2.2.0/risc0/zkvm/src/guest/env/mod.rs#L156-L176)
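The role of `HEAP_START` and `HEAP_POS` in step 3 is easier to see with a small model of bump allocation. The sketch below is a generic, host-runnable illustration of the technique and not the risc0 `bump.rs` code: the `BumpHeap` type, its fields, and the use of plain `usize` addresses instead of real pointers are assumptions made for testability.

```rust
/// Minimal bump allocator over an abstract address range.
/// `heap_pos` only ever moves forward; memory is never reclaimed.
struct BumpHeap {
    heap_pos: usize,
    heap_end: usize,
}

impl BumpHeap {
    /// `heap_start` plays the role of the first address after the ELF image.
    fn new(heap_start: usize, heap_end: usize) -> Self {
        BumpHeap { heap_pos: heap_start, heap_end }
    }

    /// Returns the address of a block of `size` bytes aligned to `align`
    /// (a power of two), or `None` when the heap is exhausted.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        // Round the current position up to the requested alignment.
        let start = self.heap_pos.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.heap_end {
            return None;
        }
        self.heap_pos = end;
        Some(start)
    }

    /// Deallocation is a no-op in this kind of allocator.
    fn dealloc(&mut self, _addr: usize) {}
}
```

For example, starting at `0x1001`, a 3-byte request aligned to 4 lands at `0x1004`, and a following 8-byte request aligned to 8 lands at `0x1008`. The trade-off is simplicity and very few instructions per allocation in exchange for never reusing freed memory.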
## Notes
1. It's impossible to run `strace` on a risc0 guest program compiled for any target other than `riscv32im-risc0-zkvm-elf` to check which Linux system calls it makes, because risc0 syscalls [are not implemented](https://github.com/risc0/risc0/blob/v2.2.0/risc0/zkvm/platform/src/syscall.rs#L255) for other targets and running such a program results in a panic (`unimplemented!`).