RISC-V ELF Transpilation to OpenVM Executable
The OpenVM framework transpiles a RISC-V ELF that uses the RV32IM instruction set, together with custom RISC-V instructions specified by VM extensions, into an OpenVM executable.
Transpiler Framework
The transpiler is a function that converts a RISC-V ELF into an OpenVM executable, where an OpenVM executable is defined as the following pieces of data:
- Program ROM
- Starting program counter pc_0
- Initial data memory
The OpenVM executable forms a part of the initial VM state.
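The three pieces of data can be pictured as a plain record. The following is only a sketch: the class name `OpenVmExecutable`, its field names, and the choice to key the ROM by program counter and the memory by (address space, pointer) are assumptions made for illustration, not the actual OpenVM types.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class OpenVmExecutable:
    """Illustrative container only; hypothetical, not the real OpenVM type."""

    # Program ROM: program counter -> (opcode name, operands). Assumed layout.
    program_rom: Dict[int, Tuple[str, Tuple[int, ...]]] = field(default_factory=dict)
    # Starting program counter pc_0.
    pc_0: int = 0
    # Initial data memory: (address space, pointer) -> byte. Assumed layout.
    initial_memory: Dict[Tuple[int, int], int] = field(default_factory=dict)


# Build a tiny executable with one instruction and one initialized byte.
exe = OpenVmExecutable(pc_0=0x200000)
exe.program_rom[exe.pc_0] = ("ADD_RV32", (4, 8, 12, 1, 1))
exe.initial_memory[(2, 0x1000)] = 0xAB
```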
We define a RISC-V machine code block to be a 32-bit aligned contiguous sequence of bits in the RISC-V program memory, where the bit length is variable and a multiple of 32. The code block may contain instructions from standard or non-standard RISC-V ISA extensions, but it may also contain arbitrary bits.
The transpiler is configured upon construction with the set of VM extensions to support. In order to be supported by the transpiler, a VM extension must specify a set of RISC-V machine code blocks and rules for mapping each code block to a sequence of potentially multiple OpenVM instructions. Extensions may also modify the executable's initial memory image.
The transpilation rules must satisfy:
- A read or write to the RISC-V program counter corresponds to a read or write to the program counter of the same value in OpenVM. This includes the implicit read of the program counter to fetch the instruction from program code, as well as any implicit pc += 4 advancement in some RISC-V instructions.
- A RISC-V 32-bit register x{i} read or write access corresponds to an OpenVM memory access at [4 * i: 4]_1 except for writes to x0, see below. The 32 bits of x{i} are represented as 4 little-endian bytes in OpenVM memory.
  - A RISC-V code block must never map to any OpenVM instruction that changes the value of [0:4]_1 in OpenVM memory.
- A RISC-V 32-bit user memory access of the jth byte in word i corresponds to an OpenVM memory access at [4 * i + j]_2.
- If the RISC-V code block is a standard instruction from the RISC-V Instruction Set Manual Volume I: Unprivileged ISA (pdf), then the transpilation rule must map the RISC-V instruction to an OpenVM instruction that follows the RISC-V specification after applying the above correspondences to register and memory accesses.
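The register and memory correspondences above can be sketched as small address helpers. This is a minimal illustration, assuming that [4 * i: 4]_1 denotes the 4 cells starting at pointer 4 * i in address space 1 and that the subscript 2 denotes the address space for user memory; the function names are hypothetical.

```python
from typing import List, Tuple

REGISTER_AS = 1  # address space subscript used for registers in the text
MEMORY_AS = 2    # address space subscript used for user memory in the text


def register_cells(i: int) -> List[Tuple[int, int]]:
    """Cells (address space, pointer) assumed to back register x{i}."""
    if not 0 <= i < 32:
        raise ValueError("RV32 has registers x0..x31")
    return [(REGISTER_AS, 4 * i + k) for k in range(4)]


def register_bytes(value: int) -> List[int]:
    """The 32-bit register value as 4 little-endian bytes."""
    return list((value & 0xFFFFFFFF).to_bytes(4, "little"))


def memory_cell(word: int, byte: int) -> Tuple[int, int]:
    """Cell for the j-th byte of user memory word i, i.e. [4 * i + j]_2."""
    if not 0 <= byte < 4:
        raise ValueError("byte index must be in 0..3")
    return (MEMORY_AS, 4 * word + byte)
```

For example, `register_cells(0)` lists the four cells of x0 that no transpiled instruction may change.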
The above requirements, together with the invariants of the OpenVM ISA, imply that transpilation will only be valid for programs where:
- The program code does not have program address greater than or equal to 2^PC_BITS.
- The program does not access memory outside the range [0, 2^pointer_max_bits): programs that attempt such accesses will fail to execute.
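The two conditions can be written as simple range checks. The sketch below takes `pc_bits` and `pointer_max_bits` as parameters because their values are not fixed in this section; the function names are hypothetical, and treating an access as a (pointer, size) pair is an assumption.

```python
from typing import Iterable


def program_addresses_valid(pcs: Iterable[int], pc_bits: int) -> bool:
    """True if every program address is strictly below 2^pc_bits."""
    return all(0 <= pc < (1 << pc_bits) for pc in pcs)


def access_in_range(ptr: int, size: int, pointer_max_bits: int) -> bool:
    """True if the bytes [ptr, ptr + size) lie inside [0, 2^pointer_max_bits)."""
    return 0 <= ptr and ptr + size <= (1 << pointer_max_bits)
```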
A transpiler configuration is only considered valid if there are no two transpilation rules that may map the same RISC-V code block to different OpenVM instructions.
- When defining a new VM extension with transpiler support, the associated RISC-V code blocks should be chosen to avoid conflicts with RISC-V code blocks from other pre-existing VM extensions that the new VM extension expects to be compatible with.
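One way to screen a configuration for possible conflicts is to model each rule as a bit pattern and test whether two patterns can match the same word. The sketch below makes several assumptions that are not part of the specification: every code block is a single 32-bit word, each rule is described by a `mask` and `match` pair, and the `Pattern` type and the extension names are hypothetical. Overlap is a conservative signal, since two overlapping rules only conflict if they map the shared code block to different OpenVM instructions.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Pattern(NamedTuple):
    name: str
    mask: int   # bits of the 32-bit word that the rule fixes
    match: int  # required value of the fixed bits


def overlaps(a: Pattern, b: Pattern) -> bool:
    """Some word matches both patterns iff they agree on all shared fixed bits."""
    common = a.mask & b.mask
    return (a.match & common) == (b.match & common)


def find_overlaps(patterns: List[Pattern]) -> List[Tuple[str, str]]:
    """Name every pair of patterns that could claim the same word."""
    return [(a.name, b.name) for a, b in combinations(patterns, 2) if overlaps(a, b)]


patterns = [
    # opcode custom-0 (0x0b), funct3 = 0
    Pattern("ext_a", mask=0x0000707F, match=0x0000000B),
    # opcode custom-0 (0x0b), funct3 = 0, funct7 = 1: a subset of ext_a
    Pattern("ext_b", mask=0xFE00707F, match=0x0200000B),
    # opcode custom-1 (0x2b), funct3 = 0
    Pattern("ext_c", mask=0x0000707F, match=0x0000002B),
]
print(find_overlaps(patterns))
```

Here `ext_a` and `ext_b` would need to be reconciled before both are enabled, while `ext_c` uses a different opcode and cannot collide with either.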
Register x0 Handling
As specified in Section 2.1 of RISC-V Instruction Set Manual Volume I: Unprivileged ISA (pdf), register x0 is hardwired to zero: reads always return 0 and writes to it are discarded.
The OpenVM ISA treats [0:4]_1 as normal read/write memory and makes no guarantees on memory accesses to this location. The transpiler must never transpile a RISC-V code block to any OpenVM instruction that changes the value of [0:4]_1 in OpenVM memory. For compatibility with the RISC-V ISA, the transpiler must always transpile a RISC-V instruction to an OpenVM instruction that matches the RISC-V specification. In particular, any RISC-V instruction that has rd=x0 must be transpiled to either the NOP OpenVM instruction if it has no side effects or to an OpenVM instruction that executes the expected side effect and does not change the value of [0:4]_1.
Transpiler Specification for Default VM Extensions
This section specifies the behavior of the transpiler for the default VM extensions with the custom RISC-V instructions specified here. We use the following notation:
- Let ind(rd) denote 4 * (register index), which is in 0..128. In particular, it fits in one field element.
- We use itof for the function that sends an immediate to its signed integer representation, as specified in Section 2.3 of RISC-V Instruction Set Manual Volume I: Unprivileged ISA (signed 12-bit, 13-bit, 21-bit for I-type, B-type, J-type, respectively), and then maps it to the corresponding field element. So 0b11…11 goes to -1 in F.
- We use sign_extend_24 to convert a 12-bit integer into a 24-bit integer via sign extension. We use this in conjunction with utof, which converts 24 bits into an unsigned integer and then maps it to the corresponding field element. Note that each 24-bit unsigned integer fits in one field element.
- We use sign_extend_16 for the analogous conversion into a 16-bit integer via sign extension.
- We use zero_extend_24 to convert an unsigned integer with at most 24 bits into a 24-bit unsigned integer by zero extension. This is used in conjunction with utof to convert unsigned integers to field elements.
- We use sign_of(imm) to get the sign bit of the immediate imm.
- The notation imm[0:4] means the lowest 5 bits of the immediate.
- For a phantom instruction ins, disc(ins) is the discriminant specified in the ISA specification.
- For a phantom instruction ins and a 16-bit c_upper, phantom_c(c_upper, ins) = c_upper << 16 | disc(ins) is the corresponding 32-bit operand c for PHANTOM.
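The notation above can be made concrete with a few helper functions. This is a sketch under stated assumptions: the field modulus `P` is not given in this section, so the BabyBear prime is used purely as a stand-in; `sign_of` is assumed to act on 12-bit immediates; and `phantom_c` takes the discriminant as a plain integer because the actual discriminant values live in the ISA specification.

```python
# ASSUMPTION: stand-in field modulus (BabyBear prime), used only for illustration.
P = 2**31 - 2**27 + 1


def ind(reg_index: int) -> int:
    """4 * (register index)."""
    return 4 * reg_index


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a from_bits-wide value to to_bits, returned as unsigned."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value |= ((1 << to_bits) - 1) ^ ((1 << from_bits) - 1)
    return value


def sign_extend_24(imm12: int) -> int:
    return sign_extend(imm12, 12, 24)


def sign_extend_16(imm12: int) -> int:
    return sign_extend(imm12, 12, 16)


def zero_extend_24(value: int) -> int:
    """Zero extension leaves the unsigned value unchanged; only the width grows."""
    if not 0 <= value < (1 << 24):
        raise ValueError("value must fit in 24 bits")
    return value


def utof(value: int) -> int:
    """Unsigned 24-bit integer to field element."""
    if not 0 <= value < (1 << 24):
        raise ValueError("value must fit in 24 bits")
    return value % P


def itof(imm: int, bits: int) -> int:
    """Interpret a bits-wide immediate as signed, then map it into the field."""
    signed = imm - (1 << bits) if imm >> (bits - 1) else imm
    return signed % P


def sign_of(imm12: int) -> int:
    """Sign bit of a 12-bit immediate (assumed width)."""
    return (imm12 >> 11) & 1


def phantom_c(c_upper: int, disc: int) -> int:
    """32-bit PHANTOM operand c from a 16-bit upper half and a discriminant."""
    return (c_upper << 16) | disc
```

With these definitions, `itof(0xFFF, 12)` is `P - 1`, matching the remark that 0b11…11 goes to -1 in F.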
We now specify the transpilation for system instructions and the default set of VM extensions.
System Instructions
| RISC-V Inst | OpenVM Instruction |
|---|---|
| terminate | TERMINATE _, _, utof(imm) where imm must fit in u8 |
RV32IM Extension
Transpilation from RV32IM to OpenVM assembly follows the mapping below, which is generally
a 1-1 translation between RV32IM instructions and OpenVM instructions. The main exception relates
to handling of the x0 register, which discards writes and has value 0 in all reads.
We handle writes to x0 in transpilation as follows:
- Instructions that write to x0 with no side effects are transpiled to the PHANTOM instruction with c = 0x00 (Nop).
- Instructions that write to a register which might be x0 with side effects (JAL, JALR) are transpiled to the corresponding custom instruction whose write behavior is controlled by a flag specifying whether the target register is x0.
Because [0:4]_1 is initialized to 0 and never written to, this guarantees that reads from x0 yield 0 and enforces that any OpenVM program transpiled from RV32IM conforms to the RV32IM specification for x0.
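The two cases for x0 can be illustrated with `add` (no side effects) and `jal` (the jump is a side effect). The sketch below is illustrative only: instructions are modeled as (name, operands) tuples, unused operands are written as the string "_", and `imm_field` stands for an immediate that has already been mapped to a field element; all of these representations are assumptions.

```python
from typing import Tuple

Instruction = Tuple[str, tuple]

# PHANTOM with the Nop discriminant, kept symbolic here.
NOP: Instruction = ("PHANTOM", ("_", "_", "disc(Nop)"))


def transpile_add(rd: int, rs1: int, rs2: int) -> Instruction:
    """add has no side effects, so a write to x0 becomes a Nop."""
    if rd == 0:
        return NOP
    return ("ADD_RV32", (4 * rd, 4 * rs1, 4 * rs2, 1, 1))


def transpile_jal(rd: int, imm_field: int) -> Instruction:
    """jal must still jump when rd is x0; the last operand disables the write."""
    return ("JAL_RV32", (4 * rd, 0, imm_field, 1, 0, int(rd != 0)))
```

In both cases no emitted instruction changes [0:4]_1: the `add` is dropped entirely, and the `jal` keeps its control-flow effect while its write flag is 0.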
System Level Extensions to RV32IM
| RISC-V Inst | OpenVM Instruction |
|---|---|
| hintstorew | HINT_STOREW_RV32 0, ind(rd), _, 1, 2 |
| hintbuffer | HINT_BUFFER_RV32 ind(rs1), ind(rd), _, 1, 2 |
| reveal | STOREW_RV32 ind(rs1), ind(rd), utof(sign_extend_16(imm)), 1, 3, 1, sign_of(imm) |
| hintinput | PHANTOM _, _, disc(Rv32HintInput) |
| printstr | PHANTOM ind(rd), ind(rs1), disc(Rv32PrintStr) |
| hintrandom | PHANTOM ind(rd), _, disc(Rv32HintRandom) |
Standard RV32IM Instructions
| RISC-V Inst | OpenVM Instruction |
|---|---|
| add | ADD_RV32 ind(rd), ind(rs1), ind(rs2), 1, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| sub | SUB_RV32 ind(rd), ind(rs1), ind(rs2), 1, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| xor | XOR_RV32 ind(rd), ind(rs1), ind(rs2), 1, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| or | OR_RV32 ind(rd), ind(rs1), ind(rs2), 1, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| and | AND_RV32 ind(rd), ind(rs1), ind(rs2), 1, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| sll | SLL_RV32 ind(rd), ind(rs1), ind(rs2), 1, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| srl | SRL_RV32 ind(rd), ind(rs1), ind(rs2), 1, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| sra | SRA_RV32 ind(rd), ind(rs1), ind(rs2), 1, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| slt | SLT_RV32 ind(rd), ind(rs1), ind(rs2), 1, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| sltu | SLTU_RV32 ind(rd), ind(rs1), ind(rs2), 1, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| addi | ADD_RV32 ind(rd), ind(rs1), utof(sign_extend_24(imm)), 1, 0 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| xori | XOR_RV32 ind(rd), ind(rs1), utof(sign_extend_24(imm)), 1, 0 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| ori | OR_RV32 ind(rd), ind(rs1), utof(sign_extend_24(imm)), 1, 0 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| andi | AND_RV32 ind(rd), ind(rs1), utof(sign_extend_24(imm)), 1, 0 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| slli | SLL_RV32 ind(rd), ind(rs1), utof(zero_extend_24(imm[0:4])), 1, 0 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| srli | SRL_RV32 ind(rd), ind(rs1), utof(zero_extend_24(imm[0:4])), 1, 0 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| srai | SRA_RV32 ind(rd), ind(rs1), utof(zero_extend_24(imm[0:4])), 1, 0 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| slti | SLT_RV32 ind(rd), ind(rs1), utof(sign_extend_24(imm)), 1, 0 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| sltiu | SLTU_RV32 ind(rd), ind(rs1), utof(sign_extend_24(imm)), 1, 0 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| lb | LOADB_RV32 ind(rd), ind(rs1), utof(sign_extend_16(imm)), 1, 2, (rd != x0), sign_of(imm) |
| lh | LOADH_RV32 ind(rd), ind(rs1), utof(sign_extend_16(imm)), 1, 2, (rd != x0), sign_of(imm) |
| lw | LOADW_RV32 ind(rd), ind(rs1), utof(sign_extend_16(imm)), 1, 2, (rd != x0), sign_of(imm) |
| lbu | LOADBU_RV32 ind(rd), ind(rs1), utof(sign_extend_16(imm)), 1, 2, (rd != x0), sign_of(imm) |
| lhu | LOADHU_RV32 ind(rd), ind(rs1), utof(sign_extend_16(imm)), 1, 2, (rd != x0), sign_of(imm) |
| sb | STOREB_RV32 ind(rs2), ind(rs1), utof(sign_extend_16(imm)), 1, 2, 1, sign_of(imm) |
| sh | STOREH_RV32 ind(rs2), ind(rs1), utof(sign_extend_16(imm)), 1, 2, 1, sign_of(imm) |
| sw | STOREW_RV32 ind(rs2), ind(rs1), utof(sign_extend_16(imm)), 1, 2, 1, sign_of(imm) |
| beq | BEQ_RV32 ind(rs1), ind(rs2), itof(imm), 1, 1 |
| bne | BNE_RV32 ind(rs1), ind(rs2), itof(imm), 1, 1 |
| blt | BLT_RV32 ind(rs1), ind(rs2), itof(imm), 1, 1 |
| bge | BGE_RV32 ind(rs1), ind(rs2), itof(imm), 1, 1 |
| bltu | BLTU_RV32 ind(rs1), ind(rs2), itof(imm), 1, 1 |
| bgeu | BGEU_RV32 ind(rs1), ind(rs2), itof(imm), 1, 1 |
| jal | JAL_RV32 ind(rd), 0, itof(imm), 1, 0, (rd != x0) |
| jalr | JALR_RV32 ind(rd), ind(rs1), utof(sign_extend_16(imm)), 1, 0, (rd != x0), sign_of(imm) |
| lui | LUI_RV32 ind(rd), 0, utof(zero_extend_24(imm[12:31])), 1, 0, 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| auipc | AUIPC_RV32 ind(rd), 0, utof(zero_extend_24(imm[12:31]) << 4), 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| mul | MUL_RV32 ind(rd), ind(rs1), ind(rs2), 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| mulh | MULH_RV32 ind(rd), ind(rs1), ind(rs2), 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| mulhsu | MULHSU_RV32 ind(rd), ind(rs1), ind(rs2), 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| mulhu | MULHU_RV32 ind(rd), ind(rs1), ind(rs2), 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| div | DIV_RV32 ind(rd), ind(rs1), ind(rs2), 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| divu | DIVU_RV32 ind(rd), ind(rs1), ind(rs2), 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| rem | REM_RV32 ind(rd), ind(rs1), ind(rs2), 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| remu | REMU_RV32 ind(rd), ind(rs1), ind(rs2), 1 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
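
As a worked example of two rows of the table, the sketch below decodes a raw 32-bit I-type word using the standard RISC-V field layout and produces the operands for `addi` and `lw`. It is a simplified illustration rather than the transpiler's implementation: only these two instructions are handled, instructions are modeled as (name, operands) tuples, and immediates are left as plain integers, which is what utof yields for values below the field modulus.

```python
from typing import Dict, Tuple


def decode_i_type(word: int) -> Dict[str, int]:
    """Split a 32-bit I-type instruction word into its fields."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "imm": (word >> 20) & 0xFFF,
    }


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a from_bits-wide value to to_bits, returned as unsigned."""
    if value >> (from_bits - 1):
        value |= ((1 << to_bits) - 1) ^ ((1 << from_bits) - 1)
    return value


def transpile_i_type(word: int) -> Tuple[str, tuple]:
    f = decode_i_type(word)
    rd, rs1, imm = f["rd"], f["rs1"], f["imm"]
    if f["opcode"] == 0x13 and f["funct3"] == 0:  # addi
        if rd == 0:
            return ("PHANTOM", ("_", "_", "disc(Nop)"))
        return ("ADD_RV32", (4 * rd, 4 * rs1, sign_extend(imm, 12, 24), 1, 0))
    if f["opcode"] == 0x03 and f["funct3"] == 2:  # lw
        return (
            "LOADW_RV32",
            (4 * rd, 4 * rs1, sign_extend(imm, 12, 16), 1, 2, int(rd != 0), imm >> 11),
        )
    raise NotImplementedError("only addi and lw are covered by this sketch")


addi_x1_x2_minus1 = transpile_i_type(0xFFF10093)  # addi x1, x2, -1
lw_x5_8_x6 = transpile_i_type(0x00832283)         # lw x5, 8(x6)
```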
OpenVM Intrinsic VM Extensions
The following sections specify the transpilation of the default set of intrinsic extensions
to OpenVM. In order to preserve correctness of handling of x0, the transpilation must respect
the constraint that any instruction that writes to a register must:
- Transpile to Nop if the register is x0 and there are no side effects.
- Transpile to an OpenVM assembly instruction that does not write to [0:4]_1 and processes side effects if the register is x0 and there are side effects.
Each VM extension's behavior is specified below.
Keccak Extension
| RISC-V Inst | OpenVM Instruction |
|---|---|
| xorin | XORIN_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| keccakf | KECCAKF_RV32 ind(rd), _, _, 1, 2 |
SHA-2 Extension
| RISC-V Inst | OpenVM Instruction |
|---|---|
| sha256_update | SHA256_UPDATE_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| sha512_update | SHA512_UPDATE_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
Deferral Extension
The 12-bit immediate encodes both the sub-opcode and deferral index: imm[1:0] selects CALL_RV32 (0x0) vs OUTPUT_RV32 (0x1), and def_idx = imm[11:2] identifies which deferral function to invoke. In addition to transpiling instructions, the deferral transpiler extension writes the initial input accumulator values (the def_circuit_commit for each configured deferral circuit) into the deferral address space (address space 4) via modify_initial_memory.
| RISC-V Inst | OpenVM Instruction |
|---|---|
| def_call | CALL_RV32 ind(rd), ind(rs1), def_idx, 1, 2 |
| def_output | OUTPUT_RV32 ind(rd), ind(rs1), def_idx, 1, 2 |
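
The immediate split described for the deferral extension can be sketched as follows. The function names are hypothetical, and rejecting sub-opcode values other than 0x0 and 0x1 is an assumption, since the text does not say how they are treated.

```python
from typing import Tuple

SUB_OPCODES = {0x0: "CALL_RV32", 0x1: "OUTPUT_RV32"}


def decode_deferral_imm(imm12: int) -> Tuple[str, int]:
    """Split the 12-bit immediate into (opcode name, def_idx)."""
    imm12 &= 0xFFF
    sub_opcode = imm12 & 0b11  # imm[1:0]
    def_idx = imm12 >> 2       # imm[11:2]
    if sub_opcode not in SUB_OPCODES:
        # ASSUMPTION: other sub-opcodes are treated as untranspilable here.
        raise ValueError("unknown deferral sub-opcode")
    return SUB_OPCODES[sub_opcode], def_idx


def transpile_deferral(rd: int, rs1: int, imm12: int) -> Tuple[str, tuple]:
    """Operands follow the table: ind(rd), ind(rs1), def_idx, 1, 2."""
    opcode, def_idx = decode_deferral_imm(imm12)
    return (opcode, (4 * rd, 4 * rs1, def_idx, 1, 2))
```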
BigInt Extension
| RISC-V Inst | OpenVM Instruction |
|---|---|
| add256 | ADD256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| sub256 | SUB256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| xor256 | XOR256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| or256 | OR256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| and256 | AND256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| sll256 | SLL256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| srl256 | SRL256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| sra256 | SRA256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| slt256 | SLT256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| sltu256 | SLTU256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| mul256 | MUL256_RV32 ind(rd), ind(rs1), ind(rs2), 1, 2 |
| beq256 | BEQ256_RV32 ind(rs1), ind(rs2), itof(imm), 1, 2 |
Algebra Extension
Modular Arithmetic
| RISC-V Inst | OpenVM Instruction |
|---|---|
| addmod<N> | ADDMOD_RV32<N> ind(rd), ind(rs1), ind(rs2), 1, 2 |
| submod<N> | SUBMOD_RV32<N> ind(rd), ind(rs1), ind(rs2), 1, 2 |
| mulmod<N> | MULMOD_RV32<N> ind(rd), ind(rs1), ind(rs2), 1, 2 |
| divmod<N> | DIVMOD_RV32<N> ind(rd), ind(rs1), ind(rs2), 1, 2 |
| iseqmod<N> | ISEQMOD_RV32<N> ind(rd), ind(rs1), ind(rs2), 1, 2 if rd != x0, otherwise PHANTOM _, _, disc(Nop) |
| setup<N> | SETUP_ADDSUBMOD_RV32<N> ind(rd), ind(rs1), 0, 1, 2 if rs2 = x0, SETUP_MULDIVMOD_RV32<N> ind(rd), ind(rs1), 0, 1, 2 if rs2 = x1, SETUP_ISEQMOD_RV32<N> ind(rd), ind(rs1), 0, 1, 2 if rs2 = x2 |
| hint_non_qr | PHANTOM 0, 0, phantom_c(mod_idx, HintNonQr) |
| hint_sqrt | PHANTOM ind(rs1), 0, phantom_c(mod_idx, HintSqrt) |
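
The `setup<N>` row selects one of three OpenVM instructions based on the rs2 register. A minimal sketch of that dispatch is given below; the function name is hypothetical, the modulus index `n` is rendered into the opcode name only for readability, and raising an error for other rs2 values is an assumption because the table does not cover them.

```python
from typing import Tuple

SETUP_BY_RS2 = {
    0: "SETUP_ADDSUBMOD_RV32",  # rs2 = x0
    1: "SETUP_MULDIVMOD_RV32",  # rs2 = x1
    2: "SETUP_ISEQMOD_RV32",    # rs2 = x2
}


def transpile_modular_setup(rd: int, rs1: int, rs2: int, n: int) -> Tuple[str, tuple]:
    """Pick the setup instruction from rs2; operands are ind(rd), ind(rs1), 0, 1, 2."""
    if rs2 not in SETUP_BY_RS2:
        # ASSUMPTION: rs2 values outside x0..x2 are rejected in this sketch.
        raise ValueError("setup<N> expects rs2 in {x0, x1, x2}")
    return (f"{SETUP_BY_RS2[rs2]}<{n}>", (4 * rd, 4 * rs1, 0, 1, 2))
```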
Complex Extension Field Arithmetic
| RISC-V Inst | OpenVM Instruction |
|---|---|
| addcomplex | ADD_RV32<Fp2> ind(rd), ind(rs1), ind(rs2), 1, 2 |
| subcomplex | SUB_RV32<Fp2> ind(rd), ind(rs1), ind(rs2), 1, 2 |
| mulcomplex | MUL_RV32<Fp2> ind(rd), ind(rs1), ind(rs2), 1, 2 |
| divcomplex | DIV_RV32<Fp2> ind(rd), ind(rs1), ind(rs2), 1, 2 |
| setupcomplex | SETUP_ADDSUB_RV32<Fp2> ind(rd), ind(rs1), 0, 1, 2 if rs2 = x0, SETUP_MULDIV_RV32<Fp2> ind(rd), ind(rs1), 0, 1, 2 if rs2 = x1 |
Elliptic Curve Extension
| RISC-V Inst | OpenVM Instruction |
|---|---|
| sw_add_ne<C> | EC_ADD_NE_RV32<C> ind(rd), ind(rs1), ind(rs2), 1, 2 |
| sw_double<C> | EC_DOUBLE_RV32<C> ind(rd), ind(rs1), 0, 1, 2 |
| setup<C> | SETUP_EC_ADD_NE_RV32<C> ind(rd), ind(rs1), ind(rs2), 1, 2 if ind(rs2) != 0, SETUP_EC_DOUBLE_RV32<C> ind(rd), ind(rs1), ind(rs2), 1, 2 if ind(rs2) = 0 |
Pairing Extension
| RISC-V Inst | OpenVM Instruction |
|---|---|
| hint_final_exp | PHANTOM ind(rs1), ind(rs2), phantom_c(pairing_idx, HintFinalExp) |