Skip to content

OpenVM Instruction Set Architecture

OpenVM supports an extensible instruction set, with different groups of opcodes supported by different VM extensions. This specification describes the overall architecture and default VM extensions which ship with OpenVM, which are:

  • RV32IM: An extension supporting the 32-bit RISC-V ISA with multiplication.
  • Keccak-256: An extension implementing low-level Keccak sponge operations (XOR-in and keccak-f[1600] permutation) compatibly with RISC-V memory, used to implement the Keccak-256 hash function.
  • SHA2: An extension implementing the SHA-256, SHA-384, and SHA-512 hash functions compatibly with RISC-V memory.
  • BigInt: An extension supporting 256-bit signed and unsigned integer arithmetic, including multiplication. This extension respects the RISC-V memory format.
  • Algebra: An extension supporting modular arithmetic over arbitrary fields and their complex field extensions. This extension respects the RISC-V memory format.
  • Elliptic curve: An extension for elliptic curve operations over Weierstrass curves, including addition and doubling. This can be used to implement multi-scalar multiplication and ECDSA scalar multiplication. This extension respects the RISC-V memory format.
  • Pairing: An extension containing opcodes used to implement the optimal Ate pairing on the BN254 and BLS12-381 curves. This extension respects the RISC-V memory format.
  • Deferral: An extension for deferred computation, where results are committed during app execution and verified during proof aggregation.

In addition to these default extensions, developers are able to extend the ISA by defining their own custom VM extensions. For reader convenience, we provide a mapping between the code-level representation of opcodes in OpenVM and the opcodes below here.

Constants and Configuration Parameters

OpenVM depends on the following parameters, some of which are fixed and some of which are configurable:

NameDescriptionConstraints
FThe field over which the VM operates.Currently fixed to Baby Bear, but may change to another 31-bit field.
PC_BITSThe number of bits in the program counter.Fixed to 30.
DEFAULT_PC_STEPThe default program counter step size.Fixed to 4.
LIMB_BITSThe number of bits in a limb for RISC-V memory emulation.Fixed to 8.
ADDR_SPACE_OFFSETThe index of the first writable address space.Fixed to 1.
addr_space_heightThe base 2 log of the number of writable address spaces supported.Configurable, must satisfy addr_space_height <= F::bits() - 2
pointer_max_bitsThe maximum number of bits in a pointer.Configurable, must satisfy pointer_max_bits <= F::bits() - 2
num_public_valuesThe number of user public values.Configurable, must equal 8 times a nonzero power of two.
MAX_HINT_BUFFER_WORDS_BITSThe maximum number of bits for hint buffer word count. This determines MAX_HINT_BUFFER_WORDS = 2^MAX_HINT_BUFFER_WORDS_BITS - 1 = 1,023 words (≈4KB), the maximum words per HINT_BUFFER_RV32 instruction.Fixed to 10.

We explain these parameters in subsequent sections.

Prime Field

The ISA globally depends on a prime field F. This field is currently fixed to Baby Bear, with modulus 15 * 2^27 + 1, but it may change in the future to another 31-bit field. Unless otherwise specified, we will identify the elements of F with the integers {0, ..., F::modulus() - 1}, where F::modulus() is the prime modulus of the field.

Virtual Machine State

The virtual machine is a state machine that is executed on a physical host machine. The state of the virtual machine consists of the following components:

Guest State:

Host State:

The initial state of the virtual machine consists of:

  • Program ROM - immutable throughout VM execution
  • pc_0 - starting program counter
  • Initial data memory
  • No user public outputs
  • Input stream
  • Empty hint stream
  • Deferral states

We describe these components in more detail below.

Program ROM

OpenVM operates under the Harvard architecture, where program code is stored separately from data memory. The program code is loaded as read-only memory (ROM) in the VM state prior to execution, and it remains immutable throughout the execution.

Program code is a map from [0, 2^PC_BITS) to the space of instructions F^{NUM_OPERANDS + 1}, where

  • PC_BITS = 30
  • NUM_OPERANDS = 7.

Instructions will typically only exist at a subset of the indices in [0, 2^PC_BITS).

Instruction format

Instructions are encoded as a global opcode (field element) followed by NUM_OPERANDS = 7 operands (field elements):

opcode, a, b, c, d, e, f, g

An instruction does not need to use all operands, and trailing unused operands are suggested to be set to zero. This may not be constrained, however.

In the following sections, you will see operands like a, b, c, 1, e. 1 here means a fixed address space. In this case, d is suggested to be set to 1. This also may not be constrained, though.

Program Counter

There is a single special purpose register pc for the program counter of type F which stores the location of the instruction being executed. During execution, the program counter must always be a valid program address, meaning that it is an element in the range [0, 2^PC_BITS) where the program code is defined.

Data Memory

Data memory is a random access memory (RAM) which supports read and write operations. Memory is comprised of addressable cells which represent a single field element indexed by address space and pointer. The number of supported address spaces and the size of each address space are configurable constants.

  • Valid address spaces not used for immediates lie in [1, 1 + 2^addr_space_height) for configuration constant addr_space_height.

  • Valid pointers are field elements that lie in [0, 2^pointer_max_bits), for configuration constant pointer_max_bits. When accessing an address out of [0, 2^pointer_max_bits), the VM should panic.

  • For the register address space (address space 1), valid pointers lie in [0, 128), corresponding to 32 registers with 4 byte limbs each. These configuration constants must satisfy addr_space_height, pointer_max_bits <= F::bits() - 2. We use the following notation to denote cells in memory:

  • [a]_d denotes the single-cell value at pointer location a in address space d. This is a single field element.

  • [a:N]_d denotes the slice [a..a + N]_d -- this is a length-N array of field elements.

Immediates

We reserve a special address space 0 for immediates. The framework enforces that address space 0 is never written to. Address space 0 is considered a read-only array with [a]_0 = a for any a in F.

Memory Accesses and Block Accesses

VM instructions can access (read or write) a contiguous list of cells (called a block) in a single address space. The block size is fixed to constant size 4 across all instructions in the ISA. All block accesses must be at pointers that are a multiple of the block size (4). However, RISC-V instructions like lb, lh, sb, and sh still work despite having minimum block size requirements equal to the size of the access (1 byte for lb/sb, 2 bytes for lh/sh) because these instructions are implemented by doing a block access of size 4. Block accesses are not supported for address space 0.

Address Spaces

Different address spaces are used for different purposes in OpenVM. Memory cells in all address spaces are always field elements, but certain address spaces may impose the additional constraint that all elements fit into a maximum number of bits. The existing extensions reference the following set of address spaces, but user-defined extensions are free to introduce additional address spaces:

Address SpaceNameNotes and Constraints
0ImmediatesAddress space 0 is reserved for denoting immediates, and we define [a]_0 = a.
1RegistersElements are constrained to lie in [0, 2^LIMB_BITS) for LIMB_BITS = 8.
2User MemoryElements are constrained to lie in [0, 2^LIMB_BITS) for LIMB_BITS = 8.
3User IO
4DeferralElements are typically full native field elements.

When adding a new user address space, the invariants of the memory cells in that address space must be declared, and all instructions must ensure that these invariants are preserved.

ℹ️ When adding a new instruction to the ISA, the instruction must declare its supported address spaces and respect the invariants of those address spaces. In particular, all instructions must respect the invariants of the address spaces above.

Inputs and Hints

To enable user input and non-determinism in OpenVM programs, the host state maintains the following data structures during runtime execution:

  • input_stream: a private non-interactive queue of vectors of field elements which is provided at the start of runtime execution
  • hint_stream: a queue of values populated during runtime execution via phantom sub-instructions such as Rv32HintInput.
  • deferrals: a list of DeferralState objects, one per deferral function. Each DeferralState maps input commitments to raw inputs (or output commitments) and output commitments to raw outputs, enabling the VM to cache results of computations deferred to standalone circuits.

These data structures are not part of the guest state, and their state depends on host behavior that cannot be determined by the guest.

User Public Outputs

To make program outputs public, OpenVM allows the user to specify a list of field elements to make public. To populate this list, users can use REVEAL_RV32 (from the RV32IM extension).

The list is of length num_public_values, where num_public_values is a VM configuration parameter. By default, any element in the list that is never initialized is set to zero.

Instruction Execution

Starting from the initial VM state, the VM executes instructions according to the program ROM. Instruction execution is a transition function on the mutable parts of the VM state:

  • Program Counter
  • Data Memory
  • User Public Outputs
  • Input Stream
  • Hint Stream
  • Deferral States

which must satisfy the following conditions:

  • Let from_pc be the program counter at the start of the instruction execution. The from_pc must be the address of a valid instruction in the program ROM.
  • The execution must match the instruction from the program ROM.
  • The execution has full read/write access to the data memory, except address space 0 must be read-only.
  • User public outputs can be set at any index in [0, num_public_values).
  • Input stream can only be popped from the front as a queue.
  • Full read/write access to the hint stream.
  • Deferral states are pre-computed at initialization with input-to-raw-input mappings. During execution, deferral functions may update a DeferralState by storing raw outputs and their commitments, which are as specified here.
  • The program counter is set to a new to_pc at the end of the instruction execution. Instructions are only considered valid if to_pc is the address of a valid instruction in the program ROM.

ℹ️ Notes for extension developers: The specification of new instructions should carefully consider to_pc overflows, especially when you want to move pc with a positive offset.

Guest Instruction Execution

We define guest instruction execution to be the subset of instruction execution that only mutates the guest state:

  • Program Counter
  • Data Memory
  • User Public Outputs

Guest instruction execution may still depend on read access to the host state. For example, instructions like HINT_STOREW_RV32 (from the RV32IM extension) can read from the hint_stream and write them to OpenVM memory to provide non-deterministic hints.

⚠️ Safeguards:

  • All instructions should ensure that the modifications to the guest state are protected from non-deterministic host states, as the guest has no control over the host state. For example, the start address and length of modified guest memory must be derived from instruction operands or guest state (as opposed to being derived from host state).

Phantom Instructions

To facilitate hinting and debugging on the host, OpenVM supports the notion of phantom instructions. These are instructions which are identical to a no-op at the level of the OpenVM guest state, but which may be used to specify unconstrained behavior on the host. Use cases of phantom instructions include interacting with the input or hint streams or displaying debug information on the host machine.

Notes for extension developers: PhantomDiscriminant should be unique for each phantom instruction. If you want your new extension to be compatible with some extensions, you need select some compatible PhantomDiscriminant.

OpenVM Instruction Set

We now specify instructions supported by the default VM extensions shipping with OpenVM. Unless otherwise specified, instructions will set to_pc = from_pc + DEFAULT_PC_STEP. We will use the following notation:

  • u32(x) where x: [F; 4] consists of 4 bytes will mean the casting from little-endian bytes in 2's complement to unsigned 32-bit integer.
  • i32(x) where x: [F; 4] consists of 4 bytes will mean the casting from little-endian bytes in 2's complement to signed 32-bit integer.
  • sign_extend means sign extension of bits in 2's complement.
  • i32(c) where c is a field element will mean c.as_canonical_u32() if c.as_canonical_u32() < F::modulus() / 2 or c.as_canonical_u32() - F::modulus() as i32 otherwise.
  • decompose(c) where c is a field element means c.as_canonical_u32().to_le_bytes().
  • r32{0}(a) := i32([a:4]_1) means casting the value at [a:4]_1 to i32.

In the specification, operands marked with _ are not used and should be set to zero. Trailing unused operands should also be set to zero.

System Instructions

The opcodes below are supported by the OpenVM system and do not belong to any VM extension.

NameOperandsDescription
TERMINATE_, _, cTerminates execution with exit code c. Sets to_pc = from_pc.
PHANTOM_, _, cSets to_pc = from_pc + DEFAULT_PC_STEP. The operand c determines which phantom instruction (see below) is run.

The behavior of the PHANTOM opcode is determined by the operand c. More specifically, the low 16-bits c.as_canonical_u32() & 0xffff are used as a discriminant to determine a phantom sub-instruction. Phantom sub-instructions supported by the system are listed below, and VM extensions can define additional phantom sub-instructions. Phantom sub-instructions are only allowed to use operands a,b and c_upper = c.as_canonical_u32() >> 16 and must always advance the program counter by DEFAULT_PC_STEP.

NameDiscriminantOperandsDescription
Nop0x00_Does nothing.
DebugPanic0x01_Causes the runtime to panic on the host machine and prints a backtrace if RUST_BACKTRACE=1 is set.
CtStart0x02_Opens a new span for tracing.
CtEnd0x03_Closes the current span.

RV32IM Extension

The RV32IM extension introduces OpenVM opcodes which support 32-bit RISC-V via transpilation from a standard RV32IM ELF binary, specified here. These consist of opcodes corresponding 1-1 with RV32IM opcodes, as well as additional user IO opcodes and phantom sub-instructions to support input and debug printing on the host. We denote the OpenVM opcode corresponding to a RV32IM opcode by appending _RV32.

The RV32IM extension uses address space 0 for immediates, address space 1 for registers, and address space 2 for memory. As explained here, cells in address spaces 1 and 2 are constrained to be bytes, and all instructions preserve this constraint.

The ith RISC-V register is represented by the block [4 * i:4]_1 of 4 limbs in address space 1. All memory addresses in address space 1 behave uniformly, and in particular writes to the block [0:4]_1 corresponding to the RISC-V register x0 are allowed in the RV32IM extension. However, as detailed in RV32IM Transpilation, any OpenVM program transpiled from a RV32IM ELF will never contain such a write and conforms to the RV32IM specification.

ALU

In all ALU instructions, the operand d is fixed to be 1. The operand e must be either 0 or 1. When e = 0, the c operand is expected to be of the form F::from_u32(c_i16 as i24 as u24 as u32) where c_i16 is type i16. In other words we take signed 16-bits in two's complement, sign extend to 24-bits, consider the 24-bits as unsigned integer, and convert to field element. In the instructions below, [c:4]_0 should be interpreted as c_i16 as i32 sign extended to 32-bits.

NameOperandsDescription
ADD_RV32a,b,c,1,e[a:4]_1 = [b:4]_1 + [c:4]_e. Overflow is ignored and the lower 32-bits are written to the destination.
SUB_RV32a,b,c,1,e[a:4]_1 = [b:4]_1 - [c:4]_e. Overflow is ignored and the lower 32-bits are written to the destination.
XOR_RV32a,b,c,1,e[a:4]_1 = [b:4]_1 ^ [c:4]_e
OR_RV32a,b,c,1,e[a:4]_1 = [b:4]_1 | [c:4]_e
AND_RV32a,b,c,1,e[a:4]_1 = [b:4]_1 & [c:4]_e
SLL_RV32a,b,c,1,e[a:4]_1 = [b:4]_1 << [c:4]_e
SRL_RV32a,b,c,1,e[a:4]_1 = [b:4]_1 >> [c:4]_e
SRA_RV32a,b,c,1,e[a:4]_1 = [b:4]_1 >> [c:4]_e MSB extends
SLT_RV32a,b,c,1,e[a:4]_1 = i32([b:4]_1) < i32([c:4]_e) ? 1 : 0
SLTU_RV32a,b,c,1,e[a:4]_1 = u32([b:4]_1) < u32([c:4]_e) ? 1 : 0

Load/Store

For all load/store instructions, we assume the operand c is in [0, 2^16), and we fix address spaces d = 1. The address space e is 2 for load instructions, and can be 2, 3, or 4 for store instructions. While e = 4 (the deferral address space) is technically allowed by the circuit constraints, verifiers should be extremely wary of any program that stores to address space 4 via these instructions and typically should not consider such programs as valid.

The operand g must be a boolean. We let sign_extend(decompose(c)[0:2], g) denote the i32 defined by first taking the unsigned integer encoding of c as 16 bits, then sign extending it to 32 bits using the sign bit g, and considering the 32 bits as the 2's complement of an i32.

We will use shorthand r32{c,g}(b) := i32([b:4]_1) + sign_extend(decompose(c)[0:2], g) as i32. This means performing signed 32-bit addition with the value of the register [b:4]_1. For consistency with other notation, we define the shorthand r32{c}(b) to mean r32{c,g}(b) where g is set to the most significant bit of c.

Memory access to ptr: i32 in address space e is only valid if 0 <= ptr < 2^pointer_max_bits and ptr is naturally aligned (i.e., ptr must be divisible by the data size in bytes), in which case it is an access to F::from_u32(ptr as u32). The data size is 1 for LOADB_RV32, LOADBU_RV32, STOREB_RV32, 2 for LOADH_RV32, LOADHU_RV32, STOREH_RV32, and 4 for LOADW_RV32, STOREW_RV32.

All load/store instructions always do block accesses of block size 4, even for LOADB_RV32 and STOREB_RV32.

NameOperandsDescription
LOADB_RV32a,b,c,1,e,f,gif(f!=0) [a:4]_1 = sign_extend([r32{c,g}(b):1]_e) The operand f is assumed to be boolean. Must sign-extend the byte read from memory, which is represented in 2’s complement.
LOADH_RV32a,b,c,1,e,f,gif(f!=0) [a:4]_1 = sign_extend([r32{c,g}(b):2]_e) The operand f is assumed to be boolean. Must sign-extend the number read from memory, which is represented in 2’s complement.
LOADW_RV32a,b,c,1,e,f,gif(f!=0) [a:4]_1 = [r32{c,g}(b):4]_e The operand f is assumed to be boolean.
LOADBU_RV32a,b,c,1,e,f,gif(f!=0) [a:4]_1 = zero_extend([r32{c,g}(b):1]_e) The operand f is assumed to be boolean. Must zero-extend the number read from memory.
LOADHU_RV32a,b,c,1,e,f,gif(f!=0) [a:4]_1 = zero_extend([r32{c,g}(b):2]_e) The operand f is assumed to be boolean. Must zero-extend the number read from memory.
STOREB_RV32a,b,c,1,e,1,g[r32{c,g}(b):1]_e <- [a:1]_1
STOREH_RV32a,b,c,1,e,1,g[r32{c,g}(b):2]_e <- [a:2]_1
STOREW_RV32a,b,c,1,e,1,g[r32{c,g}(b):4]_e <- [a:4]_1

Branch/Jump/Upper Immediate

For branch instructions, we fix d = e = 1. For jump instructions, we fix d = 1.

NameOperandsDescription
BEQ_RV32a,b,c,1,1if([a:4]_1 == [b:4]_1) pc += c
BNE_RV32a,b,c,1,1if([a:4]_1 != [b:4]_1) pc += c
BLT_RV32a,b,c,1,1if(i32([a:4]_1) < i32([b:4]_1)) pc += c
BGE_RV32a,b,c,1,1if(i32([a:4]_1) >= i32([b:4]_1)) pc += c
BLTU_RV32a,b,c,1,1if(u32([a:4]_1) < u32([b:4]_1)) pc += c
BGEU_RV32a,b,c,1,1if(u32([a:4]_1) >= u32([b:4]_1)) pc += c
JAL_RV32a,_,c,1,_,fif(f!=0) [a:4]_1 = decompose(pc+4); pc += c. The operand f is assumed to be boolean. The pc increment is always done regardless of f's value. Here i32(c)must be in[-2^24, 2^24).
JALR_RV32a,b,c,1,_,f,gif(f!=0) [a:4]_1 = decompose(pc+4); pc = F::from_u32(i32([b:4]_1) + sign_extend(decompose(c)[0:2], g) as u32). Constrains that i32([b:4]_1) + sign_extend(decompose(c)[0:2], g) is in [0, 2^PC_BITS). Here u32(c) must be in [0, 2^16). The operands f and g are assumed to be boolean. The pc assignment is always done regardless of f's value.
LUI_RV32a,_,c,1,_,1[a:4]_1 = u32(c) << 12. Here i32(c) must be in [0, 2^20).
AUIPC_RV32a,_,c,1,_,_[a:4]_1 = decompose(pc) + (decompose(c) << 8). Here i32(c) must be in [0, 2^24).

For branch and JAL_RV32 instructions, the instructions assume that the operand i32(c) is in [-2^24,2^24). The assignment pc += c is done as field elements. In valid programs, the from_pc is always in [0, 2^PC_BITS). We will only use base field F satisfying 2^PC_BITS + 2*2^24 < F::modulus() so to_pc = from_pc + c is only valid if i32(from_pc) + i32(c) is in [0, 2^PC_BITS).

For JALR_RV32, we treat c in [0, 2^16) as a raw encoding of 16-bits. The operand g must be a boolean. We let sign_extend(decompose(c)[0:2], g) denote the i32 defined by first taking the unsigned integer encoding of c as 16 bits, then sign extending it to 32 bits using the sign bit g, and considering the 32 bits as the 2's complement of an i32. Then it is added to the register value i32([b:4]_1), where 32-bit overflow is ignored. The instruction is only valid if the resulting i32 is in range [0, 2^PC_BITS). The result is then cast to u32 and then to F and assigned to pc.

For LUI_RV32, we are treating c in [0, 2^20) as a raw encoding of 20-bits. For AUIPC_RV32, we are treating c in [0, 2^24) as a raw encoding of 24-bits. The instruction does not need to interpret whether the register is signed or unsigned. For AUIPC_RV32, the addition is treated as unchecked u32 addition since that is the same as i32 addition at the bit level.

Note that AUIPC_RV32 does not have any condition for the register write.

RV32M Extension

For multiplication extension instructions, we fix d = 1. MUL_RV32 performs an 32-bit×32-bit multiplication and places the lower 32 bits in the destination cells. MULH_RV32, MULHU_RV32, and MULHSU_RV32 perform the same multiplication but return the upper 32 bits of the full 2×32-bit product, for signed×signed, unsigned×unsigned, and signed×unsigned multiplication respectively.

DIV_RV32 and DIVU_RV32 perform signed and unsigned integer division of 32-bits by 32-bits. REM_RV32 and REMU_RV32 provide the remainder of the corresponding division operation. Integer division is defined by dividend = q * divisor + r where 0 <= |r| < |divisor| and either sign(r) = sign(dividend) or r = 0.

Below x[n:m] denotes the bits from n to m inclusive of x.

NameOperandsDescription
MUL_RV32a,b,c,1[a:4]_1 = ([b:4]_1 * [c:4]_1)[0:3]
MULH_RV32a,b,c,1[a:4]_1 = (sign_extend([b:4]_1) * sign_extend([c:4]_1))[4:7]. We sign extend b and c into 8-limb (i.e. 64-bit) integers
MULHSU_RV32a,b,c,1[a:4]_1 = (sign_extend([b:4]_1) * zero_extend([c:4]_1))[4:7]. We sign extend
MULHU_RV32a,b,c,1[a:4]_1 = (zero_extend([b:4]_1) * zero_extend([c:4]_1))[4:7]
DIV_RV32a,b,c,1[a:4]_1 = [b:4]_1 / [c:4]_1 integer division. Division by zero: if i32([c:4]_1) = 0, set i32([a:4]_1) = -1. Overflow: if i32([b:4]_1) = -2^31 and i32([c:4]_1) = -1, set i32([a:4]_1) = -2^31.
DIVU_RV32a,b,c,1[a:4]_1 = [b:4]_1 / [c:4]_1 integer division. Division by zero: if u32([c:4]_1) = 0, set u32([a:4]_1) = 2^32 - 1.
REM_RV32a,b,c,1[a:4]_1 = [b:4]_1 % [c:4]_1 integer remainder. Division by zero: if i32([c:4]_1) = 0, set [a:4]_1 = [b:4]_1. Overflow: if i32([b:4]_1) = -2^31 and i32([c:4]_1) = -1, set [a:4]_1 = 0.
REMU_RV32a,b,c,1[a:4]_1 = [b:4]_1 % [c:4]_1 integer remainder. Division by zero: if u32([c:4]_1) = 0, set [a:4]_1 = [b:4]_1.

User IO

In addition to opcodes which match 1-1 with the RV32IM opcodes, the following additional opcodes interact with address spaces outside of 1 and 2 to enable verification of programs with user input-output.

NameOperandsDescription
HINT_STOREW_RV32_,b,_,1,2[r32{0}(b):4]_2 = next 4 bytes from hint stream. Only valid if next 4 values in hint stream are bytes.
HINT_BUFFER_RV32a,b,_,1,2[r32{0}(b):4 * l]_2 = next 4 * l bytes from hint stream where l = r32{0}(a). Only valid if next 4 * l values in hint stream are bytes. l must be non-zero and <= MAX_HINT_BUFFER_WORDS (1,023 words ≈ 4KB). The pointer address r32{0}(b) does not need to be a multiple of 4.
REVEAL_RV32a,b,c,1,3,1,gPseudo-instruction for STOREW_RV32 a,b,c,1,3,1,g writing to the user IO address space 3.

Note: The MAX_HINT_BUFFER_WORDS bound on HINT_BUFFER_RV32 is enforced by both the executor and AIR constraints. The SDK's hint_buffer_chunked function automatically splits larger reads into multiple HINT_BUFFER_RV32 instructions.

Phantom Sub-Instructions

The RV32IM extension defines the following phantom sub-instructions.

NameDiscriminantOperandsDescription
Rv32HintInput0x20_Pops a vector hint of field elements from the input stream and resets the hint stream to equal [(hint.len() as u32).to_le_bytes(), hint, zero padding to 4-byte alignment].concat().
Rv32PrintStr0x21a,b,_Peeks at [r32{0}(a)..r32{0}(a) + r32{0}(b)]_2, tries to convert to byte array and then UTF-8 string and prints to host stdout. Prints error message if conversion fails. Does not change any VM state.
Rv32HintRandom0x22a,_,_Resets the hint stream to 4 * r32{0}(a) random bytes. The source of randomness is deterministic using a fixed-seed RNG (rand::rngs::StdRng). Its result is not constrained in any way.

Deferral Extension

The deferral extension enables guest programs to defer expensive computations to the proof aggregation phase. During app execution, the guest provides an input commitment and receives back a commitment to the output. The actual computation is performed later during aggregation and verified by a deferral hook circuit.

Each deferral instruction encodes a deferral index (def_idx) identifying which deferred function to invoke. The index and sub-opcode are packed into a 12-bit immediate as [def_idx(10 bits) | opcode(2 bits)], supporting up to 1024 deferral circuits.

The extension operates on address spaces 1 (registers) and 2 (user memory) and uses address space 4 (deferral) for accumulator state.

NameOperandsDescription
CALL_RV32a,b,def_idx,1,2Executes a deferred call for deferral index def_idx. Reads a 32-byte input commitment from [r32{0}(b):32]_2. Writes an OutputKey to [r32{0}(a):40]_2, with layout [output_commit:32 bytes || output_len_le:8 bytes]. The output_len field is encoded as 8-byte little-endian but is currently constrained to a zero-extended 32-bit value. Deferred outputs must have byte length divisible by DIGEST_SIZE. Also updates the input and output accumulators in address space 4.
OUTPUT_RV32a,b,def_idx,1,2Retrieves the output of a previous deferred call. Reads an OutputKey from [r32{0}(b):40]_2. Writes output_len bytes to [r32{0}(a):output_len]_2, where output_len is the length from the OutputKey. The instruction is only valid for deferred outputs whose byte length is divisible by DIGEST_SIZE.

Keccak Extension

The Keccak extension supports the Keccak-256 hash function via two low-level opcodes that decompose the Keccak sponge construction into its constituent operations: absorbing input (XOR-in) and permuting state (keccak-f[1600]). Together, these opcodes can be used to implement the full Keccak-256 hash function, including incremental/streaming hashing. The opcodes themselves do not perform padding or squeezing.

The extension operates on address spaces 1 and 2, meaning all memory cells are constrained to be bytes.

The Keccak-f[1600] state is a 200-byte buffer in address space 2. The absorption rate is 136 bytes (1088 bits), corresponding to a capacity of 512 bits for Keccak-256.

In the instructions below, let buf = r32{0}(a), inp = r32{0}(b), and len = r32{0}(c).

NameOperandsDescription
XORIN_RV32a,b,c,1,2XORs input bytes into a state buffer in-place. For i in 0..len, sets [buf+i]_2 ^= [inp+i]_2. Reads len/4 four-byte words from both the buffer at buf and the input at inp, and writes len/4 four-byte words back to buf. The instruction is only valid when len is a multiple of 4 and at most 136; len = 0 is a no-op.
KECCAKF_RV32a,_,_,1,2Applies the keccak-f[1600] permutation in-place on a 200-byte state buffer: [buf:200]_2 = keccakf([buf:200]_2). Reads 50 four-byte words from buf, applies the permutation, and writes 50 four-byte words back.

SHA-2 Extension

The SHA-2 extension exposes single-block compression instructions for SHA-256 and SHA-512. These are sufficient to implement the SHA-256, SHA-384, and SHA-512 hash functions: SHA-384 uses SHA512_UPDATE_RV32 with the SHA-384 initial state and final 48-byte truncation. The extension operates on address spaces 1 and 2, meaning all memory cells are constrained to be bytes. State buffers store 8 words in little-endian order, the destination pointer (a) may alias the state pointer (b), and these instructions do not perform padding.

NameOperandsDescription
SHA256_UPDATE_RV32a,b,c,1,2[r32{0}(a):32]_2 = compress256([r32{0}(b):32]_2, [r32{0}(c):64]_2), where compress256(state, input) is the SHA-256 block compression function which returns the updated state.
SHA512_UPDATE_RV32a,b,c,1,2[r32{0}(a):64]_2 = compress512([r32{0}(b):64]_2, [r32{0}(c):128]_2), where compress512(state, input) is the SHA-512 block compression function which returns the updated state.

BigInt Extension

The BigInt extension supports operations on 256-bit signed and unsigned integers. The extension operates on address spaces 1 and 2, meaning all memory cells are constrained to be bytes. Pointers to the representation of the elements are read from address space 1 and the elements themselves are read/written from address space 2.

Note: These instructions are not the same as instructions on 256-bit registers.

256-bit ALU

NameOperandsDescription
ADD256_RV32a,b,c,1,2[r32{0}(a):32]_2 = [r32{0}(b):32]_2 + [r32{0}(c):32]_2. Overflow is ignored and the lower 256-bits are written to the destination.
SUB256_RV32a,b,c,1,2[r32{0}(a):32]_2 = [r32{0}(b):32]_2 - [r32{0}(c):32]_2. Overflow is ignored and the lower 256-bits are written to the destination.
XOR256_RV32a,b,c,1,2[r32{0}(a):32]_2 = [r32{0}(b):32]_2 ^ [r32{0}(c):32]_2
OR256_RV32a,b,c,1,2[r32{0}(a):32]_2 = [r32{0}(b):32]_2 | [r32{0}(c):32]_2
AND256_RV32a,b,c,1,2[r32{0}(a):32]_2 = [r32{0}(b):32]_2 & [r32{0}(c):32]_2
SLL256_RV32a,b,c,1,2[r32{0}(a):32]_2 = [r32{0}(b):32]_2 << [r32{0}(c):32]_2
SRL256_RV32a,b,c,1,2[r32{0}(a):32]_2 = [r32{0}(b):32]_2 >> [r32{0}(c):32]_2
SRA256_RV32a,b,c,1,2[r32{0}(a):32]_2 = [r32{0}(b):32]_2 >> [r32{0}(c):32]_2 MSB extends
SLT256_RV32a,b,c,1,2[r32{0}(a):32]_2 = i256([r32{0}(b):32]_2) < i256([r32{0}(c):32]_2) ? 1 : 0
SLTU256_RV32a,b,c,1,2[r32{0}(a):32]_2 = u256([r32{0}(b):32]_2) < u256([r32{0}(c):32]_2) ? 1 : 0

256-bit Branch

NameOperandsDescription
BEQ256_RV32a,b,c,1,2if([r32{0}(a):32]_2 == [r32{0}(b):32]_2) pc += c
BNE256_RV32a,b,c,1,2if([r32{0}(a):32]_2 != [r32{0}(b):32]_2) pc += c
BLT256_RV32a,b,c,1,2if(i256([r32{0}(a):32]_2) < i256([r32{0}(b):32]_2)) pc += c
BGE256_RV32a,b,c,1,2if(i256([r32{0}(a):32]_2) >= i256([r32{0}(b):32]_2)) pc += c
BLTU256_RV32a,b,c,1,2if(u256([r32{0}(a):32]_2) < u256([r32{0}(b):32]_2)) pc += c
BGEU256_RV32a,b,c,1,2if(u256([r32{0}(a):32]_2) >= u256([r32{0}(b):32]_2)) pc += c

256-bit Multiplication

Multiplication performs 256-bit×256-bit multiplication and writes the lower 256-bits to memory. Below x[n:m] denotes the bits from n to m inclusive of x.

NameOperandsDescription
MUL256_RV32a,b,c,1,2[r32{0}(a):32]_2 = ([r32{0}(b):32]_2 * [r32{0}(c):32]_2)[0:255]

Algebra Extension

The algebra extension supports modular arithmetic over arbitrary fields and their complex field extensions. It is configured to specify a list of supported moduli. The configuration of each supported positive integer modulus N includes the associated configuration parameter N::NUM_LIMBS (defined below).

The instructions perform operations on unsigned big integers representing elements in the modulus. The extension operates on address spaces 1 and 2, meaning all memory cells are constrained to be bytes. Pointers to the representation of the elements are read from address space 1 and the elements themselves are read/written from address space 2.

An element in the ring of integers modulo N is represented as an unsigned big integer with N::NUM_LIMBS limbs with each limb having LIMB_BITS = 8 bits. For each instruction, the input elements [r32{0}(b): N::NUM_LIMBS]_2, [r32{0}(c):N::NUM_LIMBS]_2 are assumed to be unsigned big integers in little-endian format with each limb having LIMB_BITS bits. However, the big integers are not required to be less than N. Under these conditions, the output element [r32{0}(a): N::NUM_LIMBS]_2 written to memory will be an unsigned big integer of the same format that is congruent modulo N to the respective operation applied to the two inputs.

For each instruction, the operand d is fixed to be 1 and e is fixed to be 2.

NameOperandsDescription
ADDMOD_RV32<N>a,b,c,1,2[r32{0}(a): N::NUM_LIMBS]_2 = [r32{0}(b): N::NUM_LIMBS]_2 + [r32{0}(c): N::NUM_LIMBS]_2 (mod N)
SUBMOD_RV32<N>a,b,c,1,2[r32{0}(a): N::NUM_LIMBS]_2 = [r32{0}(b): N::NUM_LIMBS]_2 - [r32{0}(c): N::NUM_LIMBS]_2 (mod N)
SETUP_ADDSUBMOD_RV32<N>a,b,c,1,2assert([r32{0}(b): N::NUM_LIMBS]_2 == N) for the chip that handles add and sub. For the sake of implementation convenience it also writes something (can be anything) into [r32{0}(a): N::NUM_LIMBS]_2
MULMOD_RV32<N>a,b,c,1,2[r32{0}(a): N::NUM_LIMBS]_2 = [r32{0}(b): N::NUM_LIMBS]_2 * [r32{0}(c): N::NUM_LIMBS]_2 (mod N)
DIVMOD_RV32<N>a,b,c,1,2[r32{0}(a): N::NUM_LIMBS]_2 = [r32{0}(b): N::NUM_LIMBS]_2 / [r32{0}(c): N::NUM_LIMBS]_2 (mod N). Undefined behavior if gcd([r32{0}(c): N::NUM_LIMBS]_2, N) != 1.
SETUP_MULDIVMOD_RV32<N>a,b,c,1,2assert([r32{0}(b): N::NUM_LIMBS]_2 == N) for the chip that handles mul and div. For the sake of implementation convenience it also writes something (can be anything) into [r32{0}(a): N::NUM_LIMBS]_2

Modular Branching

The configuration of N is the same as above. For each instruction, the input elements [r32{0}(b): N::NUM_LIMBS]_2, [r32{0}(c): N::NUM_LIMBS]_2 are assumed to be unsigned big integers in little-endian format with each limb having LIMB_BITS bits.

NameOperandsDescription
ISEQMOD_RV32<N>a,b,c,1,2[a:4]_1 = [r32{0}(b): N::NUM_LIMBS]_2 == [r32{0}(c): N::NUM_LIMBS]_2 (mod N) ? 1 : 0. Enforces that [r32{0}(b): N::NUM_LIMBS]_2, [r32{0}(c): N::NUM_LIMBS]_2 are less than N and then sets the register value of [a:4]_1 to 1 or 0 depending on whether the two big integers are equal.
SETUP_ISEQMOD_RV32<N>a,b,c,1,2assert([r32{0}(b): N::NUM_LIMBS]_2 == N) in the chip that handles modular equality. For the sake of implementation convenience it also writes something (can be anything) into register value of [a:4]_1

Phantom Sub-Instructions

NameDiscriminantOperandsDescription
HintNonQr<N>0x50_,_,c_upperUse c_upper to determine the index of the modulus from the list of supported moduli. Reset the hint stream to equal a quadratic nonresidue modulo N.
HintSqrt<N>0x51a,_,c_upperUse c_upper to determine the index of the modulus from the list of supported moduli. Read from memory x = [r32{0}(a): N::NUM_LIMBS]_2. If x is a quadratic residue modulo N, reset the hint stream to [1u8, 0u8, 0u8, 0u8] followed by a square root of x. If x is not a quadratic residue, reset the hint stream to [0u8; 4] followed by a square root of x * non_qr, where non_qr is the quadratic nonresidue returned by HintNonQr<N>.

Complex Extension Field

A complex extension field Fp2 is the quadratic extension of a prime field Fp with irreducible polynomial X^2 + 1. An element in Fp2 is a pair c0: Fp, c1: Fp such that c0 + c1 u represents a point in Fp2 where u^2 = -1.

The complex extension field Fp2 is supported only if the modular arithmetic instructions for Fp::MODULUS are also supported. The memory layout of Fp2 is then that of two concatenated Fp elements.

We use the following notation below:

r32_fp2(a) -> Fp2 {
    let c0 = [r32{0}(a): Fp::NUM_LIMBS]_2;
    let c1 = [r32{0}(a) + Fp::NUM_LIMBS: Fp::NUM_LIMBS]_2;
    return Fp2 { c0, c1 };
}
NameOperandsDescription
ADD_RV32<Fp2>a,b,c,1,2Set r32_fp2(a) = r32_fp2(b) + r32_fp2(c)
SUB_RV32<Fp2>a,b,c,1,2Set r32_fp2(a) = r32_fp2(b) - r32_fp2(c)
SETUP_ADDSUB_RV32<Fp2>a,b,c,1,2assert(r32_fp2(b).c0 == N) for the chip that handles add and sub. For the sake of implementation convenience it also writes something (can be anything) into r32_fp2(a)
MUL_RV32<Fp2>a,b,c,1,2Set r32_fp2(a) = r32_fp2(b) * r32_fp2(c)
DIV_RV32<Fp2>a,b,c,1,2Set r32_fp2(a) = r32_fp2(b) / r32_fp2(c)
SETUP_MULDIV_RV32<Fp2>a,b,c,1,2assert(r32_fp2(b).c0 == N) for the chip that handles mul and div. For the sake of implementation convenience it also writes something (can be anything) into r32_fp2(a)

Elliptic Curve Extension

The elliptic curve extension supports arithmetic over elliptic curves C in Weierstrass form given by equation C: y^2 = x^3 + C::A * x + C::B where C::A and C::B are constants in the coordinate field. We note that the definitions of the curve arithmetic operations do not depend on C::B. The VM configuration will specify a list of supported curves. For each Weierstrass curve C, the coordinate size is determined by C::Fp::NUM_LIMBS from the coordinate field. The extension operates on address spaces 1 and 2, meaning all memory cells are constrained to be bytes.

An affine curve point EcPoint(x, y) is a pair of x,y where each element is an array of C::Fp::NUM_LIMBS elements each with LIMB_BITS = 8 bits. When the coordinate field C::Fp of C is prime, the format of x,y is guaranteed to be the same as the format used in the modular arithmetic instructions. A curve point will be represented as 2 * C::Fp::NUM_LIMBS contiguous cells in memory.

We use the following notation below:

r32_ec_point(a) -> EcPoint {
    let x = [r32{0}(a): C::Fp::NUM_LIMBS]_2;
    let y = [r32{0}(a) + C::Fp::NUM_LIMBS: C::Fp::NUM_LIMBS]_2;
    return EcPoint(x, y);
}
NameOperandsDescription
EC_ADD_NE_RV32<C>a,b,c,1,2Set r32_ec_point(a) = r32_ec_point(b) + r32_ec_point(c) (curve addition). Assumes that r32_ec_point(b), r32_ec_point(c) both lie on the curve and are not the identity point. Further assumes that r32_ec_point(b).x, r32_ec_point(c).x are not equal in the coordinate field.
SETUP_EC_ADD_NE_RV32<C>a,b,c,1,2assert(r32_ec_point(b).x == C::Fp::MODULUS) in the chip for EC ADD. For the sake of implementation convenience it also writes something (can be anything) into [r32{0}(a): 2*C::Fp::NUM_LIMBS]_2. It is required for proper functionality that assert(r32_ec_point(b).x != r32_ec_point(c).x)
EC_DOUBLE_RV32<C>a,b,_,1,2Set r32_ec_point(a) = 2 * r32_ec_point(b). This doubles the input point. Assumes that r32_ec_point(b) lies on the curve and is not the identity point.
SETUP_EC_DOUBLE_RV32<C>a,b,_,1,2assert(r32_ec_point(b).x == C::Fp::MODULUS && r32_ec_point(b).y == C::CURVE_A) in the chip for EC DOUBLE. For the sake of implementation convenience it also writes something (can be anything) into [r32{0}(a): 2*C::Fp::NUM_LIMBS]_2. It is required for proper functionality that assert(r32_ec_point(b).y != 0 mod C::Fp::MODULUS)

Pairing Extension

The pairing extension supports opcodes tailored to accelerate pairing checks using the optimal Ate pairing over certain classes of pairing friendly elliptic curves. For a curve C to be supported, the VM must have enabled instructions for C::Fp and C::Fp2. The currently supported curves are BN254 and BLS12-381. The extension operates on address spaces 1 and 2, meaning all memory cells are constrained to be bytes.

Phantom Sub-Instructions

The pairing extension defines the following phantom sub-instructions.

NameDiscriminantOperandsDescription
HintFinalExp0x30a,b,c_upperUses c_upper = PAIRING_IDX to determine the curve: BN254 = 0, BLS12-381 = 1. a is a pointer to (p_ptr, p_len): (u32, u32) in memory, and b is a pointer to (q_ptr, q_len): (u32, u32) in memory (e.g., p_ptr = [r32{0}(a)..r32{0}(a) + 4]_2). The sub-instruction peeks at P = [p_ptr..p_ptr + p_len * size_of<Fp>() * 2]_2 and Q = [q_ptr..q_ptr + q_len * size_of<Fp2>() * 2]_2 and views P as a list of G1Affine elements and Q as a list of G2Affine elements. It computes the multi-Miller loop on (P, Q) and then the final exponentiation hint (residue_witness, scaling_factor): (Fp12, Fp12). It resets the hint stream to equal (residue_witness, scaling_factor) as NUM_LIMBS * 12 * 2 bytes. Requires p_len = q_len.

Acknowledgements

The design of the (now deprecated) native extension was inspired by Valida with changes suggested by Max Gillet for compatibility with existing ISAs.