Niu32 is a RISC 32-bit instruction set that aims to be as simple as possible to understand, well-documented, and easy to write for. A completed assembler is included, written in Python 3.
In Niu32, ops are listed in UPPERCASE, labels in lowercase, and numbers are prefixed by 0x for hexadecimal values, 0b for binary values, or listed without a prefix for decimal values.
Registers are prefixed with a $, and line comments are prefixed with a !.
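The lexical rules above can be sketched in Python as follows. This is only an illustration, not the code of the bundled assembler: the helper names parse_number and strip_comment are hypothetical, and the handling of a leading minus sign is an assumption.

```python
def strip_comment(line: str) -> str:
    """Drop everything from the first '!' (line comment marker) onward."""
    return line.split("!", 1)[0].rstrip()


def parse_number(token: str) -> int:
    """Parse a numeric literal: 0x... (hex), 0b... (binary), or plain decimal."""
    text = token.strip().lower()
    sign = 1
    # Assumption: a leading '-' negates the literal.
    if text.startswith("-"):
        sign, text = -1, text[1:]
    if text.startswith("0x"):
        return sign * int(text[2:], 16)
    if text.startswith("0b"):
        return sign * int(text[2:], 2)
    return sign * int(text, 10)
```

For instance, parse_number("0xBEEF") and parse_number("48879") describe the same value.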
Although they are not valid registers in our instruction set, we will refer to register arguments with the designations $arg1 and $arg2 for source registers and $argD for the destination register. We will use the value 0xBEEF for numerical arguments.
If you use Notepad++, importing n32-notepad++-udl.xml as a User-Defined Language (Language -> Define your language... -> Import...) will add syntax highlighting to a Niu32 assembly program.
An instruction word in Niu32 is 32 bits long. Bits are numbered from bit 31 (most significant) down to bit 0 (least significant).
An instruction word can be divided into the following fields:
- Primary opcode (OP1). 5 bits. Tells the processor which instruction to perform, or alternatively tells the processor to check the secondary opcode to determine which instruction to perform.
- Source register arguments (ARG1, ARG2). 5 bits each. Specify which registers to read. The values stored in these registers are used in evaluating the instruction.
- Destination register (ARGD). 5 bits. Specifies the register in which the result of the operation is stored once it has completed.
- Immediate value (IMM). 17 bits. A number used in some types of instructions instead of a second register argument. The value given in the instruction is used directly in evaluating the instruction.
- Secondary opcode (OP2). 5 bits. Tells the processor which instruction to perform. Primarily used in non-immediate ALU instructions, where the secondary opcode specifies the ALUop signal (see below).
An instruction word can take one of two formats. Fields are shown at the top, and the bits they correspond to are shown at the bottom. Bit ranges are inclusive (i.e. "bits 4-0" include both bit 4 and bit 0).
OP1 | ARG1 | ARG2 | ARGD | empty | OP2 |
---|---|---|---|---|---|
xxxxx | xxxxx | xxxxx | xxxxx | 0000000 | xxxxx |
31-27 | 26-22 | 21-17 | 16-12 | 11-5 | 4-0 |
These are used for instructions which require the use of two argument registers and/or instructions which require a secondary opcode.
OP1 | ARG1 | ARGD | IMM |
---|---|---|---|
xxxxx | xxxxx | xxxxx | xxxxxxxxxxxxxxxxx |
31-27 | 26-22 | 21-17 | 16-0 |
These are used for instructions which require the use of an immediate value.
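As a sketch of how the two layouts pack into a 32-bit word, the Python functions below shift each field to the bit positions given in the tables. The function names are hypothetical, and masking a negative immediate to 17 bits (two's complement) is an assumption rather than documented behavior.

```python
def _check_field(name: str, value: int, bits: int) -> None:
    """Raise if value does not fit in an unsigned field of the given width."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name}={value} does not fit in {bits} bits")


def encode_register_format(op1: int, arg1: int, arg2: int, argd: int, op2: int) -> int:
    """OP1 (31-27) | ARG1 (26-22) | ARG2 (21-17) | ARGD (16-12) | zeros (11-5) | OP2 (4-0)."""
    for name, value in (("op1", op1), ("arg1", arg1), ("arg2", arg2),
                        ("argd", argd), ("op2", op2)):
        _check_field(name, value, 5)
    return (op1 << 27) | (arg1 << 22) | (arg2 << 17) | (argd << 12) | op2


def encode_immediate_format(op1: int, arg1: int, argd: int, imm: int) -> int:
    """OP1 (31-27) | ARG1 (26-22) | ARGD (21-17) | IMM (16-0)."""
    for name, value in (("op1", op1), ("arg1", arg1), ("argd", argd)):
        _check_field(name, value, 5)
    # Assumption: negative immediates are stored as 17-bit two's complement.
    return (op1 << 27) | (arg1 << 22) | (argd << 17) | (imm & 0x1FFFF)
```

With OP1 = 00000 and OP2 = 00001 (ADD), source registers 1 and 2, and destination register 5, encode_register_format(0, 1, 2, 5, 1) yields 0x00445001.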
Niu32 has 32 addressable registers.
Number | Name | Binary | Description |
---|---|---|---|
R0 | $zero | 00000 | A read-only register that will only hold a value of 0. |
R1 | $a0 | 00001 | Argument register 0. Caller pushed. Used for passing arguments to subroutines in an assembly program. |
R2 | $a1 | 00010 | Argument register 1. Caller pushed. Used for passing arguments to subroutines in an assembly program. |
R3 | $a2 | 00011 | Argument register 2. Caller pushed. Used for passing arguments to subroutines in an assembly program. |
R4 | $a3 | 00100 | Argument register 3. Caller pushed. Used for passing arguments to subroutines in an assembly program. |
R5 | $t0 | 00101 | Temporary register 0. Caller saved. Used to hold a temporary value. |
R6 | $t1 | 00110 | Temporary register 1. Caller saved. Used to hold a temporary value. |
R7 | $t2 | 00111 | Temporary register 2. Caller saved. Used to hold a temporary value. |
R8 | $t3 | 01000 | Temporary register 3. Caller saved. Used to hold a temporary value. |
R9 | $t4 | 01001 | Temporary register 4. Caller saved. Used to hold a temporary value. |
R10 | $t5 | 01010 | Temporary register 5. Caller saved. Used to hold a temporary value. |
R11 | $t6 | 01011 | Temporary register 6. Caller saved. Used to hold a temporary value. |
R12 | $t7 | 01100 | Temporary register 7. Caller saved. Used to hold a temporary value. |
R13 | $s0 | 01101 | Saved register 0. Callee saved. Used to hold a temporary/saved value. |
R14 | $s1 | 01110 | Saved register 1. Callee saved. Used to hold a temporary/saved value. |
R15 | $s2 | 01111 | Saved register 2. Callee saved. Used to hold a temporary/saved value. |
R16 | $s3 | 10000 | Saved register 3. Callee saved. Used to hold a temporary/saved value. |
R17 | $s4 | 10001 | Saved register 4. Callee saved. Used to hold a temporary/saved value. |
R18 | $s5 | 10010 | Saved register 5. Callee saved. Used to hold a temporary/saved value. |
R19 | $s6 | 10011 | Saved register 6. Callee saved. Used to hold a temporary/saved value. |
R20 | $s7 | 10100 | Saved register 7. Callee saved. Used to hold a temporary/saved value. |
R21 | $r0 | 10101 | Return value 0. Used to hold a single return value from a subroutine (instead of pushing onto the stack). |
R22 | $r1 | 10110 | Return value 1. Used to hold a single return value from a subroutine (instead of pushing onto the stack). |
R23 | $r2 | 10111 | Return value 2. Used to hold a single return value from a subroutine (instead of pushing onto the stack). |
R24 | $r3 | 11000 | Return value 3. Used to hold a single return value from a subroutine (instead of pushing onto the stack). |
R25 | $ra | 11001 | Return address. Callee saved. Used to hold the return address of the calling routine. |
R26 | $gp | 11010 | Global pointer. Used to point to global variables. |
R27 | $fp | 11011 | Frame pointer. Callee saved. Used to hold the memory location of the current stack frame. |
R28 | $sp | 11100 | Stack pointer. Callee saved. Used to hold the memory location of the next empty position on the stack. |
R29 | $at | 11101 | Assembler temporary. Reserved for assembler use (for example, when evaluating pseudo-ops). |
R30 | $k0 | 11110 | Kernel register 0. Reserved for kernel use (for example, during interrupt handling). |
R31 | $k1 | 11111 | Kernel register 1. Reserved for kernel use (for example, during interrupt handling). |
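An assembler needs to turn register names into their 5-bit numbers. One possible way to build that lookup table in Python, following the order of the table above, is sketched below (the names REGISTER_NAMES and REGISTERS are hypothetical):

```python
# Register names in numeric order, R0 through R31, as listed in the table.
REGISTER_NAMES = (
    ["$zero"]
    + [f"$a{i}" for i in range(4)]   # R1-R4: argument registers
    + [f"$t{i}" for i in range(8)]   # R5-R12: temporary registers
    + [f"$s{i}" for i in range(8)]   # R13-R20: saved registers
    + [f"$r{i}" for i in range(4)]   # R21-R24: return values
    + ["$ra", "$gp", "$fp", "$sp", "$at", "$k0", "$k1"]  # R25-R31
)

# Map each name to its register number.
REGISTERS = {name: number for number, name in enumerate(REGISTER_NAMES)}


def register_bits(name: str) -> str:
    """Return the 5-bit binary encoding of a register name."""
    return format(REGISTERS[name], "05b")
```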
Niu32's memory is byte- and word-addressable. The size of a memory word is 32 bits (4 bytes), so any implementation of the Niu32 ISA must reserve the 2 least-significant address bits to select a single byte within a memory word.
Selection bits | Select |
---|---|
00 | Byte 1 |
01 | Byte 2 |
10 | Byte 3 |
11 | Byte 4 |
For example, a memory can look like the following (with 15 bits of addressability + 2 bits byte selector):
Location | Byte 1 (+0) | Byte 2 (+1) | Byte 3 (+2) | Byte 4 (+3) |
---|---|---|---|---|
0x0000 | 0xDE | 0xAD | 0xBE | 0xEF |
0x0004 | 0xAB | 0xBC | 0xCD | 0xDE |
0x0008 | 0xB0 | 0x0B | 0x55 | 0x66 |
... | ... | ... | ... | ... |
0x7FFF | 0xFF | 0xCC | 0xBB | 0xAA |
Word at 0x0000: 0xDEADBEEF (32 bits)
Byte at 0x0000: 0xDE (8 bits)
Byte at 0x0001: 0xAD (8 bits)
Byte at 0x0002: 0xBE (8 bits)
Byte at 0x0003: 0xEF (8 bits)
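The byte-selection rule can be sketched in Python using the first three rows of the example memory. The dictionary-based memory and the function names are assumptions made for illustration; the byte order follows the example, where selector 00 picks the most-significant byte of the word.

```python
# Example memory from the table above, keyed by word-aligned address.
memory_words = {
    0x0000: 0xDEADBEEF,
    0x0004: 0xABBCCDDE,
    0x0008: 0xB00B5566,
}


def read_word(address: int) -> int:
    """Read the 32-bit word containing the given address."""
    return memory_words[address & ~0b11]  # clear the 2 byte-selection bits


def read_byte(address: int) -> int:
    """Read a single byte; the low 2 address bits select Byte 1..Byte 4."""
    selector = address & 0b11
    shift = 8 * (3 - selector)  # selector 00 -> most-significant byte
    return (read_word(address) >> shift) & 0xFF
```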
Below are the defined assembly instructions that map directly to a 5-bit primary opcode.
The opcode table below summarizes the binary encoding of each opcode. The two most-significant bits are listed down the left, and the three least-significant bits across the top.
Why are there gaps in the table? Gaps are left in the opcode space so that instructions can be added in future (for example, to add load half-word functionality to the base load operations).
xx | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
---|---|---|---|---|---|---|---|---|
00 | ALUI | ADDI | MLTI | DIVI | ANDI | ORI | XORI | |
01 | SULI | SSLI | SURI | SSRI | | | | |
10 | LW | LB | SW | SB | LUI | | | |
11 | BEQ | BNE | BLT | BLE | JAL | | | |
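To read the table programmatically, the row label supplies the top two opcode bits and the column label the bottom three. The Python sketch below assumes that the entries in each row fill the columns from left to right starting at 000, which is how the table is read here; the names are hypothetical.

```python
# Rows of the primary opcode table, keyed by the two most-significant bits.
# Assumption: entries occupy consecutive columns starting at column 000.
PRIMARY_ROWS = {
    0b00: ["ALUI", "ADDI", "MLTI", "DIVI", "ANDI", "ORI", "XORI"],
    0b01: ["SULI", "SSLI", "SURI", "SSRI"],
    0b10: ["LW", "LB", "SW", "SB", "LUI"],
    0b11: ["BEQ", "BNE", "BLT", "BLE", "JAL"],
}

# Combine row (high 2 bits) and column (low 3 bits) into a 5-bit opcode.
PRIMARY_OPCODES = {
    name: (row << 3) | column
    for row, names in PRIMARY_ROWS.items()
    for column, name in enumerate(names)
}


def opcode_bits(name: str) -> str:
    """Return the 5-bit binary string for a primary opcode."""
    return format(PRIMARY_OPCODES[name], "05b")
```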
ALUI signals the processor to check OP2 for the operation to perform. The assembler selects this opcode and encodes the secondary opcode according to the instruction written in the program, so the programmer writes an instruction the same way whether it uses the primary or the secondary opcode. ALUI should not be written directly in an assembly program; the assembler will throw an error if it encounters it.
ADDI $argD, $arg1, imm
$argD <- $arg1 + imm
Adds imm to $arg1, and stores the result in $argD.
MLTI $argD, $arg1, imm
$argD <- $arg1 * imm
Multiplies $arg1 by imm and stores the result in $argD.
DIVI $argD, $arg1, imm
$argD <- $arg1 / imm
Divides $arg1 by imm and stores the result in $argD.
ANDI $argD, $arg1, imm
$argD <- $arg1 & imm
Performs an AND on $arg1 and imm and stores the result in $argD.
ORI $argD, $arg1, imm
$argD <- $arg1 | imm
Performs an OR on $arg1 and imm and stores the result in $argD.
XORI $argD, $arg1, imm
$argD <- $arg1 ^ imm
Performs an XOR on $arg1 and imm and stores the result in $argD.
SULI $argD, $arg1, imm
$argD <- $arg1 << imm
Unsigned left-shifts $arg1 by imm and stores the result in $argD.
SSLI $argD, $arg1, imm
$argD <- $arg1 <<< imm
Signed left-shifts $arg1 by imm and stores the result in $argD.
SURI $argD, $arg1, imm
$argD <- $arg1 >> imm
Unsigned right-shifts $arg1 by imm and stores the result in $argD.
SSRI $argD, $arg1, imm
$argD <- $arg1 >>> imm
Signed right-shifts $arg1 by imm and stores the result in $argD.
LW $argD, $arg1, imm
$argD <- Mem[$arg1 + 4*imm]
Loads the word at the memory location computed by adding $arg1 and imm into $argD. Here, imm acts as a word offset (i.e. $arg1 + 4*imm bytes).
LB $argD, $arg1, imm
$argD <- Mem[$arg1 + imm]
Loads the byte at the memory location computed by adding $arg1 and imm into $argD. In this case, imm acts as a byte offset (i.e. $arg1 + imm bytes). Note that the byte will be sign-extended to 32 bits before being stored in $argD.
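Sign extension copies the top bit of the loaded byte into the upper 24 bits of the register. A minimal Python sketch of that step, with a hypothetical helper name and registers modelled as unsigned 32-bit integers, could look like this:

```python
def sign_extend(value: int, bits: int) -> int:
    """Sign-extend a `bits`-wide value to 32 bits (returned as unsigned)."""
    value &= (1 << bits) - 1          # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)
    # XOR then subtract maps values with the sign bit set to negative numbers.
    signed = (value ^ sign_bit) - sign_bit
    return signed & 0xFFFFFFFF        # wrap back into a 32-bit register
```

Loading the byte 0xDE this way gives 0xFFFFFFDE, while 0x7F stays 0x0000007F.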
SW $arg1, $arg2, imm
Mem[$arg2 + 4*imm] <- $arg1
Stores the word in $arg1 at the memory location computed by adding $arg2 and imm. Here, imm acts as a word offset (i.e. $arg2 + 4*imm bytes).
SB $arg1, $arg2, imm
Mem[$arg2 + imm] <- $arg1
Stores the byte in $arg1 at the memory location computed by adding $arg2 and imm. In this case, imm acts as a byte offset (i.e. $arg2 + imm bytes). The value in $arg1 will be truncated to an 8-bit value before being stored in memory, which may result in undefined behavior if the value does not fit into 8 bits.
LUI $argD, imm
$argD <- imm[17:1]
Loads the most-significant 17 bits of imm into $argD. Can be combined with ORI to load a 32-bit immediate value into a register.
BEQ $arg1, $arg2, imm
$arg1 == $arg2 ? PC <- 4*imm : PC <- (PC + 4)
Branches to imm if $arg1 is equal to $arg2; otherwise, advances to the next instruction.
BNE $arg1, $arg2, imm
$arg1 != $arg2 ? PC <- 4*imm : PC <- (PC + 4)
Branches to imm if $arg1 is not equal to $arg2; otherwise, advances to the next instruction.
BLT $arg1, $arg2, imm
$arg1 < $arg2 ? PC <- 4*imm : PC <- (PC + 4)
Branches to imm if $arg1 is less than $arg2; otherwise, advances to the next instruction.
BLE $arg1, $arg2, imm
$arg1 <= $arg2 ? PC <- 4*imm : PC <- (PC + 4)
Branches to imm if $arg1 is less than or equal to $arg2; otherwise, advances to the next instruction.
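The four branch formulas share one shape: test a condition, then set the PC either to 4*imm or to PC + 4. The Python sketch below models only that PC update. The function name is hypothetical, and the register values are compared as plain Python integers, since the text does not say whether the hardware comparison is signed or unsigned.

```python
import operator

# Condition tested by each branch instruction.
BRANCH_CONDITIONS = {
    "BEQ": operator.eq,
    "BNE": operator.ne,
    "BLT": operator.lt,
    "BLE": operator.le,
}


def next_pc(op: str, arg1: int, arg2: int, imm: int, pc: int) -> int:
    """Return the PC after executing a branch located at address `pc`."""
    if BRANCH_CONDITIONS[op](arg1, arg2):
        return 4 * imm   # taken: imm is a word index, so scale to bytes
    return pc + 4        # not taken: fall through to the next instruction
```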
JAL $arg1, $argD
$arg1 <- (PC + 4), PC <- $argD
Jumps to the subroutine address stored in $argD, and stores the address of the instruction following the JAL (PC + 4) as the return address in $arg1.
These instructions are encoded in the OP2 instruction word field (see above).
They will be executed if the OP1 instruction word field is set to ALUI (00000).
As in the primary opcode table, the opcode table below summarizes the binary encoding of each opcode. The two most-significant bits are listed down the left, and the three least-significant bits across the top.
xx | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
---|---|---|---|---|---|---|---|---|
00 | SUB | ADD | MLT | DIV | NOT | AND | OR | XOR |
01 | SUL | SSL | SUR | SSR | | | | |
10 | EQ | NEQ | LT | LEQ | | | | |
11 | | | | | | | | |
SUB $argD, $arg1, $arg2
$argD <- $arg1 - $arg2
Subtracts $arg2 from $arg1, and stores the result in $argD.
ADD $argD, $arg1, $arg2
$argD <- $arg1 + $arg2
Adds $arg1 to $arg2, and stores the result in $argD.
MLT $argD, $arg1, $arg2
$argD <- $arg1 * $arg2
Multiplies $arg1 by $arg2 and stores the result in $argD.
DIV $argD, $arg1, $arg2
$argD <- $arg1 / $arg2
Divides $arg1 by $arg2 and stores the result in $argD.
NOT $argD, $arg1
$argD <- ~$arg1
Performs a bitwise NOT on $arg1, and stores the result in $argD.
AND $argD, $arg1, $arg2
$argD <- $arg1 & $arg2
Performs a bitwise AND on $arg1 and $arg2, and stores the result in $argD.
OR $argD, $arg1, $arg2
$argD <- $arg1 | $arg2
Performs a bitwise OR on $arg1 and $arg2, and stores the result in $argD.
XOR $argD, $arg1, $arg2
$argD <- $arg1 ^ $arg2
Performs a bitwise XOR on $arg1 and $arg2, and stores the result in $argD.
SUL $argD, $arg1, $arg2
$argD <- $arg1 << $arg2
Unsigned left-shifts $arg1 by $arg2 and stores the result in $argD.
SSL $argD, $arg1, $arg2
$argD <- $arg1 <<< $arg2
Signed left-shifts $arg1 by $arg2 and stores the result in $argD.
SUR $argD, $arg1, $arg2
$argD <- $arg1 >> $arg2
Unsigned right-shifts $arg1 by $arg2 and stores the result in $argD.
SSR $argD, $arg1, $arg2
$argD <- $arg1 >>> $arg2
Signed right-shifts $arg1 by $arg2 and stores the result in $argD.
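The text does not spell out how the signed and unsigned right shifts differ. A common reading, assumed here and not confirmed by the text, is that SUR is a logical shift (zeros shifted in) and SSR is an arithmetic shift (the sign bit is replicated). Under that assumption, and with hypothetical function names, the two can be modelled on 32-bit values in Python like this:

```python
MASK32 = 0xFFFFFFFF


def shift_unsigned_right(value: int, amount: int) -> int:
    """Logical right shift: vacated high bits are filled with zeros."""
    return (value & MASK32) >> amount


def shift_signed_right(value: int, amount: int) -> int:
    """Arithmetic right shift: vacated high bits copy the sign bit."""
    value &= MASK32
    if value & 0x80000000:      # reinterpret as a negative Python int
        value -= 1 << 32
    return (value >> amount) & MASK32
```

Shifting 0x80000000 right by 4 gives 0x08000000 with the logical shift and 0xF8000000 with the arithmetic one.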
EQ $argD, $arg1, $arg2
$argD <- ($arg1 == $arg2) ? 1 : 0
Stores a value of 1 in $argD if $arg1 is equal to $arg2; otherwise stores a 0.
NEQ $argD, $arg1, $arg2
$argD <- ($arg1 != $arg2) ? 1 : 0
Stores a value of 1 in $argD if $arg1 is not equal to $arg2; otherwise stores a 0.
LT $argD, $arg1, $arg2
$argD <- ($arg1 < $arg2) ? 1 : 0
Stores a value of 1 in $argD if $arg1 is less than $arg2; otherwise stores a 0.
LEQ $argD, $arg1, $arg2
$argD <- ($arg1 <= $arg2) ? 1 : 0
Stores a value of 1 in $argD if $arg1 is less than or equal to $arg2; otherwise stores a 0.
The Niu32 assembler provided here takes in an input Niu32 assembly program, and outputs an assembled program in Altera Memory Initialization File (MIF) format.
The assembler is written in Python 3, so the command should be prefixed with python or python3, depending on your system's default Python interpreter.
The syntax of the assembler is as follows:
n32-assemble.py <filename> [-o|--output <filename>] [-v|--verbose]
<filename>: The input filename of the Niu32 assembly program to assemble.
-o, --output: The output filename of the assembled program. If none is specified, the default is to strip the extension of the input filename and append .mif.
-v, --verbose: Print all intermediate output. Default is to suppress all intermediate output except errors.
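For readers who want to see how such a command line could be parsed, here is a sketch using Python's argparse. It mirrors the documented options but is not taken from n32-assemble.py itself; the function name parse_arguments and the example file name program.asm are hypothetical.

```python
import argparse
from pathlib import Path


def parse_arguments(argv):
    """Parse the documented options and fill in the default output name."""
    parser = argparse.ArgumentParser(prog="n32-assemble.py")
    parser.add_argument("filename", help="input Niu32 assembly program")
    parser.add_argument("-o", "--output", metavar="filename",
                        help="output filename of the assembled program")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print all intermediate output")
    args = parser.parse_args(argv)
    if args.output is None:
        # Default: strip the input extension and append .mif.
        args.output = str(Path(args.filename).with_suffix(".mif"))
    return args
```

Calling parse_arguments(["program.asm"]) would leave verbose off and set the output name to program.mif.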
The format of each instruction in an output MIF is as follows:
-- @ <MEMORY_LOCATION> : <INSTRUCTION>
<INSTRUCTION_NUM> : <ASSEMBLED_INSTRUCTION>
<MEMORY_LOCATION>: The location in memory where this instruction will be stored. The default is to start from 0x00000000; however, a memory location can be set manually anywhere in the program with the .ORIG assembler directive.
<INSTRUCTION>: The input instruction, as written.
<INSTRUCTION_NUM>: The index into instruction memory at which this instruction word can be found. For example, an instruction at location 0x0000000c would be found at index 00000003 in a 32-bit instruction memory.
<ASSEMBLED_INSTRUCTION>: The assembled hex instruction.
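The relationship between a memory location and its instruction index is a division by the word size. The Python sketch below shows that arithmetic; the function names are hypothetical, and printing the index as eight hexadecimal digits is an assumption based on the 00000003 example.

```python
WORD_SIZE = 4  # bytes per 32-bit instruction word


def instruction_index(memory_location: int) -> int:
    """Convert a byte address into a word index in instruction memory."""
    return memory_location // WORD_SIZE


def format_index(memory_location: int) -> str:
    """Render the index as 8 hex digits (assumed output width and radix)."""
    return f"{instruction_index(memory_location):08x}"
```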
These are ops which can be used in a Niu32 assembly program, but do not correspond directly to defined opcodes. The assembler takes responsibility for translating them into actual machine instructions, so the programmer can write them into an assembly program like any other instruction.
SUBI $argD, $arg1, imm
$argD <- $arg1 - imm
Subtracts imm from $arg1, and stores the result in $argD.
The assembler will negate imm and transform this into an ADDI instruction.
GT $argD, $arg1, $arg2
$argD <- ($arg1 > $arg2) ? 1 : 0
Stores a value of 1 in $argD if $arg1 is greater than $arg2; otherwise stores a 0.
The assembler will swap the order of $arg1 and $arg2 and transform this into a LT instruction.
GEQ $argD, $arg1, $arg2
$argD <- ($arg1 >= $arg2) ? 1 : 0
Stores a value of 1 in $argD if $arg1 is greater than or equal to $arg2; otherwise stores a 0.
The assembler will swap the order of $arg1 and $arg2 and transform this into a LEQ instruction.
NAND $argD, $arg1, $arg2
$argD <- ~($arg1 & $arg2)
Performs a NAND on $arg1 and $arg2 and stores the result in $argD.
The assembler will expand this into two separate AND and NOT instructions.
NOR $argD, $arg1, $arg2
$argD <- ~($arg1 | $arg2)
Performs a NOR on $arg1 and $arg2 and stores the result in $argD.
The assembler will expand this into two separate OR and NOT instructions.
NXOR $argD, $arg1, $arg2
$argD <- ~($arg1 ^ $arg2)
Performs an NXOR on $arg1 and $arg2 and stores the result in $argD.
The assembler will expand this into two separate XOR and NOT instructions.
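The rewrites described so far (negating an immediate, swapping operands, and splitting one op into two) can be sketched as a small Python function. This is an illustration rather than the assembler's actual code: the function name is hypothetical, and reusing the destination register as the intermediate for the NOT step is an assumption (the real assembler may use a different register, such as $at).

```python
def expand_pseudo_op(op, args):
    """Return the list of real (op, args) instructions for one pseudo-op."""
    if op == "SUBI":
        dest, src, imm = args
        return [("ADDI", [dest, src, -imm])]           # negate the immediate
    if op in ("GT", "GEQ"):
        dest, first, second = args
        real_op = {"GT": "LT", "GEQ": "LEQ"}[op]
        return [(real_op, [dest, second, first])]       # swap the operands
    if op in ("NAND", "NOR", "NXOR"):
        dest, first, second = args
        real_op = op[1:]                                # AND, OR, or XOR
        # Assumption: the intermediate result is held in the destination.
        return [(real_op, [dest, first, second]), ("NOT", [dest, dest])]
    return [(op, list(args))]                           # already a real op
```

For example, GT $t0, $t1, $t2 becomes LT $t0, $t2, $t1 under this sketch.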
CPY $argD, $arg1
$argD <- $arg1
Copies the value stored in $arg1 into $argD.
The assembler will transform this into an ADD instruction.
LA $argD, imm
$argD <- MemLoc(imm)
Stores the memory location of imm into $argD.
The assembler will expand this into LUI and ORI instructions.
LV $argD, imm
$argD <- imm
Stores the value of imm into $argD.
The assembler will expand this into LUI and ORI instructions.
CLR $argD
$argD <- $zero
Clears (zeroes-out) the contents of $argD.
The assembler will transform this into an ADD instruction.
BGT $arg1, $arg2, imm
$arg1 > $arg2 ? PC <- 4*imm : PC <- (PC + 4)
Branches to imm if $arg1 is greater than $arg2; otherwise, advances to the next instruction.
The assembler will swap the order of $arg1 and $arg2 and transform this into a BLT instruction.
BGE $arg1, $arg2, imm
$arg1 >= $arg2 ? PC <- 4*imm : PC <- (PC + 4)
Branches to imm if $arg1 is greater than or equal to $arg2; otherwise, advances to the next instruction.
The assembler will swap the order of $arg1 and $arg2 and transform this into a BLE instruction.
GOTO imm
PC <- 4*imm
Unconditionally branches to imm.
The assembler will transform this into a BEQ instruction.
JMP $argD
$ra <- (PC + 4), PC <- $argD
Jumps to the subroutine address stored in $argD, and stores the address of the instruction following the jump (PC + 4) as the return address in $ra.
The assembler will transform this into a JAL instruction.
RET
PC <- $ra
Sets the PC to the memory location stored in the $ra (return address) register. The current PC location will be lost.
The assembler will transform this into a JAL instruction.
PUSH $arg1
Mem[$sp] <- $arg1, $sp <- $sp - WORD_SIZE
Pushes the word value of $arg1 onto the stack, and grows the stack by moving the stack pointer one word toward lower addresses.
The assembler will expand this into SW and ADDI instructions.
POP $argD
$sp <- $sp + WORD_SIZE, $argD <- Mem[$sp]
Shrinks the stack by moving the stack pointer one word toward higher addresses, and pops the word value at the stack pointer into $argD.
The assembler will expand this into LW and ADDI instructions.
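The ordering of the two steps matters: PUSH stores first and then moves $sp, while POP moves $sp first and then loads. The Python sketch below models that behavior with a dictionary standing in for memory. The class name and the initial stack pointer value are hypothetical.

```python
WORD_SIZE = 4  # bytes per word


class StackModel:
    """Toy model of the PUSH/POP pseudo-ops acting on $sp and memory."""

    def __init__(self, initial_sp: int):
        self.sp = initial_sp   # next empty position on the stack
        self.memory = {}       # address -> 32-bit word

    def push(self, value: int) -> None:
        self.memory[self.sp] = value & 0xFFFFFFFF  # Mem[$sp] <- value
        self.sp -= WORD_SIZE                       # $sp <- $sp - WORD_SIZE

    def pop(self) -> int:
        self.sp += WORD_SIZE                       # $sp <- $sp + WORD_SIZE
        return self.memory[self.sp]                # result <- Mem[$sp]
```

Pushing 1 and then 2 and popping twice returns 2 and then 1, leaving the stack pointer where it started.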
Assembler directives are prefixed with a ., and are not mapped to machine instructions.
.NAME label 0xBEEF
Instructs the assembler to track a new variable in memory with the name (label) and value (0xBEEF) specified.
.ORIG 0xBEEF
Instructs the assembler to start the following instructions at the given memory location (0xBEEF). The assembler will throw an error if the memory location is not a multiple of the word size (4 bytes). Written in hexadecimal, word-aligned memory locations end with 0, 4, 8, or c.
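The alignment rule amounts to a remainder check. A minimal Python sketch follows; the function name and the error message are hypothetical and do not reproduce the assembler's actual wording.

```python
WORD_SIZE = 4  # bytes per word


def check_orig_address(address: int) -> int:
    """Return the address if it is word-aligned, otherwise raise an error."""
    if address % WORD_SIZE != 0:
        raise ValueError(
            f".ORIG address {address:#010x} is not a multiple of {WORD_SIZE}"
        )
    return address
```

Note that the placeholder value 0xBEEF used in this document is not itself a multiple of 4, so under this check a literal .ORIG 0xBEEF would be rejected, while 0xBEEC would be accepted.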
.WORD 0xBEEF
Instructs the assembler to put the given memory word at the assembler's currently tracked location in instruction memory (for example, if the next instruction would be placed at memory location 0x0000000c, the assembler would place the word 0xBEEF at that location instead).