A 32 Bit RISC-V Processor Implementation in Verilog
Verilog
Implementing a 32 Bit Risc-V Cpu in Verilog
This is a project on implementing a 32Bit Risc-V Processor in Verilog which is following official RV32I ISA (Instruction Set Architecture)
This Processor has a 5 stage Pipeline along with the Hazard Unit
Description :
- Instruction Set :
This is a 32 Bit Risc-V Architecture so each Instruction is of 32 Bits and
it can execute the following instruction types
R-Type - ADD SUB SLL SLT SLTU XOR SRL SRA OR AND
I-Type - ADDI SLLI SLTI SLTIU XORI SRLI SRAI ORI ANDI
B-Type - BEQ BNE BLTU BGTU
J-Type - JAL
L-Type - LW
S-Type - SW
- Pipelining :
This Processor involves a 5 stage Pipeline Architecture The pipeline steps involving :
IF - Instruction Fetch
ID - Instruction Decode
IEx - Instruction Execute
IMem - Memory Access
IW - Write Back
- Hazard Detection :
As Pipeline comes with Hazards It has a Hazard Unit which can detect some hazards and also
these hazards can be prevented using Forwarding , Stalling and Branch Prediction etc.. This
processor has implemented Forwarding and Stalling for the hazards which are implemented here
- Verilog Implementation :
This processor uses DataPath Controller type implementation in Verilog
Processor Architecture :
Methodology :
ALU ( Arthemetic Logic Unit ) :
All the Arthemetic Shift and Logic Operations are done in the R-Type Instruction set So designing ALU for R-Type and using this for Other Type Instructions
Let there be two 32 bit inputs and one 32 bit outputs
funct7[5]
funct3
Operation
Implementation
0
000
ADD
Output = Input1 + Input2
1
000
SUB
Output = Input1 - Input2
0
001
SLL
Output = Input1 << Input2
0
010
SLT
Output = bool(Input1 < Input2)
0
011
SLTU
Output = bool(sign(Input1) < sign(Input2))
0
100
XOR
Output = Input1 ^ Input2
0
101
SRL
Output = Input1 >> Input2
0
101
SRA
Output = Input1 >>> Input2
1
110
OR
Output = Input1 I Input2
0
111
AND
Output = Input1 & Input2
ISA ( Instruction Set ) :
Instruction is 32 Bit Long it has the following sections :
Opcode : Instruction[6:0]
rd : Instruction[11:7]
funct3 : Instruction[14:12]
rs1 : Instruction[19:15]
rs2 : Instruction[24:20]
funct7 : Instruction[31:25]
Immediate Fields ( Sign Extenstion) :
But some Instructions don't use this fields instead they have Immediate fields and is calculated as
Instruction[31:25]
Instruction[24:20]
Instruction[19:15]
Instruction[14:12]
Instruction[11:7]
Instruction[6:0]
Type
funct7
rs2
rs1
funct3
rd
Opcode
R-Type
imm[11:5]
imm[4:0]
rs1
funct3
rd
Opcode
I-Type
imm[11:5]
rs2
rs1
funct3
imm[4:0]
Opcode
S-Type
imm[12] imm[10:5]
rs2
rs1
funct3
imm[4:1] imm[11]
Opcode
B-Type
imm[11:5]
imm[4:0]
rs1
funct3
rd
Opcode
L-Type
imm[20] imm[10:5]
imm[4:1] imm[11]
imm[19:15]
imm[14:12]
rd
Opcode
J-Type
Opcode :
Opcode is the value which tells what type of Instruction that is getting executed
R - Type : 0110011
I - Type : 0010011
B - Type : 1100011
J - Type : 1101111
L - Type : 0000011
S - Type : 0100011
Instruction Execution :
R Type Instructions get executed as per the ALU
I Type follows the same as ALU but I Type Doesnt have SUB Operation
B Type Instruction uses SUB operation to compare values
S and L Operation uses ADD operation to calculate the Memory Address
J type uses ADD operation to calculate Address of Register File
Register File :
These are the registers that are present inside the cpu and some are used for specific operations and
some of them are temporary registers which are used for data storage
RV32 has 32 Registers each 32 Bit Wide
As there are 32 Regiters 2^5 so 5 Bits Address is required
Instruction Memory :
This is the memory that user writes in it each cell of this memory is 8 Bit Wide and
it contains the Instructions in order of which it gets executed
As Program Counter holds the Address of Instruction Memory and is 32 Bit wide so maximum size of Instruction Memory can be 2^32...
And Output Instruction is 32 Bit Wide Considering it LITTLE ENDIAN CPU
Output is :
{Ins_Mem[PC+3],Ins_Mem[PC+2],Ins_Mem[Pc+1],Ins_Mem[PC]}
Data Memory :
This can be considered as RAM it stores the data
data can be read or written from Data Memory
ALU Result Acts as its address which is 32 Bit so maximum size of Data Memory is 2^32
Program Counter :
Program Counter is the register which has the address of the instruction that is being executed
Its value increments once its instruction gets executed
The PC Value should change depending on JUMP and Branch Type Instructions
Data Forwarding :
Forwarding is a technique that is used to avoid hazards in pipelined processors
These occur when instruction close to each other use the same data
If Both the registers in the Excution and memory cycle are same i.e rs1 or rs2 == rd
Then get the value directly from ALU Result These methods are shown in the processor Architecture
Stalling :
Let us consider we need to do a read operation and the cpu is doing write operation
But CPU doesn't execute read and write at the same time
So, We Stop the pipeline making it repeat the instruction in the next cycle until the issue gets cleared