/Tomasulo

An out-of-order execution algorithm for pipeline CPU, implemented by verilog

Primary LanguageVerilog

Tomasulo

contributions welcome

End-term project for Computer Organisation Principle Course.
An efficient pipeline CPU based on tomasulo algorithm, implemented in verilog

Installation

$ git clone git@github.com:YanB25/Tomasulo.git

Usage

Manually add all the files in scource/ into vivado and just run.
Add files in test/, set one of them as top to try the testcases.

Algorithm Introduction

From wiki

Tomasulo’s algorithm is a computer architecture hardware algorithm for dynamic scheduling of instructions that allows out-of-order execution and enables more efficient use of multiple execution units. It was developed by Robert Tomasulo at IBM in 1967.

Description

Tomasulo algorithm is an out-of-order execution algorithm for pipeline CPU, which dynamically rearanges the order of instructions to minimize the idle time of execution units such as ALU and RAM.
The major innovations are below

  • Register renaming in hardware
  • Reservation stations for all excution units
  • Common Data Bus to broadcast signals asychronously

The Algorithm is a superior virsion for parellel compared to the use of scoreboarding or other earlier algorithms.

The Whole Picture

Terminology

  1. 块 存储信息的单位。若干有关联的数据放在一起称为块。例如op和func和rs,rd,rt等存储在一起,称为一个块。
  2. 标志位 用于标志“是”或“否”的位。
  3. 行 一行包括一个块和对应的标志位

Instruction Set

支持除分支指令外的大部分常用MIPS指令。 指令集编码 想要支持更多指令

Component

PC & ROM

没有从图中画出来。向指令队列发射指令(当指令队列非满时。)

Instruction Queue

指令队列。由于Tomasulo算法顺序发射指令,故由指令队列保证其发射的顺序性。当一个指令(例如add)对应的器件(ALU)不忙时,指令发射;否则发生结构冲突,需要阻塞等待。
more detail

Commom Data Bus

CDB.所有刚执行完得到的数据的广播都要通过该总线完成。
每个执行器件(如各个ALU),向CDB Helper发送require信号请求广播。
CDB保证其广播信号在一个周期内不发生更改。 more detail

CDB Queue

deprecated. 不使用CDB队列。
见CDB Helper

Register File

寄存器文件。时序电路。
当时钟下降沿到达后: 检查CDB的广播,若该广播的数据被监听,则将数据更新入寄存器文件中。
检查当前指令的rd,记录rd所等待的label。
Details here

Reservation Station

保留站。包括加减ALU保留站和乘除的保留站。

每个CPU周期,一条算逻运算指令将发射到对应ALU保留站处。
该指令要么所有的操作数都已准备好(立即可以被执行,label==0),或部分操作数由label(label != 0)代替,正等待CDB的广播。 more detail

ALU(FPU)

算术逻辑单或和浮点运算单元。
不同的运算需要不同的CPU周期完成。由于该种延迟,基于保留站的乱序执行可以为其大大提速。 more detail

Shining Points

传统Tomasulo教材资料只给出了算法的软件模拟实现或伪代码实现。具体的硬件实现会遇到许多瓶颈。本项目对其中的一些难点做了突破,体现了一些创新性。 more detail

Testcase & Known Bugs

all testcase(s) have been passed.
welcome to pull request to contribute more testcases.
for testcases information or known bugs,
check here

Bugs & Helps

To report a bug or get help, you can Issues page.

Contribute

To offer codes, please contact us in the Issues page.
You can refer to TODO-List find out what the project still need.