A SystemC Simulator for ReRam-based Neural Network

This project is a simulator for ReRam Crossbar devices. It can be compiled using cmake and running on Windows/Linux.

An updated version called XB-Sim-Star can be found at CRAFT-THU/XB-Sim-Star.

When using this software, please use the following citation:

@ARTICLE{8676276,
  author={Liu, He and Han, Jianhui and Zhang, Youhui},
  journal={IEEE Computer Architecture Letters}, 
  title={A Unified Framework for Training, Mapping and Simulation of ReRAM-Based Convolutional Neural Network Acceleration}, 
  year={2019},
  volume={18},
  number={1},
  pages={63-66},
  doi={10.1109/LCA.2019.2908374}}

Install

Install SystemC Library
- For Linux，download the code to local directory and unzip the source code. Step into its directory, ~/systemc-2.3.2 e.g. and run following commands:
```
 # Init build directory
 mkdir build
 cd build
 # using cmake to compile systemc
 cmake ..
 make
```
(Option)Install Cuda Library
- Follow the instruction step on cuda website.

Configuration

Download this repo into your own path.
For linux，using Cmake to compile the code。This project has already provide CMakeLists.txt, users could change the link libraries according your own needs. Execute following command in shell:

 # Init build directory
 mkdir build
 cd build
 # using cmake to compile 
 # default not using GPU
 cmake .. -DCMAKE_BUILD_TYPE=Release
 # or using GPU
 cmake .. -DCMAKE_BUILD_TYPE=Release -DUSE_CUDA=on 
 make
 # copy the executable file to upper directory and run
 cp simulator-windows ../
 ./simulator-windows

Dataset

The dataset used in this project is the Test part in cifar-10 dataset, 10000 pictures in total. And they are already transformed into 3*1024(32*32) and put into input.
Labels of Testset is labels.csv and put under source code directory.

Run

Parameter Configuration

Circuit parameters(config.h)

DA Reference Voltage	AD Reference Voltage	DA width	AD width	Crossbar length	Crossbar width	Number of Crossbar in each Tile
DA_V	AD_V	DA_WIDTH	AD_WIDTH	CROSSBAR_L	CROSSBAR_W	CROSSBAR_N

Neural Network parameters(config.h)

Kernel size	Input data size	Channels of picture	Size of picture	Number of input picture	pooling size
KERNEL_SIZE	INPUT_SIZE(KERNEL_SIZE*KERNEL_SIZE)	CHANNELS_3/32/48/80/128	IMAGE_SIZE_32/16/8	PICTURE_NUM	POOLING_SIZE_1/2/8

Code Generation of Neural Network Structure

This project generate codes by pre-defined template module using Python code cpp_gen.py. The template code of each module is:

`conv_cpp.template`	`conv_buffer_cpp.template`	`linear_cpp.template`	`linear_buffer_cpp.template`
Convolution layer module template	Buffer module template between convolution layer	Fully connected layer module template	Buffer module between fully connected layer

The network needed structure list stored in cpp_gen.py and users can change this file to generate different NN structure

The code generation command is execute_process in CMakeLists.txt, and it is influenced by compile option USE_CUDA. Users don't have to run code generation command separately. The generated code will be put into generated directory.

It will generate the headers of NN structure stage_conv_*.h, conv_buffer_*.h, stage_linear_*.h, linear_buffer_*.h.
And they will be included in headers.cpp, using sc_signal to connect layers in main.cpp.

Performance

Accuracy
- AD/DA module included：83.5%
- with noise：82.4%

Case

Using VGG network to classify CIFAR-10 dataset
- This project generated a VGG network by default, it contains 15 convolution layers and 2 fully connected layers, all of the weights (high dimension convolution kernel) have been transformed into 2-D matrix in CROSSBAR_L*CROSSBAR_W for hte computation of ReRam Crossbar.
- All weight matrices has been transformed before and stored in weights. The weight conversion just like the following figure:

Support module replication
Multi Crossbars in each Tile

面向Crossbar的SystemC模拟器

本项目是面向Crossbar器件的模拟器，支持Visual Studio和cmake编译，可运行在Windows和Linux(Mac)下。

内容列表

安装
数据集
运行
性能
样例
下一步工作

安装

安装SystemC库
- 对于Windows用户，将systemc代码下载到本地之后，解压缩代码包。进入systemc文件夹，如E:\systemc-2.3.2\msvc10\SystemC中，用VS打开SystemC.vcxproj。对打开的项目分别用Debug模式和Release模式编译。
- 对于Linux/Mac用户，将代码下载到本地之后，解压缩代码包。进入systemc文件夹，如~/systemc-2.3.2中，执行如下语句:
```
 # 创建build文件夹
 mkdir build
 cd build
 # 使用cmake对systemc编译
 cmake ..
 make
```
(可选)安装Cuda库
- 可参照cuda官网的的安装教程进行安装。
配置工程文件
- 对于Windows用户，使用Visual Studio进行开发，可根据本项目中提供的Visual Studio的配置文件进行配置。注意，如下配置须自行修改：
  (Debug和Release模式) 项目属性→C/C++→常规→附加包含目录→(SystemC库代码所在目录，如E:\systemc-2.3.2\src)
  (Debug模式) 项目属性→链接器→常规→附加库目录→(Debug库所在目录，如E:\systemc-2.3.2\msvc10\SystemC\Debug)
  (Release模式) 项目属性→链接器→常规→附加库目录→(Release库所在目录，如E:\systemc-2.3.2\msvc10\SystemC\Release)
- 对于Linux/Mac用户，使用Cmake进行代码编译。本项目已提供CMakeLists.txt文件，可根据需要自行修改链接库的路径。编译代码时需执行如下语句：
```
 # 创建build文件夹
 mkdir build
 cd build
 # 使用cmake对systemc编译
 # 默认不使用GPU进行计算
 cmake .. -DCMAKE_BUILD_TYPE=Release
 # 或者使用GPU计算
 cmake .. -DCMAKE_BUILD_TYPE=Release -DUSE_CUDA=on 
 make
 # 将可执行文件拷贝到外层文件夹并执行
 cp simulator-windows ../
 ./simulator-windows
```

数据集

本项目所用测试数据为cifar-10数据集的测试部分，总计10000张图片，已经将输入数据转换为3*1024(32*32)，并且放入input文件夹下。
测试集标签为labels.csv文件，放在项目源代码目录下。

运行

参数配置

电路相关参数(config.h)

DA参考电压	AD参考电压	DA宽度	AD宽度	Crossbar长度	Crossbar宽度	每个Tile中Crossbar个数
DA_V	AD_V	DA_WIDTH	AD_WIDTH	CROSSBAR_L	CROSSBAR_W	CROSSBAR_N

网络模型相关参数(config.h)

卷积核大小	输入数据大小	图像通道数	图像尺寸	输入图片数目	池化大小
KERNEL_SIZE	INPUT_SIZE(KERNEL_SIZE*KERNEL_SIZE)	CHANNELS_3/32/48/80/128	IMAGE_SIZE_32/16/8	PICTURE_NUM	POOLING_SIZE_1/2/8

网络结构代码生成

本项目使用Python代码cpp_gen.py，根据预先写好的模块模板来生成代码。各模块的模板文件为如下：

`conv_cpp.template`	`conv_buffer_cpp.template`	`linear_cpp.template`	`linear_buffer_cpp.template`
卷积层模块模板	卷积层间Buffer模块模板	全连接层模块模板	全连接层间Buffer模块模板

在cpp_gen.py文件中，存放了所需网络的结构列表，可通过修改该文件来生成不同的网络结构。

代码生成语句为CMakeLists.txt文件中的execute_process指令，受编译选项USE_CUDA影响，不需要单独执行代码生成指令。生成的代码将放入generated文件夹下。

可以生成所需的网络结构头文件stage_conv_*.h, conv_buffer_*.h, stage_linear_*.h, linear_buffer_*.h.
在headers.cpp文件中包含以上生成的头文件，并且根据各层之间的连接关系，在main.cpp文件中通过sc_signal进行串联。

性能

准确率
- 包括AD/DA模块：83.5%
- 加入噪音：82.4%

样例

使用VGG-16网络对CIFAR-10数据进行分类
- 本项目默认生成了一个VGG-16网络，包括15个卷积层和2个全连接层，并且所有的权重（高维卷积核）全部转换为CROSSBAR_L*CROSSBAR_W的二维矩阵用于ReRam Crossbar计算。
- 所有权重矩阵已经预先转换完成，并且存放在weights文件夹中。权重矩阵的转换方式如下图所示：

下一步工作

支持模块复用
每个Tile中多个Crossbar的完善

CRAFT-THU/XB-Sim