Hardware accelerator for convolutional neural networks, implemented in Verilog HDL and the C programming language. For more information about this accelerator, see https://repositorio.uniandes.edu.co/bitstream/handle/1992/55502/26239.pdf?sequence=1
The dataflow architecture of the accelerator is an adaptation of the BSM (broadcast, stay, migration) dataflow introduced by Jihyuck Jo et al., which is energy-efficient because it reduces the number of redundant accesses to off-chip memory.
The convolution accelerator was deployed on the DE0-Nano-SoC FPGA board together with a Nios II processor, an on-chip RAM, and an on-chip dual-port RAM, all connected via an Avalon interconnect fabric. The Intel FPGA Monitor Program was used to read the results of the convolution performed by the accelerator from the on-chip dual-port RAM.
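
One way to check the values read back from the on-chip dual-port RAM is to compare them against a software reference. The following is a minimal sketch in C, not code from this repository: the function names `conv2d_ref` and `first_mismatch`, the 16-bit input and 32-bit output types, and the single-channel, stride-1, no-padding layout are all assumptions made for illustration and may differ from what the accelerator actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Software reference for a single-channel 2D convolution (stride 1, no
 * padding), in the cross-correlation form commonly used by CNNs.
 * Names, data types, and memory layout are illustrative assumptions,
 * not taken from the accelerator's sources.
 *
 * in  : h x w input feature map, row-major
 * k   : kh x kw kernel, row-major
 * out : (h - kh + 1) x (w - kw + 1) output, row-major
 */
void conv2d_ref(const int16_t *in, size_t h, size_t w,
                const int16_t *k, size_t kh, size_t kw,
                int32_t *out)
{
    /* The kernel must be non-empty and fit inside the input. */
    assert(kh >= 1 && kw >= 1 && kh <= h && kw <= w);

    size_t oh = h - kh + 1; /* output height */
    size_t ow = w - kw + 1; /* output width  */

    for (size_t r = 0; r < oh; r++) {
        for (size_t c = 0; c < ow; c++) {
            int32_t acc = 0;
            /* Multiply-accumulate over the kernel window at (r, c). */
            for (size_t i = 0; i < kh; i++) {
                for (size_t j = 0; j < kw; j++) {
                    acc += (int32_t)in[(r + i) * w + (c + j)]
                         * (int32_t)k[i * kw + j];
                }
            }
            out[r * ow + c] = acc;
        }
    }
}

/*
 * Compare expected values with values read back from memory.
 * Returns the index of the first mismatch, or n if all n entries match.
 */
size_t first_mismatch(const int32_t *expected, const int32_t *actual, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (expected[i] != actual[i]) {
            return i;
        }
    }
    return n;
}
```

To use it, copy the values shown by the monitor program into an array, compute the expected output with `conv2d_ref` for the same input and kernel, and pass both arrays to `first_mismatch`; a return value equal to the number of outputs means every entry matched.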