cvBench: A C++ repository from lvwenhao2017

Benchmarking Analysis of Vision Kernels on Embedded CPU, GPU and FPGA:

This repository contains a benchmark framework for measuring and comparing the energy efficiency of vision kernels on embedded platforms. It aims to give the computer vision community an easy tool for analyzing the performance of vision kernels on different hardware architectures, and for determining which architecture best suits a given kind of vision application.

Repository Structure
Hardware and Software Environments
List of Vision Kernels
Installation
Build Test Codes
Results Summary
Reference
License

Repository Structure

This repository consists of:

.
├── FPGATests
│   └── README.md
├── GPUTests
│   ├── Geometric Transforms 
│   ├── Image Analysis  
│   ├── Image Arithmatic 
│   ├── Image Features 
│   ├── Image Filters 
│   ├── Input Processing 
│   ├── Optical Flow & Depth 
│   └── README.md 
├── PYNQ-ComputerVision
│   ├── applicationCode 
│   │   ├── overlayTests 
│   │   └── unitTests
│   ├── boards
│   │   ├── Pynq-Z1
│   │   ├── Pynq-Z2
│   │   ├── Ultra96
│   │   └── ZCU104
│   ├── components 
│   └── frameworks
│       ├── cmakeModules 
│       └── utilities    
└── README.md

Hardware and Software Environments

Software Environments:
- OpenCV 3.x: https://www.opencv.org/
- xfOpenCV 2018.x: https://github.com/Xilinx/xfopencv
- VisionWorks 1.6: https://developer.nvidia.com/embedded/jetpack

Hardware Platforms:
- FPGA board: Xilinx Zynq ZCU102/ZCU104, PYNQ-Z1/Z2 or Ultra96.
- GPU board: NVIDIA Jetson TX1/TX2.

List of Vision Kernels

| Input Processing | Image Arithmetic | Filters | Image Analysis | Geometric Transforms | Composite Kernels |
|---|---|---|---|---|---|
| combine | AbsDiff | filter2D | calcHist | affine warp | canny |
| extract | accumulate | box filter | equalizeHist | perspective warp | fast |
| convertTo | accumulate squared | dilate | integral image | resize | harris |
| cvtConvert | accumulate weighted | erode | mean std dev | remap | optical flow pyramid |
| table lookup | add/subtract | median | min/max loc | | stereoBM |
| | multiply | pyramidUp | | | |
| | threshold | pyramidDown | | | |
| | bitwise and, or, xor, not | | | | |
| | magnitude | | | | |
| | phase | | | | |
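
As an illustration of how a single kernel from this list might be timed, here is a minimal C++ sketch. It is not taken from the repository's test code: `absDiff` is a plain stand-in loop rather than the OpenCV, xfOpenCV or VisionWorks implementation, and `meanLatencyMs`, its parameters, and the warm-up count are hypothetical names and values chosen for this example.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Stand-in kernel (NOT a library implementation): per-pixel absolute
// difference of two 8-bit images stored as flat buffers of equal size.
std::vector<std::uint8_t> absDiff(const std::vector<std::uint8_t>& a,
                                  const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size());
    std::vector<std::uint8_t> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int d = static_cast<int>(a[i]) - static_cast<int>(b[i]);
        out[i] = static_cast<std::uint8_t>(std::abs(d));
    }
    return out;
}

// Hypothetical helper: runs `kernel` for `warmup` untimed iterations, then
// `iterations` timed ones, and returns the mean wall-clock time per run in ms.
template <typename Kernel>
double meanLatencyMs(Kernel&& kernel, int iterations, int warmup = 3) {
    assert(iterations > 0);
    for (int i = 0; i < warmup; ++i) kernel();
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) kernel();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

A call such as `meanLatencyMs([&] { out = absDiff(a, b); }, 20)` would time the stand-in on two buffers `a` and `b`. Storing the result in `out` keeps the compiler from discarding the work. On a GPU or FPGA the timed callable would also need to include any synchronization and data transfer that the measurement is meant to cover.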

Installation

To clone the repository with PYNQ-ComputerVision submodules, open a terminal and execute:

git clone --recursive https://github.com/ISU-RCL/cvBench.git

Build Test Codes

The steps required to build and run the unit tests are described in:

Results Summary

In our experiment, we evaluated the performance of vision kernels on two popular platforms for deploying embedded vision applications:

- NVIDIA Jetson TX2 (256-core Pascal GPU + ARM Cortex-A57 CPU).
- Xilinx Zynq UltraScale+ ZCU102 (XCZU9EG FPGA + ARM Cortex-A53 CPU).

The figures below show the energy/frame (in mJ/f) comparison results.
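
For readers unfamiliar with the mJ/f unit, the sketch below shows one way energy per frame can be derived from average power and timing. This README does not describe the measurement procedure, so the code assumes a constant average power over the run, measured externally. Both function names are hypothetical.

```cpp
#include <cassert>

// Energy per frame in millijoules from average power and per-frame latency.
// Watts multiplied by milliseconds gives millijoules directly.
// Assumption: power is roughly constant while the frame is processed.
double energyPerFrameFromLatencyMj(double avgPowerWatts, double latencyMs) {
    assert(avgPowerWatts >= 0.0 && latencyMs >= 0.0);
    return avgPowerWatts * latencyMs;
}

// The same quantity from a whole run: total energy (W * s = J), converted
// to millijoules, divided by the number of frames processed.
double energyPerFrameFromRunMj(double avgPowerWatts, double totalSeconds,
                               long frames) {
    assert(avgPowerWatts >= 0.0 && totalSeconds >= 0.0 && frames > 0);
    return avgPowerWatts * totalSeconds * 1000.0 / static_cast<double>(frames);
}
```

For example, a kernel drawing an average of 5 W with a 2 ms latency would cost 10 mJ per frame, and a 10 s run at 5 W that processes 100 frames would cost 500 mJ per frame. These numbers are purely illustrative and are not results from the experiment.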

Reference

@inproceedings{EnergyEfficiencyFCCM2019,
  title={Analyzing the Energy-Efficiency of Vision Kernels on Embedded CPU, GPU and FPGA Platforms},
  author={Qasaimeh, Murad and Denolf, Kristof and Lo, Jack and Vissers, Kees and Zambreno, Joseph and Jones, Phillip H},
  booktitle={2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)},
  year={2019},
  organization={IEEE}
}

License

The source for this project is licensed under the 3-Clause BSD License.
