CNN-Inference-Engine-Quick-View

A quick view of high-performance convolutional neural network (CNN) inference engines on mobile devices.

Runtime-speed Comparisons

Data-flow / Graph Optimization

FLOAT32-Support

| Framework | Main Platform | Model Compatibility | Detection-Support | Speed Benchmarks |
|---|---|---|---|---|
| Bolt | CPU (ARM optimized) / x86 / Mali GPU | Caffe / Tensorflow / PyTorch / onnx | Y | Link |
| TNN | CPU (ARM optimized) / Mali / Adreno / Apple GPU | Caffe / Tensorflow / PyTorch | Y | Link |
| PPLNN | CPU (ARM/x86 optimized) / Nvidia GPU | onnx | Y | Link / Link |
| Paddle-Light | CPU (ARM optimized) / Mali GPU / FPGA / NPU | Paddle / Caffe / onnx | Y | Link |
| MNN | CPU (ARM optimized) / Mali GPU | Caffe / Tensorflow / onnx | Y | Link |
| NCNN | CPU (ARM optimized) / Mali GPU | Caffe / PyTorch / mxnet / onnx | Y | 3rd party Link / Official Link |
| MACE | CPU (ARM optimized) / Mali GPU / DSP | Caffe / Tensorflow / onnx | Y | Link |
| TEngine | CPU (ARM A72 optimized) | Caffe / mxnet | Y | Link |
| AutoKernel | CPU / GPU / NPU | Caffe / mxnet / Tensorflow / PyTorch / Darknet | Y | Link |
| Synet | CPU (ARM optimized) / x86 | Caffe / PyTorch / Tensorflow / mxnet / onnx | Y | - |
| MsnhNet | CPU (ARM optimized) / Mali GPU / x86 / TensorRT | PyTorch | Y | Link |
| ONNX-Runtime | CPU / Nvidia GPU | onnx | Y | - |
| HiAI | Kirin CPU / NPU | Caffe / Tensorflow | Y | - |
| NNIE | NPU | Caffe | Y | 1TOPs |
| Intel-Caffe | CPU (Intel optimized) | Caffe | Y | Link |
| FeatherCNN | CPU (ARM optimized) | Caffe | N | Link / unofficial Link |
| Tensorflowlite | CPU (Android optimized) | Caffe2 / Tensorflow / onnx | Y | Link |
| TensorRT | GPU (Volta optimized) | Caffe / Tensorflow / onnx | Y | Link |
| TVM | CPU (ARM optimized) / Mali GPU / FPGA | onnx | Y | - |
| SNPE | CPU (Qualcomm optimized) / GPU / DSP | Caffe / Caffe2 / Tensorflow / onnx | Y | Link |
| Pocket-Tensor | CPU (ARM/x86 optimized) | Keras | N | Link |
| ZQCNN | CPU | Caffe / mxnet | Y | Link |
| ARM-NEON-to-x86-SSE | CPU (Intel optimized) | Intrinsics-Level | - | - |
| Simd | CPU (all platform optimized) | Intrinsics-Level | - | - |
| clDNN | Intel® Processor Graphics / Iris™ Pro Graphics | Caffe / Tensorflow / mxnet / onnx | Y | Link |

FIX16-Support

| Framework | Main Platform | Model Compatibility | Detection-Support | Speed Benchmarks |
|---|---|---|---|---|
| Bolt | CPU (ARM optimized) / x86 / Mali GPU | Caffe / Tensorflow / PyTorch | Y | Link |
| ARM32-SGEMM-LIB | CPU (ARM optimized) | GEMM Library | N | Link |
| TNN | CPU (ARM optimized) / Mali / Adreno / Apple GPU | Caffe / Tensorflow / PyTorch | Y | Link |
| Yolov2-Xilinx-PYNQ | FPGA (Xilinx PYNQ) | Yolov2-only | Y | Link |

INT8-Support

| Framework | Main Platform | Model Compatibility | Detection-Support | Speed Benchmarks |
|---|---|---|---|---|
| Bolt | CPU (ARM optimized) / x86 / Mali GPU | Caffe / Tensorflow / PyTorch | Y | Link |
| Intel-Caffe | CPU (Intel Skylake) | Caffe | Y | Link |
| TNN | CPU (ARM optimized) / Mali / Adreno / Apple GPU | Caffe / Tensorflow / PyTorch | Y | Link |
| PPLNN | Nvidia GPU optimized | onnx | Y | Link |
| NCNN | CPU (ARM optimized) | Caffe / PyTorch / mxnet / onnx | Y | Link |
| Paddle-Light | CPU (ARM optimized) / Mali GPU / FPGA | Paddle / Caffe / onnx | Y | Link |
| MNN | CPU (ARM optimized) / Mali GPU | Caffe / Tensorflow / onnx | Y | Link |
| Tensorflowlite | CPU (ARM) | Caffe2 / Tensorflow / onnx | Y | Link |
| TensorRT | GPU (Volta) | Caffe / Tensorflow / onnx | Y | Link |
| Gemmlowp | CPU (ARM / x86) | GEMM Library | - | - |
| SNPE | DSP (Quantized DLC) | Caffe / Caffe2 / Tensorflow / onnx | Y | Link |
| MACE | CPU (ARM optimized) / Mali GPU / DSP | Caffe / Tensorflow / onnx | Y | Link |
| TF2 | FPGA | Caffe / PyTorch / Tensorflow | Y | Link |
| TVM | CPU (ARM optimized) / Mali GPU / FPGA | onnx | Y | Link |

TERNARY-Support

| Framework | Main Platform | Model Compatibility | Detection-Support | Speed Benchmarks |
|---|---|---|---|---|
| Gemmbitserial | CPU (ARM / x86) | GEMM Library | - | Link |

BINARY-Support

| Framework | Main Platform | Model Compatibility | Detection-Support | Speed Benchmarks |
|---|---|---|---|---|
| Bolt | CPU (ARM optimized) / x86 / Mali GPU | Caffe / Tensorflow / PyTorch | Y | Link |
| BMXNET | CPU (ARM / x86) / GPU | mxnet | Y | Link |
| DABNN | CPU (ARM) | Caffe / Tensorflow / onnx | N | Link |
| Espresso | GPU | - | N | Link |
| BNN-PYNQ | FPGA (Xilinx PYNQ) | - | N | Link |
| FINN | FPGA (Xilinx) | - | N | Link |

NLP-Support

| Framework | Main Platform | Model Compatibility | Speed Benchmarks |
|---|---|---|---|
| TurboTransformers | CPU / Nvidia GPU | PyTorch | Link |
| Bolt | CPU / Mali GPU | Caffe / onnx | Link |