FeatherCNN

FeatherCNN is a high performance inference engine for convolutional neural networks.

Introduction

FeatherCNN is a high-performance, lightweight CNN inference library developed by the Tencent AI Platform Department. FeatherCNN originates from our Game AI project for King of Glory (Chinese: 王者荣耀), in which we aim to build a neural model for MOBA game AI, deploy it with the game, and run it on mobile devices. FeatherCNN targets ARM CPUs.

Compared with other libraries, FeatherCNN has the following features:

  • High Performance: FeatherCNN delivers state-of-the-art inference performance on a wide range of devices, including mobile phones (iOS/Android), embedded devices (Linux), and ARM-based servers (Linux).

  • Easy Deployment: FeatherCNN packs everything into a single code base with no third-party dependencies, which simplifies deployment on mobile platforms.

  • Featherweight: The compiled FeatherCNN library is small (hundreds of KB).

Please open an issue in this repo for bug reports and enhancement suggestions. We are grateful for user feedback and will actively polish this library.

Quick Guide for Ubuntu Hosts and ARM-Linux Targets

If you are using Ubuntu and want to test on an ARM-Linux device, here's a quick guide.

Host-side compilation

  • Install compilers
sudo apt-get install cmake
sudo apt-get install g++-aarch64-linux-gnu
  • Download source code
git clone http://github.com/tencent/FeatherCNN
  • Compile and install
cd FeatherCNN
./build_scripts/build_linux.sh
./build_scripts/build_linux_test.sh

Tip: If you are using an OpenCL GPU, make sure you have added the FEATHER_OPENCL macro when compiling the apps.

Device-side test example

The following command runs a benchmark for a given network, input data file, loop count, and thread count. You can also use this program to check results.

./feather_benchmark [feathermodel] [input_data] [loops] [threads number]

An example:

./feather_benchmark ./data/mobilenet.feathermodel ./data/input_3x224x224.txt 20 4

Detailed Instructions for iOS/Android/Linux

Build From Source

iOS Guide

Android Guide

Android ADB Guide

Usage

Model Format Conversion

FeatherCNN accepts Caffe models. It merges the structure file (.prototxt) and the weight file (.caffemodel) into a single binary model (.feathermodel). The conversion tool requires protobuf, but the library itself does not.

Model Convert Guide.

Runtime Interfaces

The basic user interfaces are listed in feather/net.h. Currently, we use raw pointers to reference data. We may provide more convenient interfaces in the near future.

Before inference, FeatherCNN requires two steps to initialize the network.

feather::Net<DATA_TYPE> forward_net(num_threads, DEVICE_TYPE);
forward_net.InitFromPath(FILE_PATH_TO_FEATHERMODEL);

The net can also be initialized from raw buffers or FILE pointers. On the CPU, DATA_TYPE should be float, while the GPU can handle both float and uint16_t (i.e., fp16/half). The options for DEVICE_TYPE are:

  • DeviceType::CPU
  • DeviceType::GPU_CL
  • DeviceType::GPU_GL (not implemented yet, contributors are welcome)
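
Because the fp16 path stores half-precision values in plain uint16_t, it can help to see what those bit patterns mean. The sketch below converts between float and IEEE 754 binary16 bits using round-to-nearest-even. The helper names FloatToHalfBits and HalfBitsToFloat are hypothetical and are not part of the FeatherCNN API; whether and where you need such a conversion depends on how your application feeds and reads GPU buffers.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a FeatherCNN function): float -> binary16 bits,
// rounding to nearest, ties to even.
inline std::uint16_t FloatToHalfBits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    const std::uint32_t sign = (bits >> 16) & 0x8000u;
    const std::uint32_t exp32 = (bits >> 23) & 0xFFu;
    std::uint32_t mant = bits & 0x007FFFFFu;

    if (exp32 == 0xFFu) {  // Inf or NaN (keep NaN as a quiet NaN)
        return static_cast<std::uint16_t>(sign | 0x7C00u | (mant ? 0x0200u : 0u));
    }
    const int exp16 = static_cast<int>(exp32) - 127 + 15;  // re-bias exponent
    if (exp16 >= 31) {  // too large for half: becomes infinity
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    }
    if (exp16 <= 0) {  // result is a half subnormal or zero
        if (exp16 < -10) return static_cast<std::uint16_t>(sign);
        mant |= 0x00800000u;           // restore the implicit leading 1
        const int shift = 14 - exp16;  // 14..24
        std::uint32_t half_mant = mant >> shift;
        const std::uint32_t rem = mant & ((1u << shift) - 1u);
        const std::uint32_t halfway = 1u << (shift - 1);
        if (rem > halfway || (rem == halfway && (half_mant & 1u))) ++half_mant;
        return static_cast<std::uint16_t>(sign | half_mant);
    }
    std::uint32_t half = (static_cast<std::uint32_t>(exp16) << 10) | (mant >> 13);
    const std::uint32_t rem = mant & 0x1FFFu;
    // A carry out of the mantissa correctly bumps the exponent.
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) ++half;
    return static_cast<std::uint16_t>(sign | half);
}

// Hypothetical helper: binary16 bits -> float (always exact).
inline float HalfBitsToFloat(std::uint16_t half) {
    const std::uint32_t sign = (static_cast<std::uint32_t>(half) & 0x8000u) << 16;
    const std::uint32_t exp16 = (half >> 10) & 0x1Fu;
    std::uint32_t mant = half & 0x03FFu;
    std::uint32_t bits;
    if (exp16 == 0x1Fu) {            // Inf or NaN
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp16 != 0u) {        // normal number
        bits = sign | ((exp16 + 112u) << 23) | (mant << 13);
    } else if (mant == 0u) {         // signed zero
        bits = sign;
    } else {                         // subnormal: normalize the mantissa
        std::uint32_t exp32 = 113u;
        while ((mant & 0x0400u) == 0u) { mant <<= 1; --exp32; }
        bits = sign | (exp32 << 23) | ((mant & 0x03FFu) << 13);
    }
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Values outside the half range (magnitude above 65504) become infinity, so inputs and weights with a large dynamic range may need float instead.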

We can then perform forward computation with a raw float* buffer.

forward_net.Forward(PTR_TO_YOUR_INPUT_DATA);

The output can be extracted from the net by blob name. Blob names are kept consistent with the Caffe prototxt.

forward_net.ExtractBlob(PTR_TO_YOUR_OUTPUT_BUFFER, BLOB_NAME);

You can also get the blob's data size by calling:

size_t data_size = 0;
forward_net.GetBlobDataSize(&data_size, BLOB_NAME);
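
Putting the calls above together, a small helper can run one forward pass and return the named blob as a std::vector<float>. This is only a sketch: RunAndExtract is a hypothetical name, the helper is templated so it works with any type exposing Forward, GetBlobDataSize, and ExtractBlob as shown above (such as feather::Net<float>), and it assumes that GetBlobDataSize reports a number of float elements rather than bytes. Return values of the three calls are ignored because their meaning is not described here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical convenience wrapper (not part of feather/net.h).
// NetT must provide Forward(float*), GetBlobDataSize(size_t*, name) and
// ExtractBlob(float*, name), matching the calls shown in this README.
template <typename NetT>
std::vector<float> RunAndExtract(NetT& net, float* input,
                                 const std::string& blob_name) {
    // 1. Run inference on the caller-owned input buffer.
    net.Forward(input);

    // 2. Ask how much data the output blob holds.
    //    Assumption: the size is an element count, not a byte count.
    std::size_t data_size = 0;
    net.GetBlobDataSize(&data_size, blob_name);

    // 3. Copy the blob into a buffer we own, so no raw pointer escapes.
    std::vector<float> output(data_size);
    if (data_size > 0) {
        net.ExtractBlob(output.data(), blob_name);
    }
    return output;
}
```

With a net initialized as in the earlier snippet, a call might look like `RunAndExtract(forward_net, PTR_TO_YOUR_INPUT_DATA, BLOB_NAME)`.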

Tip: The GPU_CL implementation does not include a softmax layer. If you are using the GPU, remove the softmax layer from the Caffe prototxt and compute softmax manually on the network output.
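
For the manual softmax step, a minimal sketch is shown below. SoftmaxInPlace is a hypothetical helper, not a FeatherCNN function, and it assumes the extracted output is a single flat array of class scores for one input, so softmax is taken over the whole buffer. Subtracting the maximum score before exponentiating avoids overflow for large scores.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical helper: numerically stable softmax over n scores, in place.
// Assumes `data` holds the raw scores of one sample, laid out contiguously.
inline void SoftmaxInPlace(float* data, std::size_t n) {
    if (data == nullptr || n == 0) return;

    // Shift by the maximum so the largest exponent is exp(0) = 1.
    const float max_val = *std::max_element(data, data + n);

    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = std::exp(data[i] - max_val);
        sum += data[i];
    }
    // sum >= 1 because the maximum element contributes exactly 1.
    for (std::size_t i = 0; i < n; ++i) {
        data[i] /= sum;
    }
}
```

It can be applied directly to the buffer filled by ExtractBlob, with n taken from GetBlobDataSize if that value is an element count.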

Performance Benchmarks

We have tested FeatherCNN on a range of devices; see this page for details.

User Groups

Telegram: https://t.me/FeatherCNN

QQ: 728147343