C++ Helper Class for Deep Learning Inference Frameworks: TensorFlow Lite, TensorRT, OpenCV, OpenVINO, ncnn, MNN, SNPE, Arm NN, NNabla, ONNX Runtime, LibTorch, TensorFlow

Inference Helper

  • This is a wrapper for deep learning frameworks, intended especially for inference
  • This class provides a common interface to various deep learning frameworks, so that you can use the same application code with any of them

Supported frameworks

  • TensorFlow Lite
  • TensorFlow Lite with delegate (XNNPACK, GPU, EdgeTPU, NNAPI)
  • TensorRT (GPU, DLA)
  • OpenCV(dnn) (with GPU)
  • OpenVINO with OpenCV (xml+bin)
  • ncnn (with Vulkan)
  • MNN (with Vulkan)
  • SNPE (Snapdragon Neural Processing Engine SDK (Qualcomm Neural Processing SDK for AI v1.51.0))
  • Arm NN
  • NNabla (with CUDA)
  • ONNX Runtime (with CUDA)
  • LibTorch (with CUDA)
  • TensorFlow (with GPU)

Overview

Supported targets

  • Windows 10 (Visual Studio 2019 x64)
  • Linux (x64, armv7, aarch64)
  • Android (armeabi-v7a, arm64-v8a)

CI Status

Framework                   Windows (x64)  Linux (x64)  Linux (armv7)  Linux (aarch64)  Android (aarch64)
CI workflow                 CI Windows     CI Ubuntu    CI Arm         CI Arm           CI Android
TensorFlow Lite             Build, Test    Build, Test  Build, Test    Build, Test      Build, Test
TensorFlow Lite + XNNPACK   Build, Test    Build, Test  Unsupported    Build, Test      Build, Test
TensorFlow Lite + EdgeTPU   Build, Test    Build, Test  Build, Test    Build, Test      Unsupported
TensorFlow Lite + GPU       No library     No library   No library     No library       Build, Test
TensorFlow Lite + NNAPI     Unsupported    Unsupported  Unsupported    Unsupported      Build, Test
TensorRT                    Build, Test    Build, Test  Build, Test    Build, Test      Unsupported
OpenCV(dnn)                 Build, Test    Build, Test  Build, Test    Build, Test      Build, Test
OpenVINO with OpenCV        Build, Test    Build, Test  Build, Test    Build, Test      Unsupported
ncnn                        Build, Test    Build, Test  No library     No library       Build, Test
MNN                         Build, Test    Build, Test  No library     Build, Test      Build, Test
SNPE                        Unsupported    Unsupported  Build, Test    Build, Test      Build, Test
Arm NN                      Unsupported    Build, Test  Unsupported    Build, Test      No library
NNabla                      Build, Test    No library   Unsupported    No library       No library
ONNX Runtime                Build, Test    Build, Test  Unsupported    Build, Test      Build, Test
LibTorch                    Build, Test    Build, Test  No library     No library       No library
TensorFlow                  Build, Test    Build, Test  No library     No library       No library

  • Unchecked (blank) doesn't mean that the framework is unsupported; it only means that the framework is not tested in CI. For instance, TensorRT on Windows/Linux works, and I confirmed it on my PC, but I can't run it in CI.
  • No library means that a pre-built library is not provided, so I cannot confirm it in CI. It may work if you build the library yourself.

Sample projects

Usage

Please refer to https://github.com/iwatake2222/InferenceHelper_Sample

Installation

  • Add this repository into your project (Using git submodule is recommended)
  • Download prebuilt libraries
    • sh third_party/download_prebuilt_libraries.sh

Additional steps

You need some additional steps if you use the frameworks listed below

Additional steps: OpenCV / OpenVINO

  • Install OpenCV or OpenVINO
    • You may need to set/modify the OpenCV_DIR and PATH environment variables
    • To use OpenVINO, you may need to run C:\Program Files (x86)\Intel\openvino_2021\bin\setupvars.bat (Windows) or source /opt/intel/openvino_2021/bin/setupvars.sh (Linux)

Additional steps: TensorRT

  • Install CUDA + cuDNN
  • Install TensorRT 8.x

Additional steps: Tensorflow Lite (EdgeTPU)

Additional steps: ncnn

  • Install Vulkan
    • You need Vulkan even if you don't use it, because the pre-built libraries require it. Otherwise, you need to build the libraries yourself with Vulkan disabled
    • https://vulkan.lunarg.com/sdk/home
    • Windows
    • Linux (x64)
      wget https://sdk.lunarg.com/sdk/download/latest/linux/vulkan-sdk.tar.gz
      tar xzvf vulkan-sdk.tar.gz
      export VULKAN_SDK=$(pwd)/1.2.198.1/x86_64
      sudo apt install -y vulkan-utils libvulkan1 libvulkan-dev

Additional steps: SNPE

Note:

  • Debug mode in Visual Studio doesn't work for ncnn, NNabla and LibTorch because debuggable libraries are not provided
    • Debug will cause unexpected behavior, so use Release or RelWithDebInfo
  • See third_party/download_prebuilt_libraries.sh and third_party/cmakes/* to check which libraries are being used. For instance, libraries without GPU (CUDA/Vulkan) support are used to be safe. If you want to use the GPU, modify these files.

Project settings in CMake

  • Add InferenceHelper to your project
    set(INFERENCE_HELPER_DIR ${CMAKE_CURRENT_LIST_DIR}/../../InferenceHelper/)
    add_subdirectory(${INFERENCE_HELPER_DIR}/inference_helper inference_helper)
    target_include_directories(${LibraryName} PUBLIC ${INFERENCE_HELPER_DIR}/inference_helper)
    target_link_libraries(${LibraryName} InferenceHelper)

CMake options

  • Deep learning framework:

    • You can enable multiple options at once, although each of the following examples enables just one
    # OpenCV (dnn), OpenVINO
    cmake .. -DINFERENCE_HELPER_ENABLE_OPENCV=on
    # Tensorflow Lite
    cmake .. -DINFERENCE_HELPER_ENABLE_TFLITE=on
    # Tensorflow Lite (XNNPACK)
    cmake .. -DINFERENCE_HELPER_ENABLE_TFLITE_DELEGATE_XNNPACK=on
    # Tensorflow Lite (GPU)
    cmake .. -DINFERENCE_HELPER_ENABLE_TFLITE_DELEGATE_GPU=on
    # Tensorflow Lite (EdgeTPU)
    cmake .. -DINFERENCE_HELPER_ENABLE_TFLITE_DELEGATE_EDGETPU=on
    # Tensorflow Lite (NNAPI)
    cmake .. -DINFERENCE_HELPER_ENABLE_TFLITE_DELEGATE_NNAPI=on
    # TensorRT
    cmake .. -DINFERENCE_HELPER_ENABLE_TENSORRT=on
    # ncnn, ncnn + vulkan
    cmake .. -DINFERENCE_HELPER_ENABLE_NCNN=on
    # MNN (+ Vulkan)
    cmake .. -DINFERENCE_HELPER_ENABLE_MNN=on
    # SNPE
    cmake .. -DINFERENCE_HELPER_ENABLE_SNPE=on
    # Arm NN
    cmake .. -DINFERENCE_HELPER_ENABLE_ARMNN=on
    # NNabla
    cmake .. -DINFERENCE_HELPER_ENABLE_NNABLA=on
    # NNabla with CUDA
    cmake .. -DINFERENCE_HELPER_ENABLE_NNABLA_CUDA=on
    # ONNX Runtime
    cmake .. -DINFERENCE_HELPER_ENABLE_ONNX_RUNTIME=on
    # ONNX Runtime with CUDA
    cmake .. -DINFERENCE_HELPER_ENABLE_ONNX_RUNTIME_CUDA=on
    # LibTorch
    cmake .. -DINFERENCE_HELPER_ENABLE_LIBTORCH=on
    # LibTorch with CUDA
    cmake .. -DINFERENCE_HELPER_ENABLE_LIBTORCH_CUDA=on
    # TensorFlow
    cmake .. -DINFERENCE_HELPER_ENABLE_TENSORFLOW=on
    # TensorFlow with GPU
    cmake .. -DINFERENCE_HELPER_ENABLE_TENSORFLOW_GPU=on
  • Enable/Disable preprocess using OpenCV:

    • By disabling this option, InferenceHelper is not dependent on OpenCV
    cmake .. -DINFERENCE_HELPER_ENABLE_PRE_PROCESS_BY_OPENCV=off

Structure

Class Diagram

APIs

InferenceHelper

Enumeration

typedef enum {
    kOpencv,
    kOpencvGpu,
    kTensorflowLite,
    kTensorflowLiteXnnpack,
    kTensorflowLiteGpu,
    kTensorflowLiteEdgetpu,
    kTensorflowLiteNnapi,
    kTensorrt,
    kNcnn,
    kNcnnVulkan,
    kMnn,
    kSnpe,
    kArmnn,
    kNnabla,
    kNnablaCuda,
    kOnnxRuntime,
    kOnnxRuntimeCuda,
    kLibtorch,
    kLibtorchCuda,
    kTensorflow,
    kTensorflowGpu,
} HelperType;

static InferenceHelper* Create(const HelperType helper_type)

  • Create InferenceHelper instance for the selected framework
std::unique_ptr<InferenceHelper> inference_helper(InferenceHelper::Create(InferenceHelper::kTensorflowLite));
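
The factory pattern behind Create can be sketched in isolation. This is a self-contained illustration, not the library's source: Backend, DummyTfliteBackend, CreateBackend, SketchHelperType and the SKETCH_ENABLE_TFLITE macro are all hypothetical names, and returning nullptr for a framework that was not enabled at build time is an assumption.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration only; not the InferenceHelper source.
// The macro plays the role of a CMake option such as INFERENCE_HELPER_ENABLE_TFLITE.
#define SKETCH_ENABLE_TFLITE 1

enum SketchHelperType { kSketchOpencv, kSketchTensorflowLite, kSketchTensorrt };

// Common interface that every framework-specific backend implements.
class Backend {
public:
    virtual ~Backend() = default;
    virtual std::string Name() const = 0;
};

class DummyTfliteBackend : public Backend {
public:
    std::string Name() const override { return "TensorFlow Lite"; }
};

// Returns a new backend for the requested type, or nullptr when that
// framework was not compiled in (assumed behavior).
Backend* CreateBackend(SketchHelperType type) {
    switch (type) {
#if SKETCH_ENABLE_TFLITE
    case kSketchTensorflowLite:
        return new DummyTfliteBackend();
#endif
    default:
        return nullptr;
    }
}
```

Under these assumptions, the caller wraps the raw pointer in std::unique_ptr, as in the line above, and checks it for nullptr before use.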

static void PreProcessByOpenCV(const InputTensorInfo& input_tensor_info, bool is_nchw, cv::Mat& img_blob)

  • Run preprocess (convert image to blob(NCHW or NHWC))
  • This is just a helper function; you don't have to use it.
    • Available when INFERENCE_HELPER_ENABLE_PRE_PROCESS_BY_OPENCV=on
InferenceHelper::PreProcessByOpenCV(input_tensor_info, false, img_blob);
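
To make the difference between the two blob layouts concrete, here is a minimal sketch that reorders a single interleaved image (HWC) into planar form (CHW). HwcToChw is a hypothetical helper written for this illustration; it is not part of InferenceHelper or OpenCV, and it ignores the batch dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reorders one interleaved image (HWC: pixel by pixel) into planar layout
// (CHW: channel by channel). Hypothetical helper, not part of InferenceHelper.
std::vector<float> HwcToChw(const std::vector<float>& src, int height, int width, int channels) {
    assert(src.size() == static_cast<std::size_t>(height) * width * channels);
    std::vector<float> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            for (int c = 0; c < channels; ++c) {
                // Index of the same element in each layout.
                const std::size_t hwc = (static_cast<std::size_t>(y) * width + x) * channels + c;
                const std::size_t chw = (static_cast<std::size_t>(c) * height + y) * width + x;
                dst[chw] = src[hwc];
            }
        }
    }
    return dst;
}
```

For a 1x2 image with three channels, the interleaved values 1 2 3 4 5 6 become 1 4 2 5 3 6: the first channel of both pixels, then the second, then the third.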

int32_t SetNumThreads(const int32_t num_threads)

  • Set the number of threads to be used
  • This function needs to be called before Initialize
inference_helper->SetNumThreads(4);

int32_t SetCustomOps(const std::vector<std::pair<const char*, const void*>>& custom_ops)

  • Set custom ops
  • This function needs to be called before Initialize
std::vector<std::pair<const char*, const void*>> custom_ops;
custom_ops.push_back(std::pair<const char*, const void*>("Convolution2DTransposeBias", (const void*)mediapipe::tflite_operations::RegisterConvolution2DTransposeBias()));
inference_helper->SetCustomOps(custom_ops);

int32_t Initialize(const std::string& model_filename, std::vector<InputTensorInfo>& input_tensor_info_list, std::vector<OutputTensorInfo>& output_tensor_info_list)

  • Initialize inference helper
    • Load model
    • Set tensor information
std::vector<InputTensorInfo> input_tensor_list;
InputTensorInfo input_tensor_info("input", TensorInfo::kTensorTypeFp32, false);    /* name, tensor_type, is_nchw (false: NHWC) */
input_tensor_info.tensor_dims = { 1, 224, 224, 3 };
input_tensor_info.data_type = InputTensorInfo::kDataTypeImage;
input_tensor_info.data = img_src.data;
input_tensor_info.image_info.width = img_src.cols;
input_tensor_info.image_info.height = img_src.rows;
input_tensor_info.image_info.channel = img_src.channels();
input_tensor_info.image_info.crop_x = 0;
input_tensor_info.image_info.crop_y = 0;
input_tensor_info.image_info.crop_width = img_src.cols;
input_tensor_info.image_info.crop_height = img_src.rows;
input_tensor_info.image_info.is_bgr = false;
input_tensor_info.image_info.swap_color = false;
input_tensor_info.normalize.mean[0] = 0.485f;   /* https://github.com/onnx/models/tree/master/vision/classification/mobilenet#preprocessing */
input_tensor_info.normalize.mean[1] = 0.456f;
input_tensor_info.normalize.mean[2] = 0.406f;
input_tensor_info.normalize.norm[0] = 0.229f;
input_tensor_info.normalize.norm[1] = 0.224f;
input_tensor_info.normalize.norm[2] = 0.225f;
input_tensor_list.push_back(input_tensor_info);

std::vector<OutputTensorInfo> output_tensor_list;
output_tensor_list.push_back(OutputTensorInfo("MobilenetV2/Predictions/Reshape_1", TensorInfo::kTensorTypeFp32));

inference_helper->Initialize("mobilenet_v2_1.0_224.tflite", input_tensor_list, output_tensor_list);

int32_t Finalize(void)

  • Finalize inference helper
inference_helper->Finalize();

int32_t PreProcess(const std::vector<InputTensorInfo>& input_tensor_info_list)

  • Run preprocess
  • Call this function before Process
  • Call this function even if the input data is already pre-processed, because it also copies the data to memory
  • Note: Some frameworks don't support crop or resize, so it's better to resize the image before calling PreProcess.
inference_helper->PreProcess(input_tensor_list);

int32_t Process(std::vector<OutputTensorInfo>& output_tensor_info_list)

  • Run inference
inference_helper->Process(output_tensor_list);
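
The call order described in this section (SetNumThreads before Initialize, PreProcess before Process, Finalize at the end) can be summarized with a small mock. LifecycleMock is a hypothetical class that only tracks state and runs no model. Its return codes, and the choice to return an error on an out-of-order call, are assumptions made for illustration; the text above does not say how InferenceHelper itself reacts to misordered calls.

```cpp
#include <cstdint>

// Hypothetical mock that only tracks call order; it runs no model.
class LifecycleMock {
public:
    enum : int32_t { kOk = 0, kErr = -1 };  // assumed return codes

    int32_t SetNumThreads(int32_t num_threads) {
        if (initialized_ || num_threads <= 0) return kErr;  // must precede Initialize
        num_threads_ = num_threads;
        return kOk;
    }
    int32_t Initialize() {
        if (initialized_) return kErr;
        initialized_ = true;
        return kOk;
    }
    int32_t PreProcess() {
        if (!initialized_) return kErr;
        input_ready_ = true;
        return kOk;
    }
    int32_t Process() {
        if (!initialized_ || !input_ready_) return kErr;  // PreProcess must come first
        input_ready_ = false;  // each Process consumes one prepared input
        return kOk;
    }
    int32_t Finalize() {
        if (!initialized_) return kErr;
        initialized_ = false;
        input_ready_ = false;
        return kOk;
    }
    int32_t NumThreads() const { return num_threads_; }

private:
    bool initialized_ = false;
    bool input_ready_ = false;
    int32_t num_threads_ = 1;
};
```

In a per-frame loop, the mock would see PreProcess followed by Process once for each frame, with Initialize and Finalize called once around the loop.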

TensorInfo (InputTensorInfo, OutputTensorInfo)

Enumeration

enum {
    kTensorTypeNone,
    kTensorTypeUint8,
    kTensorTypeInt8,
    kTensorTypeFp32,
    kTensorTypeInt32,
    kTensorTypeInt64,
};

Properties

std::string name;           // [In] Set the name of the tensor
int32_t     id;             // [Out] Do not modify (used in InferenceHelper)
int32_t     tensor_type;    // [In] The type of the tensor (e.g. kTensorTypeFp32)
std::vector<int32_t> tensor_dims;    // InputTensorInfo:   [In] The dimensions of the tensor. (If empty at initialize, the size is updated from model info.)
                                     // OutputTensorInfo: [Out] The dimensions of the tensor are set from model information
bool        is_nchw;        // [In] NCHW or NHWC (true: NCHW, false: NHWC)
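
How tensor_dims and is_nchw fit together can be sketched as follows. Shape4, ParseDims and ElementCount are hypothetical helpers written for this illustration, not members of TensorInfo, and the sketch assumes a 4-D tensor.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical view of a 4-D shape; not a type from InferenceHelper.
struct Shape4 {
    int32_t batch;
    int32_t channel;
    int32_t height;
    int32_t width;
};

// Interprets a 4-element dims vector as NCHW or NHWC.
// Returns false when dims does not have exactly four entries.
bool ParseDims(const std::vector<int32_t>& dims, bool is_nchw, Shape4& out) {
    if (dims.size() != 4) return false;
    if (is_nchw) {
        out = { dims[0], dims[1], dims[2], dims[3] };
    } else {
        out = { dims[0], dims[3], dims[1], dims[2] };
    }
    return true;
}

// Total number of elements, i.e. the product of all dimensions.
int64_t ElementCount(const std::vector<int32_t>& dims) {
    int64_t count = 1;
    for (int32_t d : dims) count *= d;
    return count;
}
```

With the dims { 1, 224, 224, 3 } and is_nchw = false from the Initialize example, this gives height 224, width 224, channel 3, and 150528 elements in total.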

InputTensorInfo

Enumeration

enum {
    kDataTypeImage,
    kDataTypeBlobNhwc,  // data which has already been preprocessed (color conversion, resize, normalization, etc.)
    kDataTypeBlobNchw,
};

Properties

void*   data;      // [In] Set the pointer to image/blob
int32_t data_type; // [In] Set the type of data (e.g. kDataTypeImage)

struct {
    int32_t width;
    int32_t height;
    int32_t channel;
    int32_t crop_x;
    int32_t crop_y;
    int32_t crop_width;
    int32_t crop_height;
    bool    is_bgr;        // used when channel == 3 (true: BGR, false: RGB)
    bool    swap_color;
} image_info;              // [In] used when data_type == kDataTypeImage

struct {
    float mean[3];
    float norm[3];
} normalize;              // [In] used when data_type == kDataTypeImage
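
The mean and norm values in the Initialize example (0.485 / 0.229 and so on) follow the common ImageNet convention, in which each 8-bit pixel is scaled to [0, 1], then shifted by the mean and divided by norm. The sketch below applies that formula to interleaved 3-channel data. NormalizeParams and NormalizeImage are hypothetical names, and the exact formula is an assumption about what the normalize struct is meant to express.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the normalize struct above.
struct NormalizeParams {
    float mean[3];
    float norm[3];
};

// Normalizes interleaved 3-channel uint8 pixels to float using
// (pixel / 255 - mean[c]) / norm[c]. The formula is an assumed convention.
std::vector<float> NormalizeImage(const std::vector<uint8_t>& pixels, const NormalizeParams& p) {
    std::vector<float> out(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        const std::size_t c = i % 3;  // channel index within the interleaved pixel
        out[i] = (pixels[i] / 255.0f - p.mean[c]) / p.norm[c];
    }
    return out;
}
```

With mean and norm both set to 0.5 for every channel, a pixel value of 255 maps to 1 and a pixel value of 0 maps to -1, which is the familiar [-1, 1] input range.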

OutputTensorInfo

Properties

void* data;     // [Out] Pointer to the output data
struct {
    float   scale;
    uint8_t zero_point;
} quant;        // [Out] Parameters for dequantization (convert uint8 to float)

float* GetDataAsFloat()

  • Get output data in the form of FP32
  • When tensor type is INT8 (quantized), the data is converted to FP32 (dequantized)
const float* val_float = output_tensor_list[0].GetDataAsFloat();
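
The dequantization step can be sketched with the usual affine mapping, real = scale * (quantized - zero_point), using the scale and zero_point fields of the quant struct. Dequantize is a hypothetical free function written for this illustration; it is not the implementation of GetDataAsFloat.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts quantized uint8 values to float with the affine mapping
// real = scale * (quantized - zero_point). Hypothetical helper.
std::vector<float> Dequantize(const std::vector<uint8_t>& q, float scale, uint8_t zero_point) {
    std::vector<float> out(q.size());
    for (std::size_t i = 0; i < q.size(); ++i) {
        // Widen to a signed type first so the subtraction can go negative.
        out[i] = scale * (static_cast<int32_t>(q[i]) - static_cast<int32_t>(zero_point));
    }
    return out;
}
```

For example, with scale = 0.5 and zero_point = 128, the quantized values 0, 128 and 255 map to -64, 0 and 63.5.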

License

Acknowledgements

  • This project utilizes OSS (Open Source Software)