
Tengine is a lightweight, high-performance, modular inference engine for embedded devices.


Tengine Overview


Tengine, developed by OPEN AI LAB, is an AI application development platform for AIoT scenarios. It is dedicated to solving the fragmentation problem of the AIoT industry chain and to accelerating the industrialization of AI. Designed specifically for AIoT, Tengine offers cross-platform support, heterogeneous scheduling, chip-level acceleration, an ultra-lightweight and self-contained core, and a complete development and deployment tool chain. It is compatible with a variety of operating systems and deep learning frameworks, which simplifies and accelerates both the migration of scene-oriented AI algorithms to embedded edge devices and their deployment in real applications.

Tengine is composed of five modules: core/operator/serializer/executor/driver.

  • core provides the basic components and functionalities of the system.
  • operator defines the schema of operators, such as convolution, relu, pooling, etc. See the current list of supported operators.
  • serializer loads a saved model. The serializer framework is extensible to support different formats, including customized ones. Caffe/ONNX/TensorFlow/MXNet and Tengine models can be loaded directly by Tengine.
  • executor implements the code that runs graphs and operators. The current version provides a highly optimized implementation for multiple Cortex-A72 cores.
  • driver is the adapter for real hardware and provides service to the device executor through the HAL API. A single driver can create multiple devices.

Build and Install

Please refer to the Wiki.

Tengine examples and model zoo

Please visit the examples for classification/detection demos, and download models from the Tengine model zoo (psw: hhgc).

Tengine applications is a project for sharing Android/Linux applications powered by Tengine.

Communication & Tech Support

Benchmark

Tested on RK3399 (single Cortex-A72 core)

Model            fp32      int8-hybrid  int8-e2e
Squeezenet v1.1  55.3 ms   48.6 ms      44.6 ms
Mobilenet v1     108.7 ms  74.6 ms      64.2 ms

More benchmark data will be added.

Roadmap

Updated May 2020

Features
  • More examples
  • Web-based model conversion tool
  • CV API
  • Support for more ONNX (PyTorch) operators
Optimization
  • Armv8.2 fp16 optimized operators