TensorflowLite-bin

Prebuilt binaries of TensorflowLite's standalone installer, tuned for fast MultiThread inference on RaspberryPi.
Here is Tensorflow's official README.

If you want the best performance with RaspberryPi4/3, install Ubuntu 18.04+ aarch64 (64bit) instead of Raspbian armv7l (32bit). The official Tensorflow Lite is performance tuned for aarch64. On aarch64 OS, performance is about 4 times higher than on armv7l OS.
How to install Ubuntu 19.10 aarch64 (64bit) on RaspberryPi4 - Qiita - PINTO
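
To see which of the two cases applies to a running board, the machine string reported by the OS can be inspected. The following is only a minimal sketch: the helper name `describe_arch` is hypothetical, and the mapping covers just the architectures named in this README.

```python
import platform

def describe_arch(machine):
    """Map an OS machine string to the architecture names used in this README."""
    machine = machine.lower()
    if machine in ("aarch64", "arm64"):
        return "aarch64 (64bit)"
    if machine.startswith("armv7"):
        return "armv7l (32bit)"
    # Anything else (x86_64, armv6l, ...) is outside the scope of this README.
    return "other: " + machine

if __name__ == "__main__":
    # platform.machine() returns the same value as `uname -m`.
    print(describe_arch(platform.machine()))
```

Note that this reports the kernel architecture. On a 64bit kernel with a 32bit userland the Python interpreter itself may still be 32bit, so treat the result as a hint only.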

The full build package for Tensorflow can be found here (Tensorflow-bin).

TensorFlow Lite will continue to have TensorFlow Lite builtin ops optimized for mobile and embedded devices. However, TensorFlow Lite models can now use a subset of TensorFlow ops when TFLite builtin ops are not sufficient.
1. TensorflowLite-flexdelegate (Tensorflow Select Ops) - Github - PINTO0309
2. Select TensorFlow operators to use in TensorFlow Lite

A repository that shares the tuning results of trained models generated by Tensorflow, covering post-training quantization (Weight Quantization, Integer Quantization, Full Integer Quantization) and quantization-aware training.
PINTO_model_zoo - Github - PINTO0309

Reference articles

My article. Tensorflow Lite v1.14.0 / v1.15.0-rc0 armhf (armv7l) is tuned for MultiThread acceleration and cross-compiled for RaspberryPi on Ubuntu
For performance details, see "Post-training quantization with TF2.0 Keras - nb.o’s Diary". The performance evaluation article was written by @Nextremer_nb_o / Github. Thank you.
[Japanese ver.] [Tensorflow Lite] Various Neural Network Model quantization methods for Tensorflow Lite (Weight Quantization, Integer Quantization, Full Integer Quantization, Float16 Quantization, EdgeTPU). As of May 05, 2020.
[English ver.] [Tensorflow Lite] Various Neural Network Model quantization methods for Tensorflow Lite (Weight Quantization, Integer Quantization, Full Integer Quantization, Float16 Quantization, EdgeTPU). As of May 05, 2020.

Python API packages

Device	OS	Distribution	Architecture	Python ver	Note
RaspberryPi3/4	Raspbian/Debian	Stretch	armhf / armv7l	3.5	32bit
RaspberryPi3/4	Raspbian/Debian	Buster	armhf / armv7l	3.7 / 2.7	32bit
RaspberryPi3/4	Raspbian/Debian	Stretch	aarch64 / armv8	3.5	64bit
RaspberryPi3/4	Raspbian/Debian	Buster	aarch64 / armv8	3.7 / 2.7	64bit
RaspberryPi3/4	Ubuntu 18.04	Bionic	armhf / armv7l	3.6	32bit
RaspberryPi3/4	Ubuntu 18.04	Bionic	aarch64 / armv8	3.6	64bit

Usage

Python3.5 - Stretch

$ sudo apt install swig libjpeg-dev zlib1g-dev python3-dev python3-numpy \
                   unzip wget python3-pip curl git cmake make
$ wget https://github.com/PINTO0309/TensorflowLite-bin/raw/master/2.2.0/tflite_runtime-2.2.0-cp35-cp35m-linux_armv7l.whl
$ sudo pip3 install --upgrade tflite_runtime-2.2.0-cp35-cp35m-linux_armv7l.whl

Python3.7 - Buster

$ sudo apt install swig libjpeg-dev zlib1g-dev python3-dev python3-numpy \
                   unzip wget python3-pip curl git cmake make
$ wget https://github.com/PINTO0309/TensorflowLite-bin/raw/master/2.2.0/tflite_runtime-2.2.0-cp37-cp37m-linux_armv7l.whl
$ sudo pip3 install --upgrade tflite_runtime-2.2.0-cp37-cp37m-linux_armv7l.whl
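
The two wheel file names above follow the usual pattern of package, version, Python tag, ABI tag and platform. As a rough sketch, the name for the current interpreter could be derived as below. The function `wheel_name` is hypothetical and Python 3 only, and whether a wheel with the resulting name actually exists in this repository is not guaranteed, so check the release directory before downloading.

```python
import platform
import sys

def wheel_name(version="2.2.0", py=None, machine=None):
    """Build a tflite_runtime wheel file name in the style used above."""
    py = tuple(py or sys.version_info[:2])
    machine = machine or platform.machine()
    tag = "cp{}{}".format(py[0], py[1])
    # CPython 3.7 and earlier append "m" to the ABI tag; 3.8+ does not.
    abi = tag + ("m" if py <= (3, 7) else "")
    return "tflite_runtime-{}-{}-{}-linux_{}.whl".format(version, tag, abi, machine)

if __name__ == "__main__":
    print(wheel_name())
```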

Note

Unlike the full tensorflow package, this wheel is installed into the tflite_runtime namespace.
You can then use the Tensorflow Lite interpreter as follows.

from tflite_runtime.interpreter import Interpreter
# Use only one of the two calls below, depending on the installed version.
### Tensorflow v2.2.0
interpreter = Interpreter(model_path="foo.tflite")
### Tensorflow v2.3.0+
interpreter = Interpreter(model_path="foo.tflite", num_threads=4)
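
If one script has to run against both versions, the constructor arguments can be chosen from the installed version. This is a small sketch under assumptions: `interpreter_kwargs` is a hypothetical helper, and the caller is expected to know the installed version and pass it in as a tuple.

```python
def interpreter_kwargs(model_path, num_threads, tf_version):
    """Return Interpreter keyword arguments suited to the given version tuple."""
    kwargs = {"model_path": model_path}
    # Following the note above, num_threads is only passed for v2.3.0+.
    if tuple(tf_version) >= (2, 3):
        kwargs["num_threads"] = num_threads
    return kwargs

# Usage (requires tflite_runtime and a model file):
#   from tflite_runtime.interpreter import Interpreter
#   interpreter = Interpreter(**interpreter_kwargs("foo.tflite", 4, (2, 3, 0)))
```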

Build parameter

Tensorflow v2.2.0 or earlier

cd tensorflow/tensorflow/lite/tools/pip_package
make BASE_IMAGE=debian:stretch PYTHON=python3 TENSORFLOW_TARGET=rpi BUILD_DEB=y docker-build
make BASE_IMAGE=debian:buster PYTHON=python3 TENSORFLOW_TARGET=rpi BUILD_DEB=y docker-build
make BASE_IMAGE=debian:stretch PYTHON=python3 TENSORFLOW_TARGET=aarch64 BUILD_DEB=y docker-build
make BASE_IMAGE=debian:buster PYTHON=python3 TENSORFLOW_TARGET=aarch64 BUILD_DEB=y docker-build
make BASE_IMAGE=ubuntu:18.04 PYTHON=python3 TENSORFLOW_TARGET=aarch64 BUILD_DEB=y docker-build
make BASE_IMAGE=ubuntu:18.04 PYTHON=python3 TENSORFLOW_TARGET=rpi BUILD_DEB=y docker-build

Tensorflow v2.3.0-rc0 or later

git clone -b v2.3.0-rc0 https://github.com/tensorflow/tensorflow.git
cd tensorflow

sudo CI_DOCKER_EXTRA_PARAMS="-e CUSTOM_BAZEL_FLAGS=--define=tflite_pip_with_flex=true \
  -e CI_BUILD_PYTHON=python3 -e CROSSTOOL_PYTHON_INCLUDE_PATH=/usr/include/python3.7" \
  tensorflow/tools/ci_build/ci_build.sh PI-PYTHON37 \
  tensorflow/lite/tools/pip_package/build_pip_package_with_bazel.sh aarch64

sudo CI_DOCKER_EXTRA_PARAMS="-e CUSTOM_BAZEL_FLAGS=--define=tflite_pip_with_flex=true \
  -e CI_BUILD_PYTHON=python3 -e CROSSTOOL_PYTHON_INCLUDE_PATH=/usr/include/python3.7" \
  tensorflow/tools/ci_build/ci_build.sh PI-PYTHON37 \
  tensorflow/lite/tools/pip_package/build_pip_package_with_bazel.sh armhf

sudo CI_DOCKER_EXTRA_PARAMS="-e CUSTOM_BAZEL_FLAGS=--define=tflite_pip_with_flex=true \
  -e CI_BUILD_PYTHON=python3 -e CROSSTOOL_PYTHON_INCLUDE_PATH=/usr/include/python3.5" \
  tensorflow/tools/ci_build/ci_build.sh PI-PYTHON3 \
  tensorflow/lite/tools/pip_package/build_pip_package_with_bazel.sh aarch64

sudo CI_DOCKER_EXTRA_PARAMS="-e CUSTOM_BAZEL_FLAGS=--define=tflite_pip_with_flex=true \
  -e CI_BUILD_PYTHON=python3 -e CROSSTOOL_PYTHON_INCLUDE_PATH=/usr/include/python3.5" \
  tensorflow/tools/ci_build/ci_build.sh PI-PYTHON3 \
  tensorflow/lite/tools/pip_package/build_pip_package_with_bazel.sh armhf

Operation check 【Classification】

Sample of MultiThread x4 by Tensorflow Lite [MobileNetV1 / 75ms]

Sample of MultiThread x4 by Tensorflow Lite [MobileNetV2 / 68ms]

Environmental preparation

$ cd ~;mkdir test
$ curl https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp > ~/test/grace_hopper.bmp
$ curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz | tar xzv -C ~/test mobilenet_v1_1.0_224/labels.txt
$ mv ~/test/mobilenet_v1_1.0_224/labels.txt ~/test/
$ curl http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224_quant.tgz | tar xzv -C ~/test
$ cd ~/test

label_image.py

import argparse
import numpy as np
import time

from PIL import Image

from tflite_runtime.interpreter import Interpreter

def load_labels(filename):
  # Read one label per line; the with-block closes the file afterwards.
  with open(filename, 'r') as input_file:
    return [line.strip() for line in input_file]

if __name__ == "__main__":
  floating_model = False
  parser = argparse.ArgumentParser()
  parser.add_argument("-i", "--image", default="/tmp/grace_hopper.bmp", \
    help="image to be classified")
  parser.add_argument("-m", "--model_file", \
    default="/tmp/mobilenet_v1_1.0_224_quant.tflite", \
    help=".tflite model to be executed")
  parser.add_argument("-l", "--label_file", default="/tmp/labels.txt", \
    help="name of file containing labels")
  parser.add_argument("--input_mean", default=127.5, type=float, help="input_mean")
  parser.add_argument("--input_std", default=127.5, type=float, \
    help="input standard deviation")
  parser.add_argument("--num_threads", default=1, type=int, help="number of threads")
  args = parser.parse_args()

  ### Tensorflow v2.2.0 or earlier
  interpreter = Interpreter(model_path=args.model_file)
  ### Tensorflow v2.3.0+
  #interpreter = Interpreter(model_path=args.model_file, num_threads=int(args.num_threads))
  interpreter.allocate_tensors()
  input_details = interpreter.get_input_details()
  output_details = interpreter.get_output_details()
  # check the type of the input tensor
  if input_details[0]['dtype'] == np.float32:
    floating_model = True
  # NxHxWxC, H:1, W:2
  height = input_details[0]['shape'][1]
  width = input_details[0]['shape'][2]
  img = Image.open(args.image)
  img = img.resize((width, height))
  # add N dim
  input_data = np.expand_dims(img, axis=0)
  if floating_model:
    input_data = (np.float32(input_data) - args.input_mean) / args.input_std

  ### Tensorflow v2.2.0 or earlier
  interpreter.set_num_threads(int(args.num_threads)) #<- Specifies the num of threads assigned to inference
  ### Tensorflow v2.3.0+
  # num_threads can be passed to the Interpreter constructor instead (see above)
  interpreter.set_tensor(input_details[0]['index'], input_data)

  start_time = time.time()
  interpreter.invoke()
  stop_time = time.time()

  output_data = interpreter.get_tensor(output_details[0]['index'])
  results = np.squeeze(output_data)
  top_k = results.argsort()[-5:][::-1]
  labels = load_labels(args.label_file)
  for i in top_k:
    if floating_model:
      print('{0:08.6f}'.format(float(results[i]))+":", labels[i])
    else:
      print('{0:08.6f}'.format(float(results[i]/255.0))+":", labels[i])

  print("time: ", stop_time - start_time)
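
The last part of the script picks the five highest scores and, for a quantized model, divides the uint8 values by 255 to bring them into the 0 to 1 range. The same idea can be sketched with the standard library alone. `top_k_scores` is a hypothetical helper and the input values are made up for illustration. Tie ordering may differ from the numpy `argsort` used in the script.

```python
def top_k_scores(raw_scores, k=5, quantized=True):
    """Return (index, score) pairs for the k largest entries, best first."""
    # Quantized (uint8) outputs are rescaled to 0..1; float outputs are kept.
    scale = 255.0 if quantized else 1.0
    order = sorted(range(len(raw_scores)),
                   key=lambda i: raw_scores[i], reverse=True)
    return [(i, raw_scores[i] / scale) for i in order[:k]]

if __name__ == "__main__":
    raw = [0, 255, 51, 102]  # made-up uint8 outputs for four classes
    for index, score in top_k_scores(raw, k=2):
        print('{0:08.6f}'.format(score) + ":", index)
```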

Inference test

$ python3 label_image.py \
--num_threads 4 \
--image grace_hopper.bmp \
--model_file mobilenet_v1_1.0_224_quant.tflite \
--label_file labels.txt
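
A single timed `invoke()` can be noisy, because the first call often includes one-off setup cost. A hedged sketch of a steadier measurement is to run a warm-up pass and then average several runs. `average_latency` is a hypothetical helper, not part of tflite_runtime. It accepts any callable, for example `interpreter.invoke`.

```python
import time

def average_latency(run_once, warmup=1, runs=10):
    """Call run_once warmup times untimed, then return mean seconds over runs."""
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(runs):
        run_once()
    return (time.perf_counter() - start) / runs

# Usage inside label_image.py, after set_tensor():
#   print("avg time: ", average_latency(interpreter.invoke, warmup=1, runs=10))
```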

Operation check 【ObjectDetection】

Sample of MultiThread x4 by Tensorflow Lite + Raspbian Buster (armhf) + RaspberryPi3 [MobileNetV2-SSD / 160ms]

Sample of MultiThread x4 by Tensorflow Lite + Ubuntu18.04 (aarch64) + RaspberryPi3 [MobileNetV2-SSD / 140ms]

Inference test

$ python3 mobilenetv2ssd.py

MobileNetV2-SSD (UINT8) + Corei7 CPU only + USB Camera + 10 Threads + Async

MobileNetV2-SSDLite (UINT8) + RaspberryPi4 CPU only + USB Camera 640x480 + 4 Threads + Sync + Disp 1080p

List of quantized models

https://www.tensorflow.org/lite/guide/hosted_models

Other MobileNetV1 weight files

https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md

Other MobileNetV2 weight files

https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/README.md

Reference

tflite only python package PINTO0309/Tensorflow-bin#15
Incorrect predictions of Mobilenet_V2 tensorflow/tensorflow#31229 (comment)

fangxiaoying/TensorflowLite-bin