tensorflow/tensorflow

Tensorflow Lite, python API does not work

Mykheievskyi opened this issue · 54 comments

System information

  • TensorFlow version: 1.9.0
  • Python version: 3.5

Describe the problem

I am trying to run a TFLite model file with the Python API (as in the example:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/toco/g3doc/python_api.md), but I get an error:
ImportError: /home/pi/.local/lib/python3.5/site-packages/tensorflow/contrib/lite/python/interpreter_wrapper/_tensorflow_wrap_interpreter_wrapper.so: undefined symbol: _ZN6tflite12tensor_utils39NeonMatrixBatchVectorMultiplyAccumulateEPKaiiS2_PKfiPfi

Source code / logs

My code:

import tensorflow as tf

if __name__ == "__main__":
 
   # Load TFLite model and allocate tensors.
   interpreter = tf.contrib.lite.Interpreter(model_path="./mobilenet_v1_0.25_128_quant.tflite")
 
   interpreter.allocate_tensors()
 
   #Get input and output tensors.
   input_details = interpreter.get_input_details()
   output_details = interpreter.get_output_details()

   print(input_details)
   print(output_details)

Log output:

Traceback (most recent call last):
  File "tflite_test.py", line 12, in <module>
    interpreter = tf.contrib.lite.Interpreter(model_path="/home/pi/test/mobilenet_v1_0.25_128_quant/mobilenet_v1_0.25_128_quant.tflite")
  File "/home/pi/.local/lib/python3.5/site-packages/tensorflow/contrib/lite/python/interpreter.py", line 50, in __init__
    _interpreter_wrapper.InterpreterWrapper_CreateWrapperCPPFromFile(
  File "/home/pi/.local/lib/python3.5/site-packages/tensorflow/python/util/lazy_loader.py", line 53, in __getattr__
    module = self._load()
  File "/home/pi/.local/lib/python3.5/site-packages/tensorflow/python/util/lazy_loader.py", line 42, in _load
    module = importlib.import_module(self.__name__)
  File "/usr/lib/python3.5/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 986, in _gcd_import
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 673, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "/home/pi/.local/lib/python3.5/site-packages/tensorflow/contrib/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 28, in <module>
    _tensorflow_wrap_interpreter_wrapper = swig_import_helper()
  File "/home/pi/.local/lib/python3.5/site-packages/tensorflow/contrib/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 24, in swig_import_helper
    _mod = imp.load_module('_tensorflow_wrap_interpreter_wrapper', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
  File "<frozen importlib._bootstrap>", line 693, in _load
  File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 577, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 914, in create_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
ImportError: /home/pi/.local/lib/python3.5/site-packages/tensorflow/contrib/lite/python/interpreter_wrapper/_tensorflow_wrap_interpreter_wrapper.so: undefined symbol: _ZN6tflite12tensor_utils39NeonMatrixBatchVectorMultiplyAccumulateEPKaiiS2_PKfiPfi

Could you provide more info:

What platform is this? It looks like a Raspberry Pi? What version? What board? What Raspbian version?

I ran into the same problem w/ either Python 2.7 or 3.5 on RPI 3 B running latest Raspbian. See my comment #21109 (comment)

@aselle, @petewarden, I tried running this code on a Raspberry Pi 3 board with Raspbian 9.0 (armv7l architecture). I also tried on an Artik 710s board with Ubuntu 16.04.5 (aarch64 architecture) and got the same error.

I'm having the same problem with Raspbian 9.0, Tensorflow 1.9.0 on a Raspberry Pi 3, with Python 2.7.3 and 3.5.3

I managed to build a pip package from the master branch on a Raspberry Pi 3 B running Raspbian and ran the label_image example using the Python TF Lite binding without problems. So this should be some kind of compilation-flag problem when these pip packages were built.

From the error message we ran into,

> c++filt _ZN6tflite12tensor_utils39NeonMatrixBatchVectorMultiplyAccumulateEPKfiiS2_iPfi
tflite::tensor_utils::NeonMatrixBatchVectorMultiplyAccumulate(float const*, int, int, float const*, int, float*, int)

It seems that

tflite::tensor_utils::NeonMatrixBatchVectorMultiplyAccumulate(float const*, int, int, float const*, int, float*, int)

which is located in tensorflow/contrib/lite/kernels/internal/optimized/neon_tensor_utils.cc, was not compiled in because USE_NEON is not defined. Or it simply was not linked into the .so?
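
For anyone who wants to check their own wheel, here is a minimal diagnostic sketch (assuming nm from binutils is installed; the .so path is the one from the traceback above, so adjust it for your install). A 'U' entry means the symbol is needed but not provided, which is the failure mode seen here; a 'T' entry means it is actually defined.

import subprocess

# Path taken from the traceback above -- adjust for your installation.
SO = ("/home/pi/.local/lib/python3.5/site-packages/tensorflow/contrib/lite/"
      "python/interpreter_wrapper/_tensorflow_wrap_interpreter_wrapper.so")

# 'nm -D' dumps the dynamic symbol table of the shared object.
out = subprocess.check_output(["nm", "-D", SO]).decode()
for line in out.splitlines():
    if "NeonMatrixBatchVectorMultiplyAccumulate" in line:
        print(line)  # 'U' = undefined (our error), 'T' = defined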

I'm facing the same issue too. I'm using a Raspberry Pi 3 Model B, Raspbian 9.0, TensorFlow 1.9.0, Python 3.5.3. @freedomtan how did you build a pip package?

@himanshurawlani

  1. patience, it's painfully slow to build it natively on an RPI 3
  2. prepare the requirements as described in TensorFlow's doc
  3. build and install bazel from source; you may want to prepare paging space and increase the maximum size of the memory allocation pool before building bazel, as described in this doc
  4. ./configure
  5. build it with bazel; note that I can build it successfully without modifying the source, using the following command:
bazel build --config opt --local_resources 1024.0,0.5,0.5 \
--copt=-mfpu=neon-vfpv4 \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-fomit-frame-pointer \
--copt=-DRASPBERRY_PI \
--host_copt=-DRASPBERRY_PI \
//tensorflow/tools/pip_package:build_pip_package
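
If the build succeeds, a quick sanity check (a sketch; the model path is a placeholder for any .tflite file you have) is to confirm that the interpreter wrapper now imports:

import tensorflow as tf

# Placeholder model path -- substitute any .tflite file.
interpreter = tf.contrib.lite.Interpreter(model_path="mobilenet_v1_0.25_128_quant.tflite")
interpreter.allocate_tensors()
print("interpreter wrapper loaded OK")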

I had the same problem.
sudo pip install --upgrade "tensorflow==1.10.*" solved it for me.
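
If the upgrade does not seem to take effect, it may be worth checking which install actually gets imported (a sketch; a stale wheel, e.g. under ~/.local, can shadow the new one):

import tensorflow as tf

# Confirm the version and the location of the package being imported.
print(tf.__version__)  # should now report 1.10.x
print(tf.__file__)     # watch for a stale copy under ~/.local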

While I was working on updating #16175, it occurred to me that, for the original problem, most likely //tensorflow/contrib/lite/kernels/internal:neon_tensor_utils was not linked in.

Did using the nightly or 1.10 solve your problem @Mykheievskyi?

It has been 14 days with no activity and the awaiting response label was assigned. Is this still an issue?

Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!

I have the same issue with Raspbian 9.0, TensorFlow 1.9.0, Python 3.5.3 on a Raspberry Pi 3B. I tried to solve it by installing TensorFlow 1.10.1, but even after doing so it still shows the same problem. Did anyone solve it?

I was able to resolve this issue by building TF 1.10 from source natively on the Raspberry Pi.

Python version 3.5.3
The build command is below.
bazel build --config opt --local_resources 1024.0,0.5,0.5
--copt=-mfpu=neon-vfpv4
--copt=-ftree-vectorize
--copt=-funsafe-math-optimizations
--copt=-ftree-loop-vectorize
--copt=-fomit-frame-pointer
--copt=-DRASPBERRY_PI
--copt=-D__ARM_NEON__
--copt=-D__ARM_NEON
--host_copt=-DRASPBERRY_PI
//tensorflow/tools/pip_package:build_pip_package

But now I am facing an assertion error:

/usr/lib/python3/dist-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "usage_test.py", line 5, in <module>
    interpreter = tf.contrib.lite.Interpreter(model_path="mobilenet_v1_1.0_224.tflite")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/lazy_loader.py", line 53, in __getattr__
    module = self._load()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/lazy_loader.py", line 42, in _load
    module = importlib.import_module(self.__name__)
  File "/usr/lib/python3.5/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 986, in _gcd_import
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 673, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/__init__.py", line 69, in <module>
    from tensorflow.contrib import periodic_resample
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/periodic_resample/__init__.py", line 22, in <module>
    from tensorflow.contrib.periodic_resample.python.ops.periodic_resample_op import periodic_resample
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/periodic_resample/python/ops/periodic_resample_op.py", line 32, in <module>
    resource_loader.get_path_to_datafile('_periodic_resample_op.so'))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/util/loader.py", line 56, in load_op_library
    ret = load_library.load_op_library(path)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/load_library.py", line 73, in load_op_library
    exec(wrappers, module.__dict__)
  File "<string>", line 317, in <module>
  File "<string>", line 229, in _InitOpDefLibrary
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_registry.py", line 36, in register_op_list
    assert _registered_ops[op_def.name] == op_def
AssertionError

Did anyone face this issue before?

I tried to install TensorFlow using pip on my Raspberry Pi 3 B+ (Raspbian Stretch, June 2018 version), and when I tried to run the sample label_image.py example I faced the same error, i.e.

(ImportError: /home/pi/.local/lib/python3.5/site-packages/tensorflow/contrib/lite/python/interpreter_wrapper/_tensorflow_wrap_interpreter_wrapper.so: undefined symbol: _ZN6tflite12tensor_utils39NeonMatrixBatchVectorMultiplyAccumulateEPKaiiS2_PKfiPfi)

So I tried another way and built the cross-compile package using the latest TensorFlow code from the master branch, and after installing the package on the Pi I am facing this error:

Traceback (most recent call last):
  File "label_image.py", line 37, in <module>
    interpreter = tf.contrib.lite.Interpreter(model_path=args.model_file)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/util/lazy_loader.py", line 53, in __getattr__
    module = self._load()
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/util/lazy_loader.py", line 42, in _load
    module = importlib.import_module(self.__name__)
  File "/usr/lib/python3.4/importlib/__init__.py", line 109, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 2254, in _gcd_import
  File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
  File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1200, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1129, in _exec
  File "<frozen importlib._bootstrap>", line 1471, in exec_module
  File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/contrib/__init__.py", line 48, in <module>
    from tensorflow.contrib import distribute
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/contrib/distribute/__init__.py", line 34, in <module>
    from tensorflow.contrib.distribute.python.tpu_strategy import TPUStrategy
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/contrib/distribute/python/tpu_strategy.py", line 27, in <module>
    from tensorflow.contrib.tpu.python.ops import tpu_ops
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/contrib/tpu/__init__.py", line 69, in <module>
    from tensorflow.contrib.tpu.python.ops.tpu_ops import *
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/contrib/tpu/python/ops/tpu_ops.py", line 39, in <module>
    resource_loader.get_path_to_datafile("_tpu_ops.so"))
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/contrib/util/loader.py", line 56, in load_op_library
    ret = load_library.load_op_library(path)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/load_library.py", line 60, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)

tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid name: 

An op that loads optimization parameters into HBM for embedding. Must be
preceded by a ConfigureTPUEmbeddingHost op that sets up the correct
embedding table configuration. For example, this op is used to install
parameters that are loaded from a checkpoint before a training loop is
executed.

I am also getting the same error; updating to TF 1.10.* or TF 1.11 does not resolve the problem.
I am using a Raspberry Pi 3 with Raspbian Stretch and Python 3.5.

pi@raspberrypi:~/tflite_exp $ python3 tf_lite_cam.py
PATH_TO_LABELS= object_detection/data/mscoco_label_map.pbtxt
Traceback (most recent call last):
  File "tf_lite_cam.py", line 48, in <module>
    interpreter = tf.contrib.lite.Interpreter(model_path=TF_MODEL)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/lite/python/interpreter.py", line 51, in __init__
    _interpreter_wrapper.InterpreterWrapper_CreateWrapperCPPFromFile(
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/lazy_loader.py", line 53, in __getattr__
    module = self._load()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/lazy_loader.py", line 42, in _load
    module = importlib.import_module(self.__name__)
  File "/usr/lib/python3.5/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 986, in _gcd_import
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 673, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 28, in <module>
    _tensorflow_wrap_interpreter_wrapper = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 24, in swig_import_helper
    _mod = imp.load_module('_tensorflow_wrap_interpreter_wrapper', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
  File "<frozen importlib._bootstrap>", line 693, in _load
  File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 577, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 914, in create_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/lite/python/interpreter_wrapper/_tensorflow_wrap_interpreter_wrapper.so: undefined symbol: _ZN6tflite12tensor_utils39NeonMatrixBatchVectorMultiplyAccumulateEPKfiiS2_iPfi

Has anyone tried the Miniconda version of TensorFlow for ARM? https://towardsdatascience.com/stop-installing-tensorflow-using-pip-for-performance-sake-5854f9d9eb0c

If someone is able to generate a .whl with the correct compilation flags, please share it.

@saurabh-kachhia @stanlee321 yes, it seems that the problem is still there in the official 1.11 pip wheel. I tested with both python 2.7 and 3.5.

I am facing the same issue on a Raspberry Pi 3 as well, and have been looking around for a solution.
I also tried cross-compiling, but in vain. Now I am trying to compile on the Pi 3.
@freedomtan please let us know if there is a quick fix for this. Eagerly waiting for your response.

Thanks in advance :)

@sahilparekh for 1.9.x to 1.11.x, what I posted in Aug,

bazel build --config opt --local_resources 1024.0,0.5,0.5 \
--copt=-mfpu=neon-vfpv4 \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-fomit-frame-pointer \
--copt=-DRASPBERRY_PI \
--host_copt=-DRASPBERRY_PI \
//tensorflow/tools/pip_package:build_pip_package

should work. For the master branch, some modifications for building the AWS SDK may be needed; the AWS SDK problem may need something like #22856.

It turns out that the "tensor_utils" target may be missing its NEON deps. To correct it, make sure that tensorflow/contrib/lite/kernels/internal:tensor_utils has neon_tensor_utils in its deps.
For an aarch64 cross-compile configuration, the modification below can solve the problem:

diff --git a/tensorflow/contrib/lite/kernels/internal/BUILD b/tensorflow/contrib/lite/kernels/internal/BUILD
index 464163b..d1f4a0e 100644
--- a/tensorflow/contrib/lite/kernels/internal/BUILD
+++ b/tensorflow/contrib/lite/kernels/internal/BUILD
@@ -53,6 +53,13 @@ config_setting(
 )
 
 config_setting(
+    name = "aarch64",
+    values = {
+        "cpu": "aarch64",
+    },
+)
+
+config_setting(
     name = "arm64-v8a",
     values = {
         "cpu": "arm64-v8a",
@@ -448,6 +455,9 @@ cc_library(
         ":arm": [
             ":neon_tensor_utils",
         ],
+        ":aarch64": [
+            ":neon_tensor_utils",
+        ],
         ":arm64-v8a": [
             ":neon_tensor_utils",
         ],

@zhewang95 Your suggestion doesn't help me. I am still facing the same issue.

@saurabh-kachhia @zhewang95 of course it doesn't work. The RPI 3 does have ARMv8/AArch64 cores, but Raspbian runs a 32-bit kernel.
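
A quick way to confirm which userland you are actually running (a sketch):

import platform

# On 32-bit Raspbian this prints 'armv7l' even though the RPI 3 SoC has
# 64-bit cores, so the aarch64 config_setting above is never selected.
print(platform.machine())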

SOLUTION FOR THIS ERROR!

Source Code:
interpreter = tf.contrib.lite.Interpreter(model_path="optimized_graph.tflite")
interpreter.allocate_tensors()

ImportError: /home/pi/.local/lib/python3.5/site-packages/tensorflow/contrib/lite/python/interpreter_wrapper/_tensorflow_wrap_interpreter_wrapper.so: undefined symbol: _ZN6tflite12tensor_utils39NeonMatrixBatchVectorMultiplyAccumulateEPKaiiS2_PKfiPfi

just install TensorFlow 1.11.0 following these steps:

$ sudo apt-get install python-pip python3-pip
$ sudo pip3 uninstall tensorflow
$ git clone https://github.com/PINTO0309/Tensorflow-bin.git
$ cd Tensorflow-bin
$ sudo pip3 install tensorflow-1.11.0-cp35-cp35m-linux_armv7l.whl

if it doesn't work, try re-formatting the SD card and doing it again

@EmilioMezaE Thank you so much. It works well.
I tested py27 and py35, and I was able to load the tflite file.
Can I ask how you found this solution? I want to know what the problem was.

Thanks @EmilioMezaE for the solution.
Uninstalling the old TensorFlow with pip did not work for me; just force-delete it:

rm -rf  /home/pi/.local/lib/python3.5/site-packages/tensorflow

Currently the performance of 'Lite' seems to be 2-3x slower compared to 'regular TensorFlow'; this is for a quantized model. Lite uses only one CPU core... maybe this is to blame.
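
For reference, a minimal latency probe like the sketch below is roughly how such a comparison can be made (the model path is a placeholder):

import time
import numpy as np
import tensorflow as tf

# Placeholder model path -- substitute your quantized .tflite file.
interpreter = tf.contrib.lite.Interpreter(model_path="model_quant.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# An all-zeros input is fine for pure latency measurement.
interpreter.set_tensor(inp['index'], np.zeros(inp['shape'], dtype=inp['dtype']))

start = time.time()
for _ in range(10):
    interpreter.invoke()
print("mean invoke time: %.3f s" % ((time.time() - start) / 10))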

@gasparka
My solution is to disable "jemalloc".
https://github.com/PINTO0309/Tensorflow-bin.git
Although I have not tried it yet, enabling "jemalloc" may improve performance.

You're welcome @rky0930! I'm sorry, but I don't know the reason for this problem; I just saw this page https://github.com/PINTO0309/Tensorflow-bin and followed the process.

@rky0930 , @EmilioMezaE see my previous comments for the reason and build instructions

I am following @freedomtan's suggestion completely.
Thank you, freedomtan.
I found that activating MPI might be meaningful for performance improvement, so I have now started recompiling.
It will take about 3 days.

@PINTO0309 let us know how much faster it gets :)

@gasparka

I tried installing the rebuilt binary with "jemalloc" and "MPI" enabled.
Unfortunately, it did not get faster as I expected.
"MPI" seems to be a mechanism for speeding things up via distributed processing during training.

【My ENet】 Pure Tensorflow v1.11.0 10.2 sec ---> 9.5 sec
【My UNet】 Tensorflow Lite v1.11.0 11.5 sec ---> 12.1 sec

https://github.com/PINTO0309/Tensorflow-bin.git
tensorflow-1.11.0-cp35-cp35m-linux_armv7l_jemalloc_mpi.whl

Next I will try enabling "XLA JIT" and verify whether it speeds things up.
I hope it will work...

@PINTO0309
Have you experimented with the thread count? I see that Lite is stuck on one thread; there is a C++ API for this, but nothing in Python.

You could try hardcoding the thread count to 4:

context_.recommended_num_threads = -1;

Thank you @gasparka.

Before enabling "XLA JIT",
I will try changing the hardcoded line
context_.recommended_num_threads = -1;
to
context_.recommended_num_threads = 4;

@gasparka
I tried rebuilding with multithreading enabled.
However, it seems that the Python wrapper does not pass the thread count through, so the processing speed has not changed.
A C++ program will use 4 threads.
Since I do not have the skills to write C++ programs, can you try? ---> gasparka
https://github.com/PINTO0309/Tensorflow-bin.git
tensorflow-1.11.0-cp35-cp35m-linux_armv7l_jemalloc_mpi_multithread.whl

Results of the Python program:
【My ENet】 Pure Tensorflow v1.11.0 9.5 sec ---> 9.5 sec
【My UNet】 Tensorflow Lite v1.11.0 12.1 sec ---> 12.5 sec

Next I will try enabling "XLA JIT" and verify whether it speeds things up.

I will try to take a look at the C++ stuff this week.

@PINTO0309
Unfortunately i have no time to look into this :(

TensorFlow v1.12.0 rc0, standalone installer.
#23082 (comment)

SOLUTION FOR THIS ERROR!

Source Code:
interpreter = tf.contrib.lite.Interpreter(model_path="optimized_graph.tflite")
interpreter.allocate_tensors()

ImportError: /home/pi/.local/lib/python3.5/site-packages/tensorflow/contrib/lite/python/interpreter_wrapper/_tensorflow_wrap_interpreter_wrapper.so: undefined symbol: _ZN6tflite12tensor_utils39NeonMatrixBatchVectorMultiplyAccumulateEPKaiiS2_PKfiPfi

just install TensorFlow 1.11.0 following these steps:

$ sudo apt-get install python-pip python3-pip
$ sudo pip3 uninstall tensorflow
$ git clone https://github.com/PINTO0309/Tensorflow-bin.git
$ cd Tensorflow-bin
$ sudo pip3 install tensorflow-1.11.0-cp35-cp35m-linux_armv7l.whl

if it doesn't work, try re-formatting the SD card and doing it again

This was the only thing that worked for me. Something I did wrong was that I kept trying to install the multi-threaded version, but that doesn't work on the Pi; you need to install the one from the instructions.
After that it works just fine. I am still testing to see if it's better than regular TensorFlow.

We got ours to work by updating interpreter.py to include contrib in the path as follows:

_interpreter_wrapper = LazyLoader(
    "_interpreter_wrapper", globals(),
    "tensorflow.contrib.lite.python.interpreter_wrapper."
    "tensorflow_wrap_interpreter_wrapper")
# pylint: enable=g-inconsistent-quotes

for multi-threading stuff, I sent a PR #25748
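
With a wheel that includes that patch (e.g. the multithread builds mentioned above), the Python-side usage is roughly the following sketch; note that set_num_threads only exists in builds carrying the patch, and the model path is a placeholder:

import tensorflow as tf

interpreter = tf.contrib.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
# Only available in builds that include the multi-threading patch.
interpreter.set_num_threads(4)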

Thank you for always doing great work, @freedomtan.
I succeeded in building Tensorflow Lite, incorporating your suggestion.
#25120 (comment)

I tried a multi-threaded implementation with TensorFlow Lite v1.11.0.
It gained 2.5x the performance.

https://github.com/PINTO0309/Tensorflow-bin/blob/master/tensorflow-1.11.0-cp35-cp35m-linux_armv7l_jemalloc_multithread.whl

$ sudo apt-get install -y libhdf5-dev libc-ares-dev libeigen3-dev
$ sudo pip3 install keras_applications==1.0.7 --no-deps
$ sudo pip3 install keras_preprocessing==1.0.9 --no-deps
$ sudo pip3 install h5py==2.9.0
$ sudo apt-get install -y openmpi-bin libopenmpi-dev
$ sudo pip3 uninstall tensorflow
$ wget -O tensorflow-1.11.0-cp35-cp35m-linux_armv7l.whl https://github.com/PINTO0309/Tensorflow-bin/raw/master/tensorflow-1.11.0-cp35-cp35m-linux_armv7l_jemalloc_multithread.whl
$ sudo pip3 install tensorflow-1.11.0-cp35-cp35m-linux_armv7l.whl

【Required】 Restart the terminal.

Customize "tensorflow/contrib/lite/examples/python/label_image.py".

import argparse
import numpy as np
import time

from PIL import Image

from tensorflow.contrib.lite.python import interpreter as interpreter_wrapper
def load_labels(filename):
  my_labels = []
  input_file = open(filename, 'r')
  for l in input_file:
    my_labels.append(l.strip())
  return my_labels
if __name__ == "__main__":
  floating_model = False
  parser = argparse.ArgumentParser()
  parser.add_argument("-i", "--image", default="/tmp/grace_hopper.bmp", \
    help="image to be classified")
  parser.add_argument("-m", "--model_file", \
    default="/tmp/mobilenet_v1_1.0_224_quant.tflite", \
    help=".tflite model to be executed")
  parser.add_argument("-l", "--label_file", default="/tmp/labels.txt", \
    help="name of file containing labels")
  parser.add_argument("--input_mean", default=127.5, help="input_mean")
  parser.add_argument("--input_std", default=127.5, \
    help="input standard deviation")
  parser.add_argument("--num_threads", default=1, help="number of threads")
  args = parser.parse_args()

  interpreter = interpreter_wrapper.Interpreter(model_path=args.model_file)
  interpreter.allocate_tensors()
  input_details = interpreter.get_input_details()
  output_details = interpreter.get_output_details()
  # check the type of the input tensor
  if input_details[0]['dtype'] == np.float32:
    floating_model = True
  # NxHxWxC, H:1, W:2
  height = input_details[0]['shape'][1]
  width = input_details[0]['shape'][2]
  img = Image.open(args.image)
  img = img.resize((width, height))
  # add N dim
  input_data = np.expand_dims(img, axis=0)
  if floating_model:
    input_data = (np.float32(input_data) - args.input_mean) / args.input_std

  interpreter.set_num_threads(int(args.num_threads))
  interpreter.set_tensor(input_details[0]['index'], input_data)

  start_time = time.time()
  interpreter.invoke()
  stop_time = time.time()

  output_data = interpreter.get_tensor(output_details[0]['index'])
  results = np.squeeze(output_data)
  top_k = results.argsort()[-5:][::-1]
  labels = load_labels(args.label_file)
  for i in top_k:
    if floating_model:
      print('{0:08.6f}'.format(float(results[i]))+":", labels[i])
    else:
      print('{0:08.6f}'.format(float(results[i]/255.0))+":", labels[i])

  print("time: ", stop_time - start_time)

Environment Preparation for MobileNet v1.

$ cd ~;mkdir test
$ curl https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp > ~/test/grace_hopper.bmp
$ curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz | tar xzv -C ~/test mobilenet_v1_1.0_224/labels.txt
$ mv ~/test/mobilenet_v1_1.0_224/labels.txt ~/test/
$ curl http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224_quant.tgz | tar xzv -C ~/test
$ cp tensorflow/tensorflow/contrib/lite/examples/python/label_image.py ~/test

Result of x1 Thread.

$ cd ~/test
$ python3 label_image.py \
--num_threads 1 \
--image grace_hopper.bmp \
--model_file mobilenet_v1_1.0_224_quant.tflite \
--label_file labels.txt

0.415686: 653:military uniform
0.352941: 907:Windsor tie
0.058824: 668:mortarboard
0.035294: 458:bow tie, bow-tie, bowtie
0.035294: 835:suit, suit of clothes
time:  0.4152982234954834

Result of x4 Thread.

$ cd ~/test
$ python3 label_image.py \
--num_threads 4 \
--image grace_hopper.bmp \
--model_file mobilenet_v1_1.0_224_quant.tflite \
--label_file labels.txt

0.415686: 653:military uniform
0.352941: 907:Windsor tie
0.058824: 668:mortarboard
0.035294: 458:bow tie, bow-tie, bowtie
0.035294: 835:suit, suit of clothes
time:  0.1647195816040039

Did you run this on the Raspberry Pi 3B?
So no difference? No performance gain? I don't understand: the title says 2.5x, but the results look the same to me.
What about the Pi's resources: was it taking over 90% or still at 25%-30%?

@masterchop

Did you run this on the Raspberry Pi 3B?

Yes. The performance measurements above were taken on a Raspberry Pi 3.

What about the Pi's resources: was it taking over 90% or still at 25%-30%?

25%-30%

It seems that you are misunderstanding something.
freedomtan's implementation and mine are multi-threaded ("MultiThread"), not multi-process ("MultiProcess").
Performance will never improve by more than 4x, and the 4 cores are never fully utilized.
http://www.dabeaz.com/python/UnderstandingGIL.pdf
https://qiita.com/pumbaacave/items/942f86269b2c56313c15

If you need an implementation that fully uses all 4 cores, you will have to implement it yourself.
I am sorry, I do not have the skills to implement it in C++ with multi-processing.

@PINTO0309 and @masterchop as far as I can remember only the convolution kernel is multithreaded, so you hit Amdahl's law
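
To put a number on that (an illustrative sketch, not a measurement): Amdahl's law bounds the speedup at 1 / ((1 - p) + p / n) when only a fraction p of the runtime is multithreaded across n threads. An assumed p = 0.8 with n = 4 happens to reproduce the ~2.5x reported above:

# p = 0.8 is an illustrative assumption, not a profiled value.
p, n = 0.8, 4
print(1.0 / ((1.0 - p) + p / n))  # 2.5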

@himanshurawlani

  1. patience, it's painfully slow to build it natively on an RPI 3
  2. prepare the requirements as described in TensorFlow's doc
  3. build and install bazel from source; you may want to prepare paging space and increase the maximum size of the memory allocation pool before building bazel, as described in this doc
  4. ./configure
  5. build it with bazel; note that I can build it successfully without modifying the source, using the following command:
bazel build --config opt --local_resources 1024.0,0.5,0.5 \
--copt=-mfpu=neon-vfpv4 \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-fomit-frame-pointer \
--copt=-DRASPBERRY_PI \
--host_copt=-DRASPBERRY_PI \
//tensorflow/tools/pip_package:build_pip_package

@freedomtan Thanks. Can I ask what the options --copt=-DRASPBERRY_PI and --host_copt=-DRASPBERRY_PI do?

@hoonkai dunno if it's still needed. It was used to build //tensorflow/lite/kernels/internal:neon_tensor_utils (formerly //tensorflow/contrib/lite/kernels/internal:neon_tensor_utils) with NEON support. As mentioned in my comment several months ago, you need -DRASPBERRY_PI (or -DARM_NON_MOBILE) to enable it.

@freedomtan : I ran the following command based on the suggestions above:

bazel --host_jvm_args=-Xmx3g --host_jvm_args=-Xms512m \
build --jobs 4 --local_resources 3000,0.7,0.7 \
--config=opt --copt="-funsafe-math-optimizations" \
--copt="-ftree-vectorize" \
--copt="-fomit-frame-pointer" \
--copt=-DARM_NON_MOBILE \
--host_copt=-DARM_NON_MOBILE \
--verbose_failures \
tensorflow/tools/pip_package:build_pip_package

The package builds successfully; however, I still get the undefined symbol error.
Can you please elaborate on what exactly you did to remove that error?

I am compiling TensorFlow 1.12.2 with Python 3.6 on arm64 / aarch64 with Ubuntu 18.04.
I had the same result when compiling TensorFlow 1.10.0.

@PINTO0309 : I've checked out your repository; however, there is no description of what exactly you edited to make it work. The binaries you provide are only for armv7, and the one for armv8 is not good for me because I don't have CUDA / Nvidia support. Could you maybe elaborate?

My bad, I've found the right patches:

  1. tensorflow/lite/kernels/internal/BUILD
    In my case it has been tensorflow/contrib/lite/kernels/internal/BUILD

  2. third_party/aws/BUILD.bazel and tensorflow/BUILD here
    The first file has been third_party/aws.BUILD

@freedomtan @PINTO0309 : Keep up the good work !

I have exactly the same issue with TensorFlow 1.13.1 on Raspberry Pi running Python 3.6.8. Here is the output:

Traceback (most recent call last):
  File "tflite.py", line 20, in <module>
    interpreter = tf.lite.Interpreter(MODEL_PATH)
  File "/home/pi/.pyenv/versions/3.6.8/lib/python3.6/site-packages/tensorflow/lite/python/interpreter.py", line 54, in __init__
    _interpreter_wrapper.InterpreterWrapper_CreateWrapperCPPFromFile(
  File "/home/pi/.pyenv/versions/3.6.8/lib/python3.6/site-packages/tensorflow/python/util/lazy_loader.py", line 61, in __getattr__
    module = self._load()
  File "/home/pi/.pyenv/versions/3.6.8/lib/python3.6/site-packages/tensorflow/python/util/lazy_loader.py", line 44, in _load
    module = importlib.import_module(self.__name__)
  File "/home/pi/.pyenv/versions/3.6.8/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/pi/.pyenv/versions/3.6.8/lib/python3.6/site-packages/tensorflow/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 28, in <module>
    _tensorflow_wrap_interpreter_wrapper = swig_import_helper()
  File "/home/pi/.pyenv/versions/3.6.8/lib/python3.6/site-packages/tensorflow/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 24, in swig_import_helper
    _mod = imp.load_module('_tensorflow_wrap_interpreter_wrapper', fp, pathname, description)
  File "/home/pi/.pyenv/versions/3.6.8/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/home/pi/.pyenv/versions/3.6.8/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
  File "<frozen importlib._bootstrap>", line 684, in _load
  File "<frozen importlib._bootstrap>", line 658, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 571, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 922, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /home/pi/.pyenv/versions/3.6.8/lib/python3.6/site-packages/tensorflow/lite/python/interpreter_wrapper/_tensorflow_wrap_interpreter_wrapper.so: undefined symbol: _ZN6tflite12tensor_utils24NeonVectorScalarMultiplyEPKaifPf

Any ideas on how to sort this out?

Did you solve it?

I have the exact same issue (same error, same versions).

Thanks

@gaetanbahl : I've applied these two patches:
https://github.com/tensorflow/tensorflow/pull/16175/files
https://github.com/tensorflow/tensorflow/pull/22856/files

I think the first one is the one that solves the error.

We got ours to work by updating interpreter.py to include contrib in the path as follows:

_interpreter_wrapper = LazyLoader(
    "_interpreter_wrapper", globals(),
    "tensorflow.contrib.lite.python.interpreter_wrapper."
    "tensorflow_wrap_interpreter_wrapper")
# pylint: enable=g-inconsistent-quotes

This worked for me. Thanks