Execute quantized model on TFLite with QCS6490
Hi all, I have an AI-BOX running Ubuntu 20.04 from a Qualcomm OEM/ODM with the QCS6490 chipset.
I used the AI Hub website to quantize a YOLOv7 model to a .tflite model, and I'd like to run inference with this model on the QCS6490 device mentioned above.
This is the code that I'm using:
import numpy as np
import tensorflow as tf
# Load your TFLite model
# Replace 'model.tflite' with the path to your actual model
tflite_model_path = 'yolov7.tflite'
tflite_model = open(tflite_model_path, 'rb').read()
# Set up the TFLite interpreter
# To use the Hexagon DSP with TensorFlow Lite, you would typically need to
# build the TensorFlow Lite Hexagon delegate. However, this script assumes
# that the delegate is already available and part of the TFLite runtime.
interpreter = tf.lite.Interpreter(
model_content=tflite_model,
experimental_delegates=[tf.lite.experimental.load_delegate('libhexagon_delegate.so')]
)
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Test the model on random input data.
input_shape = input_details[0]['shape']
print(f"[INFO] Input Shape = {input_shape}")
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# Get the output of the model
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
My question is: where can I find or download the libhexagon_delegate.so library?
Qualcomm has a new AI Engine Direct SDK for running models on the DSP/HTP via the QnnTFLiteDelegate.
Please follow the steps below to set up and run TFLite models on the QCS6490:
1. Download the AI Engine Direct SDKs from QPM: https://qpm.qualcomm.com/#/main/tools/details/qualcomm_ai_engine_direct
   - NOTE: We recommend running the QPM CLI in a Linux/Windows environment. As of today, running it on macOS will not download all of the required libs and the TFLite delegate.
2. Depending on the target device, copy `<QNN_SDK>/libs/<target device>` onto the device. Let's call this `libs_path`.
3. Depending on the target DSP/HTP architecture, copy `<QNN_SDK>/libs/hexagon-v<VERSION>/` onto the device. Let's call this `skel_libs_path`.
   - You can find this version in the product spec online, e.g. for QCS6490: https://www.qualcomm.com/products/internet-of-things/industrial/industrial-automation/qualcomm-robotics-rb3-platform
4. Set the following environment variables before running the model on the device:
   - export LD_LIBRARY_PATH=<libs_path from point 2>
   - export ADSP_LIBRARY_PATH=<skel_libs_path from point 3>
5. Change the above TFLite script as follows so it runs correctly on-device (see the sketch after this list):
   - Specify `backend_type` in the options passed to `load_delegate`.
   - Pass `<libs_path>/libQnnTFLiteDelegate.so` to `load_delegate`:
     tf.lite.experimental.load_delegate(<libs_path> + "libQnnTFLiteDelegate.so", options={"backend_type": "htp"})
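Putting those changes together, here is a minimal sketch of the modified script. The libs_path value, model filename, and dummy-input handling are assumptions and should be adjusted for your setup; the environment variables from point 4 must already be exported in the shell before launching Python.
import numpy as np
import tensorflow as tf
# Assumed locations; replace with the libs_path from point 2 and your model path.
libs_path = "/path/to/libs/"
tflite_model_path = "yolov7.tflite"
# Load the QNN TFLite delegate and target the HTP backend.
qnn_delegate = tf.lite.experimental.load_delegate(
    libs_path + "libQnnTFLiteDelegate.so",
    options={"backend_type": "htp"},
)
interpreter = tf.lite.Interpreter(
    model_path=tflite_model_path,
    experimental_delegates=[qnn_delegate],
)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Build dummy input matching the model's input shape and dtype
# (a quantized model may expect uint8/int8 rather than float32).
input_shape = input_details[0]["shape"]
input_dtype = input_details[0]["dtype"]
input_data = np.random.random_sample(input_shape).astype(input_dtype)
interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]["index"])
print(output_data)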
Attaching a sample Python script and QNN 2.20 libs to try on the RB3 Gen2:
aarch64-ubuntu-gcc9.4.zip
hexagon-v68.zip
Model-and-scripts.zip
We will update our docs with these instructions soon to make it easier to deploy on IoT platforms.
Hi, I have a similar requirement to run a quantized model.tflite on the DSP of the QCS6490. Can I still use the above-mentioned method to test it, or is there an updated method available? Thanks.
The above-mentioned approach is currently the best way to test this on-device outside of using AI Hub profile & inference jobs. Please reach out to us on Slack if you have further questions.
Hi @ankitha-bm, this issue has been closed. Please share your question by creating a new issue or posting on Slack; we can answer your questions much quicker there.