quic/ai-hub-models

Execute quantized model on TFLite with QCS6490

Closed this issue · 5 comments

Hi all, I have an AI-BOX with Ubuntu 20.04 from a Qualcomm OEM/ODM with the QCS6490 chipset.

I used the AI Hub website to quantize a YOLOv7 model to a .tflite model, and I'd like to run inference on the QCS6490 device mentioned above.

This is the code I'm using:

import numpy as np
import tensorflow as tf

# Load the TFLite model
# Replace 'yolov7.tflite' with the path to your actual model
tflite_model_path = 'yolov7.tflite'
with open(tflite_model_path, 'rb') as f:
    tflite_model = f.read()

# Set up the TFLite interpreter
# To use the Hexagon DSP with TensorFlow Lite, you would typically need to
# build the TensorFlow Lite Hexagon delegate. However, this script assumes
# that the delegate is already available and part of the TFLite runtime.
interpreter = tf.lite.Interpreter(
    model_content=tflite_model,
    experimental_delegates=[tf.lite.experimental.load_delegate('libhexagon_delegate.so')]
)
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the model on random input data, matching the dtype the model expects
# (a quantized model may take uint8/int8 input rather than float32).
input_shape = input_details[0]['shape']
input_dtype = input_details[0]['dtype']

print(f"[INFO] Input Shape = {input_shape}, dtype = {input_dtype}")

input_data = np.random.random_sample(input_shape).astype(input_dtype)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

# Get the output of the model
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

My question is: where can I find, or where can I download, the libhexagon_delegate.so library?

Qualcomm now provides the AI Engine Direct SDK to run models on the DSP/HTP via the QNN TFLite Delegate.
Please follow the steps below to set up and run TFLite models on the QCS6490:

  1. Download the AI Engine Direct SDK from QPM: https://qpm.qualcomm.com/#/main/tools/details/qualcomm_ai_engine_direct
    • NOTE: We recommend running the QPM CLI in a Linux or Windows environment. As of today, running it on macOS will not download all the required libs and the TFLite delegate.
  2. Depending on the target device, copy <QNN_SDK>/libs/<target device> to the device. Let's call this libs_path.
  3. Depending on the target DSP/HTP architecture, copy <QNN_SDK>/libs/hexagon-v<VERSION>/ to the device. Let's call this skel_libs_path.
  4. Set the following environment variables before running the model on the device:
export LD_LIBRARY_PATH=<libs_path from step 2>
export ADSP_LIBRARY_PATH=<skel_libs_path from step 3>
  5. Change the above TFLite script to load the QNN delegate instead of the Hexagon delegate (a full sketch follows after this list):
    • Specify backend_type as an option to load_delegate
    • Pass <libs_path>/libQnnTFLiteDelegate.so to load_delegate
     tf.lite.experimental.load_delegate(<libs_path> + "/libQnnTFLiteDelegate.so", options={"backend_type":"htp"})
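
For reference, here is a minimal sketch of the modified script. It assumes the environment variables from step 4 are already exported; the QNN_LIBS_PATH variable name and the /opt/qcom/qnn/libs fallback path are examples only, so substitute your own libs_path:

import os
import numpy as np
import tensorflow as tf

# Example only -- point this at the libs_path from step 2.
libs_path = os.environ.get("QNN_LIBS_PATH", "/opt/qcom/qnn/libs")

# Load the QNN TFLite delegate and ask it to run on the HTP backend.
qnn_delegate = tf.lite.experimental.load_delegate(
    os.path.join(libs_path, "libQnnTFLiteDelegate.so"),
    options={"backend_type": "htp"},
)

interpreter = tf.lite.Interpreter(
    model_path="yolov7.tflite",
    experimental_delegates=[qnn_delegate],
)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy input in whatever dtype the (possibly quantized) model expects.
input_data = np.random.random_sample(input_details[0]['shape']).astype(
    input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

print(interpreter.get_tensor(output_details[0]['index']))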
    

Attaching a sample Python script and QNN 2.20 libs to try on the RB3 Gen2:
aarch64-ubuntu-gcc9.4.zip
hexagon-v68.zip
Model-and-scripts.zip

We will update our docs with these instructions soon to make it easier to deploy on IoT platforms.

Hi, I have a similar requirement to run a quantized model.tflite on the DSP of QCS6490. I'd like to know if I can still use the above-mentioned method to test it, or if there are any updated methods available for testing? Thanks.

kory commented

The above-mentioned path is currently the best way to test this on-device, outside of using AI Hub profile & inference jobs. Please reach out to us on Slack if you have further questions.
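
For completeness, here is a minimal sketch of the hosted AI Hub inference-job route using the qai-hub Python client. The device name "QCS6490 (Proxy)", the input name "image", and the 640x640 float32 input are assumptions, so check qai_hub.get_devices() and your model's input spec (name, shape, and dtype) before running:

import numpy as np
import qai_hub as hub

# Assumed device name -- list the available devices with hub.get_devices().
device = hub.Device("QCS6490 (Proxy)")

# Upload the quantized TFLite model produced by AI Hub.
model = hub.upload_model("yolov7.tflite")

# "image", the shape, and the dtype are assumptions; match your model's input spec.
inputs = {"image": [np.random.rand(1, 640, 640, 3).astype(np.float32)]}

inference_job = hub.submit_inference_job(
    model=model,
    device=device,
    inputs=inputs,
)
outputs = inference_job.download_output_data()
print(outputs.keys())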

I'm not able to engage the DSP runtime on the RB3 Gen2 (QCS6490). I kindly need assistance regarding which library files need to be pushed to the device to engage the DSP runtime.

Hi @ankitha-bm, this issue has been closed. Please share your question by creating a new issue or posting on Slack; we are able to answer your questions much quicker there.