Qualcomm® AI Hub Models

Supports Python 3.9, 3.10, 3.11, and 3.12. Available on PyPI.

The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.

See supported: On-Device Runtimes, Hardware Targets & Precision, Chipsets, Devices

 

Setup

1. Install Python Package

The package is available via pip:

# NOTE for Snapdragon X Elite users:
# Only AMD64 (64-bit x86) Python is supported on Windows.
# Installation will fail when using Windows ARM64 Python.

pip install qai_hub_models

Some models (e.g. YOLOv7) require additional dependencies that can be installed as follows:

pip install "qai_hub_models[yolov7]"

 

2. Configure AI Hub Access

Many features of AI Hub Models (such as model compilation and on-device profiling) require access to Qualcomm® AI Hub. Sign in to AI Hub and configure your API token before using them.
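Before running anything that talks to AI Hub, it can help to confirm that the package is installed and that the AI Hub client has been configured. The snippet below is a minimal sketch of such a preflight check, not part of the package. The helper names are hypothetical, and the configuration file location (~/.qai_hub/client.ini) is an assumption about where the AI Hub client stores its API token; adjust it if your client stores it elsewhere.

```python
from importlib import metadata
from pathlib import Path
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def hub_config_path(home: Optional[Path] = None) -> Path:
    """Return the expected location of the AI Hub client configuration.

    ASSUMPTION: the client keeps its settings in ~/.qai_hub/client.ini.
    """
    return (home or Path.home()) / ".qai_hub" / "client.ini"


def setup_report(home: Optional[Path] = None) -> dict:
    """Summarize whether the package and the AI Hub configuration are present."""
    return {
        "qai_hub_models": installed_version("qai_hub_models"),
        "hub_configured": hub_config_path(home).is_file(),
    }


if __name__ == "__main__":
    print(setup_report())
```

A report with "qai_hub_models" set to None suggests the pip install step has not been run in the active environment; "hub_configured" set to False suggests the API token has not been configured yet.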

 

Getting Started

Export and Run A Model on a Physical Device

All models in our directory can be compiled and profiled on a hosted Qualcomm® device:

pip install "qai_hub_models[yolov7]"

python -m qai_hub_models.models.yolov7.export [--target-runtime ...] [--device ...] [--help]

Using Qualcomm® AI Hub, the export script will:

  1. Compile the model for the chosen device and target runtime (see: Compiling Models on AI Hub).
  2. If applicable, quantize the model (see: Quantization on AI Hub).
  3. Profile the compiled model on a real device in the cloud (see: Profiling Models on AI Hub).
  4. Run inference with sample input data on a real device in the cloud, and compare the on-device model output with the PyTorch output (see: Running Inference on AI Hub).
  5. Download the compiled model to disk.
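To script the export step, for example to export the same model for several devices, the command line shown above can be assembled from Python. This is a minimal sketch: build_export_command and run_export are hypothetical helpers, only the --target-runtime and --device flags from the command above are used, and the runtime and device values in the usage example are placeholders (run the export script with --help to see the values it accepts).

```python
import shlex
import subprocess
import sys
from typing import List, Optional


def build_export_command(
    model_id: str,
    target_runtime: Optional[str] = None,
    device: Optional[str] = None,
) -> List[str]:
    """Build the argv list for a model's export script."""
    cmd = [sys.executable, "-m", f"qai_hub_models.models.{model_id}.export"]
    if target_runtime is not None:
        cmd += ["--target-runtime", target_runtime]
    if device is not None:
        cmd += ["--device", device]
    return cmd


def run_export(model_id: str, **options: Optional[str]) -> int:
    """Run the export script and return its exit code.

    Requires the package to be installed and AI Hub access to be configured.
    """
    return subprocess.run(build_export_command(model_id, **options)).returncode


if __name__ == "__main__":
    # Placeholder values (assumptions); check --help for the accepted ones.
    cmd = build_export_command(
        "yolov7", target_runtime="tflite", device="Samsung Galaxy S23"
    )
    # Print the command rather than running it, so nothing is submitted by accident.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with device names that contain spaces.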

 

End-To-End Model Demos

Most models in our directory contain CLI demos that run the model end-to-end:

pip install "qai_hub_models[yolov7]"
# Predict and draw bounding boxes on the provided image
python -m qai_hub_models.models.yolov7.demo [--image ...] [--on-device] [--help]

End-to-end demos:

  1. Preprocess human-readable input into model input
  2. Run model inference
  3. Postprocess model output to a human-readable format

Many end-to-end demos use AI Hub to run inference on a real cloud-hosted device (if the --on-device flag is set). All end-to-end demos also run locally via PyTorch.
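The three demo stages listed above compose into a simple pipeline. The sketch below illustrates that shape only, using toy stand-in functions; none of these names come from the package, and a real demo would load an image, run the actual model (locally via PyTorch or on a hosted device), and draw its predictions.

```python
from typing import Callable, List, Sequence


def run_demo(
    raw_input,
    preprocess: Callable,
    infer: Callable,
    postprocess: Callable,
):
    """Chain the three demo stages: preprocess, then infer, then postprocess."""
    model_input = preprocess(raw_input)
    model_output = infer(model_input)
    return postprocess(model_output)


# Toy stand-ins for the three stages (hypothetical, for illustration only).
def scale_pixels(pixels: Sequence[int]) -> List[float]:
    """Preprocess: map 8-bit pixel values into the range [0, 1]."""
    return [p / 255.0 for p in pixels]


def toy_model(values: List[float]) -> List[float]:
    """Inference stand-in: return two scores based on mean brightness."""
    mean = sum(values) / len(values)
    return [1.0 - mean, mean]


def top_label(scores: List[float]) -> str:
    """Postprocess: turn raw scores into a human-readable label."""
    labels = ["dark", "bright"]
    return labels[scores.index(max(scores))]


if __name__ == "__main__":
    print(run_demo([250, 240, 255], scale_pixels, toy_model, top_label))
```

Keeping the stages as separate callables makes it possible to swap only the inference stage, which is roughly what switching between local and on-device execution amounts to.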

 

Sample Applications

Native applications that can run our models (with pre- and post-processing) on physical devices are published in the AI Hub Apps repository.

Python applications are defined for all models (from qai_hub_models.models.<model_name> import App). These apps wrap model inference with pre- and post-processing steps written using torch & numpy. These apps are optimized to be an easy-to-follow example, rather than to minimize prediction time.
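When the model is chosen at runtime, the import pattern above can be resolved dynamically. The following is a small sketch under that assumption: model_package and load_app_class are hypothetical helpers, and the constructor arguments of each App differ per model, so consult the model's README before instantiating one.

```python
import importlib


def model_package(model_id: str) -> str:
    """Return the dotted package path for a model ID such as 'yolov7'."""
    if not model_id.isidentifier():
        raise ValueError(f"not a valid model package name: {model_id!r}")
    return f"qai_hub_models.models.{model_id}"


def load_app_class(model_id: str):
    """Import a model's package and return its App class.

    Equivalent to: from qai_hub_models.models.<model_id> import App
    """
    module = importlib.import_module(model_package(model_id))
    return module.App


if __name__ == "__main__":
    try:
        app_cls = load_app_class("yolov7")
        print(f"Loaded {app_cls.__module__}.{app_cls.__name__}")
    except ImportError as err:
        # Raised when the package or the model's extra dependencies are missing.
        print(f"Could not import the app: {err}")
```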

 

Model Support Data

On-Device Runtimes

Runtime                   | Supported OS
Qualcomm AI Engine Direct | Android, Linux, Windows
LiteRT (TensorFlow Lite)  | Android, Linux
ONNX                      | Android, Linux, Windows

Device Hardware & Precision

Device Compute Unit             | Supported Precision
CPU                             | FP32, INT16, INT8
GPU                             | FP32, FP16
NPU (includes Hexagon DSP, HTP) | FP16*, INT16, INT8

*Some older chipsets do not support FP16 inference on their NPU.

Chipsets

and many more.

Devices

  • Samsung Galaxy S21, S22, S23, and S24 Series
  • Xiaomi 12 and 13
  • Snapdragon X Elite CRD (Compute Reference Device)
  • Qualcomm RB3 Gen 2, RB5

and many more.

 

Model Directory

Computer Vision

Model README
Image Classification
Beit qai_hub_models.models.beit
ConvNext-Base qai_hub_models.models.convnext_base
ConvNext-Tiny qai_hub_models.models.convnext_tiny
ConvNext-Tiny-w8a16-Quantized qai_hub_models.models.convnext_tiny_w8a16_quantized
ConvNext-Tiny-w8a8-Quantized qai_hub_models.models.convnext_tiny_w8a8_quantized
DenseNet-121 qai_hub_models.models.densenet121
DenseNet-121-Quantized qai_hub_models.models.densenet121_quantized
EfficientNet-B0 qai_hub_models.models.efficientnet_b0
EfficientNet-B4 qai_hub_models.models.efficientnet_b4
EfficientNet-V2-s qai_hub_models.models.efficientnet_v2_s
EfficientViT-b2-cls qai_hub_models.models.efficientvit_b2_cls
EfficientViT-l2-cls qai_hub_models.models.efficientvit_l2_cls
GoogLeNet qai_hub_models.models.googlenet
GoogLeNetQuantized qai_hub_models.models.googlenet_quantized
Inception-v3 qai_hub_models.models.inception_v3
Inception-v3-Quantized qai_hub_models.models.inception_v3_quantized
MNASNet05 qai_hub_models.models.mnasnet05
MobileNet-v2 qai_hub_models.models.mobilenet_v2
MobileNet-v2-Quantized qai_hub_models.models.mobilenet_v2_quantized
MobileNet-v3-Large qai_hub_models.models.mobilenet_v3_large
MobileNet-v3-Large-Quantized qai_hub_models.models.mobilenet_v3_large_quantized
MobileNet-v3-Small qai_hub_models.models.mobilenet_v3_small
Mobile_Vit qai_hub_models.models.mobile_vit
RegNet qai_hub_models.models.regnet
RegNetQuantized qai_hub_models.models.regnet_quantized
ResNeXt101 qai_hub_models.models.resnext101
ResNeXt101Quantized qai_hub_models.models.resnext101_quantized
ResNeXt50 qai_hub_models.models.resnext50
ResNeXt50Quantized qai_hub_models.models.resnext50_quantized
ResNet101 qai_hub_models.models.resnet101
ResNet101Quantized qai_hub_models.models.resnet101_quantized
ResNet18 qai_hub_models.models.resnet18
ResNet18Quantized qai_hub_models.models.resnet18_quantized
ResNet50 qai_hub_models.models.resnet50
ResNet50Quantized qai_hub_models.models.resnet50_quantized
Shufflenet-v2 qai_hub_models.models.shufflenet_v2
Shufflenet-v2Quantized qai_hub_models.models.shufflenet_v2_quantized
SqueezeNet-1_1 qai_hub_models.models.squeezenet1_1
SqueezeNet-1_1Quantized qai_hub_models.models.squeezenet1_1_quantized
Swin-Base qai_hub_models.models.swin_base
Swin-Small qai_hub_models.models.swin_small
Swin-Tiny qai_hub_models.models.swin_tiny
VIT qai_hub_models.models.vit
VITQuantized qai_hub_models.models.vit_quantized
WideResNet50 qai_hub_models.models.wideresnet50
WideResNet50-Quantized qai_hub_models.models.wideresnet50_quantized
Image Editing
AOT-GAN qai_hub_models.models.aotgan
LaMa-Dilated qai_hub_models.models.lama_dilated
Super Resolution
ESRGAN qai_hub_models.models.esrgan
QuickSRNetLarge qai_hub_models.models.quicksrnetlarge
QuickSRNetLarge-Quantized qai_hub_models.models.quicksrnetlarge_quantized
QuickSRNetMedium qai_hub_models.models.quicksrnetmedium
QuickSRNetMedium-Quantized qai_hub_models.models.quicksrnetmedium_quantized
QuickSRNetSmall qai_hub_models.models.quicksrnetsmall
QuickSRNetSmall-Quantized qai_hub_models.models.quicksrnetsmall_quantized
Real-ESRGAN-General-x4v3 qai_hub_models.models.real_esrgan_general_x4v3
Real-ESRGAN-x4plus qai_hub_models.models.real_esrgan_x4plus
SESR-M5 qai_hub_models.models.sesr_m5
SESR-M5-Quantized qai_hub_models.models.sesr_m5_quantized
XLSR qai_hub_models.models.xlsr
XLSR-Quantized qai_hub_models.models.xlsr_quantized
Semantic Segmentation
DDRNet23-Slim qai_hub_models.models.ddrnet23_slim
DeepLabV3-Plus-MobileNet qai_hub_models.models.deeplabv3_plus_mobilenet
DeepLabV3-Plus-MobileNet-Quantized qai_hub_models.models.deeplabv3_plus_mobilenet_quantized
DeepLabV3-ResNet50 qai_hub_models.models.deeplabv3_resnet50
EfficientViT-l2-seg qai_hub_models.models.efficientvit_l2_seg
FCN-ResNet50 qai_hub_models.models.fcn_resnet50
FCN-ResNet50-Quantized qai_hub_models.models.fcn_resnet50_quantized
FFNet-122NS-LowRes qai_hub_models.models.ffnet_122ns_lowres
FFNet-40S qai_hub_models.models.ffnet_40s
FFNet-40S-Quantized qai_hub_models.models.ffnet_40s_quantized
FFNet-54S qai_hub_models.models.ffnet_54s
FFNet-54S-Quantized qai_hub_models.models.ffnet_54s_quantized
FFNet-78S qai_hub_models.models.ffnet_78s
FFNet-78S-LowRes qai_hub_models.models.ffnet_78s_lowres
FFNet-78S-Quantized qai_hub_models.models.ffnet_78s_quantized
FastSam-S qai_hub_models.models.fastsam_s
FastSam-X qai_hub_models.models.fastsam_x
MediaPipe-Selfie-Segmentation qai_hub_models.models.mediapipe_selfie
SINet qai_hub_models.models.sinet
Segment-Anything-Model qai_hub_models.models.sam
Unet-Segmentation qai_hub_models.models.unet_segmentation
YOLOv11-Segmentation qai_hub_models.models.yolov11_seg
YOLOv8-Segmentation qai_hub_models.models.yolov8_seg
Object Detection
Conditional-DETR-ResNet50 qai_hub_models.models.conditional_detr_resnet50
DETR-ResNet101 qai_hub_models.models.detr_resnet101
DETR-ResNet101-DC5 qai_hub_models.models.detr_resnet101_dc5
DETR-ResNet50 qai_hub_models.models.detr_resnet50
DETR-ResNet50-DC5 qai_hub_models.models.detr_resnet50_dc5
Facial-Attribute-Detection qai_hub_models.models.face_attrib_net
Facial-Attribute-Detection-Quantized qai_hub_models.models.face_attrib_net_quantized
Lightweight-Face-Detection qai_hub_models.models.face_det_lite
Lightweight-Face-Detection-Quantized qai_hub_models.models.face_det_lite_quantized
MediaPipe-Face-Detection qai_hub_models.models.mediapipe_face
MediaPipe-Face-Detection-Quantized qai_hub_models.models.mediapipe_face_quantized
MediaPipe-Hand-Detection qai_hub_models.models.mediapipe_hand
PPE-Detection qai_hub_models.models.gear_guard_net
PPE-Detection-Quantized qai_hub_models.models.gear_guard_net_quantized
Person-Foot-Detection qai_hub_models.models.foot_track_net
Person-Foot-Detection-Quantized qai_hub_models.models.foot_track_net_quantized
YOLOv11-Detection qai_hub_models.models.yolov11_det
YOLOv8-Detection qai_hub_models.models.yolov8_det
YOLOv8-Detection-Quantized qai_hub_models.models.yolov8_det_quantized
Yolo-NAS qai_hub_models.models.yolonas
Yolo-NAS-Quantized qai_hub_models.models.yolonas_quantized
Yolo-v3 qai_hub_models.models.yolov3
Yolo-v6 qai_hub_models.models.yolov6
Yolo-v7 qai_hub_models.models.yolov7
Yolo-v7-Quantized qai_hub_models.models.yolov7_quantized
Pose Estimation
Facial-Landmark-Detection qai_hub_models.models.facemap_3dmm
Facial-Landmark-Detection-Quantized qai_hub_models.models.facemap_3dmm_quantized
HRNetPose qai_hub_models.models.hrnet_pose
HRNetPoseQuantized qai_hub_models.models.hrnet_pose_quantized
LiteHRNet qai_hub_models.models.litehrnet
MediaPipe-Pose-Estimation qai_hub_models.models.mediapipe_pose
OpenPose qai_hub_models.models.openpose
Posenet-Mobilenet qai_hub_models.models.posenet_mobilenet
Posenet-Mobilenet-Quantized qai_hub_models.models.posenet_mobilenet_quantized
Depth Estimation
Depth-Anything qai_hub_models.models.depth_anything
Depth-Anything-V2 qai_hub_models.models.depth_anything_v2
Midas-V2 qai_hub_models.models.midas
Midas-V2-Quantized qai_hub_models.models.midas_quantized

Audio

Model README
Speech Recognition
HuggingFace-WavLM-Base-Plus qai_hub_models.models.huggingface_wavlm_base_plus
Whisper-Base-En qai_hub_models.models.whisper_base_en
Whisper-Tiny-En qai_hub_models.models.whisper_tiny_en

Multimodal

Model README
OpenAI-Clip qai_hub_models.models.openai_clip
TrOCR qai_hub_models.models.trocr

Generative AI

Model README
Image Generation
ControlNet qai_hub_models.models.controlnet_quantized
Riffusion qai_hub_models.models.riffusion_quantized
Stable-Diffusion-v1.5 qai_hub_models.models.stable_diffusion_v1_5_quantized
Stable-Diffusion-v2.1 qai_hub_models.models.stable_diffusion_v2_1_quantized
Text Generation
Baichuan2-7B qai_hub_models.models.baichuan2_7b_quantized
IBM-Granite-3B-Code-Instruct qai_hub_models.models.ibm_granite_3b_code_instruct
IndusQ-1.1B qai_hub_models.models.indus_1b_quantized
JAIS-6p7b-Chat qai_hub_models.models.jais_6p7b_chat_quantized
Llama-v2-7B-Chat qai_hub_models.models.llama_v2_7b_chat_quantized
Llama-v3-8B-Chat qai_hub_models.models.llama_v3_8b_chat_quantized
Llama-v3.1-8B-Chat qai_hub_models.models.llama_v3_1_8b_chat_quantized
Llama-v3.2-3B-Chat qai_hub_models.models.llama_v3_2_3b_chat_quantized
Mistral-3B qai_hub_models.models.mistral_3b_quantized
Mistral-7B-Instruct-v0.3 qai_hub_models.models.mistral_7b_instruct_v0_3_quantized
PLaMo-1B qai_hub_models.models.plamo_1b_quantized
Qwen2-7B-Instruct qai_hub_models.models.qwen2_7b_instruct_quantized

Need help?

Slack: https://aihub.qualcomm.com/community/slack

GitHub Issues: https://github.com/quic/ai-hub-models/issues

Email: ai-hub-support@qti.qualcomm.com.

LICENSE

Qualcomm® AI Hub Models is licensed under the BSD 3-Clause License. See the LICENSE file.