
Run image captioning with OpenVINO in C++ and Python.

Primary language: C++. License: MIT.

Image Captioning

This topic demonstrates how to run the Image Captioning sample application, which performs inference using image captioning networks (an encoder and a decoder).


The topology of this sample is derived from the pytorch-tutorial project written by yunjey.

How It Works

At start-up, the sample application reads the command-line parameters and loads the encoder and decoder networks and an input image into the Inference Engine plugin. When inference is done, the application creates a caption for the input image.
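The caption-generation step can be pictured as a greedy decoding loop. The sketch below is a minimal illustration and not the sample's actual code: `encode` and `decode_step` are hypothetical callables standing in for the encoder and decoder networks, and the way the decoder state is passed around is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-ins for the two networks (assumptions, not the sample's API):
#   encode(image)                       -> feature vector
#   decode_step(features, token, state) -> (scores over the vocabulary, new state)
Encode = Callable[[object], Sequence[float]]
DecodeStep = Callable[[Sequence[float], int, object], Tuple[Sequence[float], object]]


def greedy_caption_ids(image, encode: Encode, decode_step: DecodeStep,
                       text_length: int, start_id: int, end_id: int) -> List[int]:
    """Run the encoder once, then pick the best-scoring token at each step."""
    features = encode(image)
    token, state, ids = start_id, None, []
    for _ in range(text_length):  # text_length plays the role of the -t_l flag
        scores, state = decode_step(features, token, state)
        token = max(range(len(scores)), key=scores.__getitem__)  # argmax
        if token == end_id:  # stop early once the end marker is predicted
            break
        ids.append(token)
    return ids
```

The loop never runs more than `text_length` steps, so the output length flag acts as an upper bound on the caption length in this sketch.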


Download the converted encoder and decoder models from here and save them in ${PROJECT_ROOT}/models/.



Python

Use the following flags to set the parameters:

-m_d            ->  path to the decoder model
-m_e            ->  path to the encoder model
-t_l            ->  length of the output text
-i              ->  path to the input image
--cpu_extension ->  path to the CPU extension library
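The flags above could be declared with `argparse` roughly as follows. This is a sketch rather than the contents of infer.py: which flags are required, the default of 20 for `-t_l`, and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags listed above (required/default choices are assumed)."""
    parser = argparse.ArgumentParser(description="Image captioning sample (sketch)")
    parser.add_argument("-m_d", required=True, help="path to the decoder model")
    parser.add_argument("-m_e", required=True, help="path to the encoder model")
    parser.add_argument("-t_l", type=int, default=20,  # default value is an assumption
                        help="length of the output text")
    parser.add_argument("-i", required=True, help="path to the input image")
    parser.add_argument("--cpu_extension", default=None,
                        help="path to the CPU extension library")
    return parser


# Parse an explicit argument list instead of sys.argv, for illustration.
args = build_parser().parse_args([
    "-m_d", "../models/decoder_nightly.xml",
    "-m_e", "../models/encoder.xml",
    "-i", "../images/example.png",
    "-t_l", "20",
])
print(args.m_d, args.m_e, args.i, args.t_l, args.cpu_extension)
```

`argparse` derives the attribute names from the option strings, so `-m_d` is read back as `args.m_d` and `--cpu_extension` as `args.cpu_extension`.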

Run the sample with the following command:

python infer.py -m_d ../models/decoder_nightly.xml -m_e ../models/encoder.xml -i ../images/example.png -t_l 20 --cpu_extension ${PATH_OF_CPU_EXTENSION_LIBRARY}


C++

Build the sample:

mkdir build
cd build
cmake ..
make -j16

Use the following flags to set the parameters:

-m_d            ->  path to the decoder model
-m_e            ->  path to the encoder model
-t_l            ->  length of the output text
-i              ->  path to the input image

Run the sample with the following command:

./image_caption -m_d ../../models/decoder_nightly.xml -m_e ../../models/encoder.xml -i ../../images/example.png -t_l 20 


Inference image:

[Image: the example input image]

The caption for the inference image: [ a group of giraffes standing next to each other . ]
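A caption string in this bracketed form can be produced from predicted token ids with a small post-processing step. The sketch below is illustrative only: the vocabulary mapping is a toy example, and the marker names `<start>`, `<end>`, `<pad>` and `<unk>` are assumptions about how the vocabulary labels special tokens.

```python
from typing import Dict, Iterable

SPECIAL_TOKENS = {"<start>", "<pad>"}  # assumed marker names, skipped in the output


def ids_to_caption(ids: Iterable[int], id_to_word: Dict[int, str]) -> str:
    """Map token ids to words, stop at the end marker, and join with spaces."""
    words = []
    for token_id in ids:
        word = id_to_word.get(token_id, "<unk>")  # unknown ids fall back to <unk>
        if word == "<end>":
            break
        if word in SPECIAL_TOKENS:
            continue
        words.append(word)
    return "[ " + " ".join(words) + " ]"


# Toy vocabulary, for illustration only.
vocab = {0: "<pad>", 1: "<start>", 2: "<end>", 3: "a", 4: "group",
         5: "of", 6: "giraffes", 7: "."}
print(ids_to_caption([1, 3, 4, 5, 6, 7, 2, 0], vocab))  # prints "[ a group of giraffes . ]"
```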