Zero-shot predictions based on the DeViSE - Deep Visual-Semantic Embedding Model by Fromme et al. (2013)
Try the model on AWS or read my blog post!
DeViSE maps images into a rich semantic embedding space to combine the image recognition task with semantic information about the similarity of words/categories learned from language models. The method can make predictions about thousands of image labels not observed during training (zero-shot predictions).
The training notebook is based on fast.ai (part 2, lesson 11) but uses a custom PyTorch model. The model is deployed to AWS using Flask, Swagger UI, and Docker.
Create a conda environment to run the notebook and train the model or to run the app using the flask development server.
Install anaconda, clone the repository, navigate to the folder, and then run:
conda install nb_conda_kernels
conda env create -f env.yml
ipython kernel install --user --name=DeViSE_env
source activate DeViSE_env
Then install fastText:
git clone https://github.com/facebookresearch/fastText.git
cd fastText
pip install .
A model fine-tuned on ImageNet is provided in this repository. If you want to train the model yourself, follow these steps:
Download the needed datasets:
- Full ImageNet
- Or a subset of ImageNet
- FastText word vectors
Run the notebook DeViSE - A Deep Visual-Semantic Embedding Model.ipynb
After running the notebook, you can replace the model I provided with the one you trained by copying the files devise_trained_full_imagenet.pth
and className_2_wordvec_without_dups.pkl
to the folder deviseApi
.
Put a folder/several folders with a selection of pictures, i.e. from the ImageNet validation dataset, in a folder called pictures/
in the folder deviseApi
.
The folder structure should look for example like this:
|-- deviseApi
| |-- pictures
| | |-- <folder_category_1>
| | | |-- <picture1>.JPEG
| | | |-- <picture2>.JPEG
| | |-- <folder_category_2>
| | | |-- <picture1>.JPEG
| | | |-- <picture2>.JPEG
.
.
.
Or like this:
|-- deviseApi
| |-- pictures
| | |-- <folder_all_categories_mixed>
| | | |-- <picture1>.JPEG
| | | |-- <picture2>.JPEG
| | | |-- <picture3>.JPEG
| | | |-- <picture4>.JPEG
.
.
.
Folder and image names within the folder pictures
do not matter.
Activate the virtual environment with source activate DeViSE_env
, navigate to the folder deviseApi
and run python deviseApi.py
. In your browser go to http://localhost:5000/apidocs/
.
Install docker. Navigate to the parent folder and run:
docker build -t devise .
docker run -d -p 8000:8000 devise
In your browser go to http://localhost:8000/apidocs/
.
Stop the running container:
Run docker ps
to get the CONTAINER ID. Then run docker stop <CONTAINER ID>