INT_246_Project

Image Captioning with Keras and TensorFlow

Image captioning is a new technology that combines LSTM text generation with the computer vision capabilities of a convolutional neural network (CNN). I first saw this technology in Andrej Karpathy's dissertation [Cite:karpathy2016connecting]. The figure below shows images from his work.

Andrej Karpathy's Dissertation Captioning

In this part, we will use an LSTM and a CNN to create a basic image captioning system. We will use transfer learning to build on existing pretrained projects:

We use Inception to extract features from the images. GloVe is a set of Natural Language Processing (NLP) vectors for common words. The figure below gives a high-level overview of captioning.
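To make the GloVe side more concrete, here is a minimal sketch of reading word vectors. It assumes the plain-text GloVe layout, where each line holds a word followed by its space-separated vector values. The function name parse_glove_lines and the sample values are hypothetical and are not taken from this project's notebook.

```python
from typing import Dict, Iterable, List


def parse_glove_lines(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Map each word to its embedding vector (a list of floats)."""
    embeddings: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        embeddings[word] = [float(v) for v in values]
    return embeddings


# Tiny made-up sample in the GloVe text layout (these are NOT real GloVe values).
sample = [
    "dog 0.1 0.2 0.3",
    "cat 0.4 0.5 0.6",
]
vectors = parse_glove_lines(sample)
print(len(vectors), len(vectors["dog"]))
```

With a real GloVe file, the same function could be given an open file handle, since iterating over a text file yields one line at a time. The file's name and location depend on your own setup.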

We begin by importing the needed libraries.

  • To install the required libraries, run pip install -r requirements.txt

Needed Data

You will need to download the following data and place it in a folder for this example. Point the root_captioning string at the folder that you are using for the caption generation. This folder should have the following sub-folders.
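A quick sanity check of the folder layout might look like the sketch below. The root_captioning name comes from the text above, but the path, the sub-folder names, and the missing_subfolders helper are placeholders for illustration only. Substitute the sub-folders this project actually expects.

```python
import os
from typing import Iterable, List


def missing_subfolders(root: str, expected: Iterable[str]) -> List[str]:
    """Return the expected sub-folder names that do not exist under root."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(root, name))]


# Hypothetical values: adjust both to match your own setup.
root_captioning = "./data/captions"
expected_subfolders = ["images", "text"]  # placeholders, not the project's real names

missing = missing_subfolders(root_captioning, expected_subfolders)
if missing:
    print("Missing sub-folders under", root_captioning, ":", missing)
```

Running a check like this before the heavier steps can surface a misplaced download early, rather than partway through feature extraction.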

Note: the original Flickr datasets are no longer available from their original source, but you can download them from a location specified by this article.