Image-Captioning-Bot

About

Image Captioning Bot, as the name suggests, predicts a caption for any image provided to it, based on the dataset it was trained on. It uses deep learning to generate the caption one word at a time.

Tech Stack Used

NumPy
Pandas
cv2
Matplotlib
Keras
re
NLTK
string
json
time
pickle
TensorFlow
collections
ResNet50

Model Structure

A pre-trained ResNet50 model is first applied to each image to produce a 2048-dimensional feature vector, which is then fed into the model shown below.
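The feature-extraction step can be sketched roughly as follows. This is a minimal example, not necessarily the repository's exact code: it assumes the 2048-dimensional vector is taken from ResNet50's global average pooling output and that images are already resized to 224 x 224 RGB. The helper names `build_encoder` and `encode_image` are hypothetical.

```python
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input


def build_encoder(weights="imagenet"):
    # include_top=False drops the 1000-class classifier head;
    # pooling="avg" averages the last feature map into a 2048-dim vector.
    return ResNet50(
        weights=weights,
        include_top=False,
        pooling="avg",
        input_shape=(224, 224, 3),
    )


def encode_image(encoder, image_rgb):
    # image_rgb: array of shape (224, 224, 3) in RGB channel order (assumed).
    batch = np.expand_dims(image_rgb.astype("float32"), axis=0)
    batch = preprocess_input(batch)  # ResNet50-specific normalization
    features = encoder.predict(batch, verbose=0)
    return features.reshape(-1)  # shape: (2048,)
```

Note that cv2 loads images in BGR order, so an image read with cv2 would need converting to RGB before being passed to `encode_image`.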

(Image)                                (NLP)
Input (2048)                          Input (30)
  |                                      |
  |                                      |
Dropout (0.3)                        Embedding (input_dim=2574, output_dim=50)
  |                                      |
  |                                      |
Dense (256)                           Dropout (0.3)
  |                                      |
  |                                      |
  |                                     LSTM (256)
  |________________       _______________|
                   |     |
                   |     |
                   Add (256 x 2)
                     |
                     |
                   Dense (256)
                     |
                     |
                   Dense (2574)
                  (output)
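The diagram above could be expressed with the Keras functional API roughly as follows. The layer sizes come from the diagram; the activation functions (`relu` for the hidden layers, `softmax` for the output) and the function name `build_caption_model` are assumptions, since the diagram does not specify them.

```python
from tensorflow.keras.layers import Input, Dropout, Dense, Embedding, LSTM, add
from tensorflow.keras.models import Model

VOCAB_SIZE = 2574  # from the diagram
MAX_LEN = 30       # caption length fed to the NLP branch
EMBED_DIM = 50     # embedding output size


def build_caption_model(vocab_size=VOCAB_SIZE, max_len=MAX_LEN, embed_dim=EMBED_DIM):
    # Image branch: 2048-dim ResNet50 features -> Dropout -> Dense(256)
    image_input = Input(shape=(2048,))
    image_branch = Dropout(0.3)(image_input)
    image_branch = Dense(256, activation="relu")(image_branch)  # activation assumed

    # NLP branch: word indices -> Embedding -> Dropout -> LSTM(256)
    text_input = Input(shape=(max_len,))
    text_branch = Embedding(input_dim=vocab_size, output_dim=embed_dim)(text_input)
    text_branch = Dropout(0.3)(text_branch)
    text_branch = LSTM(256)(text_branch)

    # Merge the two 256-dim vectors by element-wise addition
    merged = add([image_branch, text_branch])
    hidden = Dense(256, activation="relu")(merged)             # activation assumed
    output = Dense(vocab_size, activation="softmax")(hidden)   # activation assumed

    return Model(inputs=[image_input, text_input], outputs=output)
```

With a softmax output, the model returns a probability for each of the 2574 vocabulary words, which is what allows the caption to be built one word at a time.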

Examples

Caption: two dogs are running on beach
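A caption like the one above is produced one word at a time. The sketch below shows one simple way to do this, greedy decoding, where the most probable next word is picked at each step. It is illustrative only: the `startseq`/`endseq` marker tokens, the post-padding with zeros, and the `predict_next` callable (standing in for a call to the trained model with fixed image features) are all assumptions, not details confirmed by this README.

```python
import numpy as np


def greedy_caption(predict_next, word_to_idx, idx_to_word,
                   max_len=30, start="startseq", end="endseq"):
    # predict_next: hypothetical callable that takes a padded index
    # sequence of length max_len and returns next-word probabilities.
    sequence = [word_to_idx[start]]
    words = []
    for _ in range(max_len - 1):
        padded = np.zeros(max_len, dtype="int32")
        padded[:len(sequence)] = sequence  # post-padding with zeros (assumed)
        probabilities = predict_next(padded)
        next_idx = int(np.argmax(probabilities))  # greedy choice
        next_word = idx_to_word[next_idx]
        if next_word == end:
            break
        words.append(next_word)
        sequence.append(next_idx)
    return " ".join(words)
```

Greedy decoding is the simplest option; beam search is a common alternative that keeps several candidate captions at each step.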

External Links