
An LSTM model generates captions for images after a CNN model extracts their features.


Image Caption Generator

An image caption generator is a model that produces a caption describing the contents of an image. It requires a computer vision model to understand the content of the image and an NLP language model to translate that understanding into words.

Deep learning models have proven to be an effective approach to this problem.


Dataset:

FLICKR_8K. This dataset includes around 8,000 images, each paired with 5 different captions written by different people.


Block diagram of the model used in the project



Flow of the Project

1. Cleaning the captions (a minimal cleaning sketch follows this list)

2. Extracting image features

3. Preprocessing the image and text data

4. Training the LSTM model

5. Predicting captions and evaluating performance
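
Step 1 usually reduces to a handful of string operations. The following is a minimal sketch of caption cleaning, assuming lowercasing, punctuation removal, and dropping single-character tokens; the exact rules used in the notebook may differ.

```python
import re

def clean_caption(caption):
    """Lowercase, strip punctuation, and drop single-character tokens."""
    caption = caption.lower()
    caption = re.sub(r"[^a-z ]", " ", caption)          # keep letters and spaces only
    words = [w for w in caption.split() if len(w) > 1]  # remove stray single characters
    return " ".join(words)

print(clean_caption("A dog, running across the grass!"))
# -> "dog running across the grass"
```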


VGG Model Summary

A pre-trained VGG16 model is used to extract image features.
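
A sketch of the feature-extraction step, assuming a standard Keras VGG16 with its final classification layer removed so the 4096-dimensional fc2 output serves as the image feature (the notebook's setup may differ in details such as batching):

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.models import Model

# Drop the final classification layer; the 4096-d fc2 output is the image feature.
vgg = VGG16()
feature_extractor = Model(inputs=vgg.inputs, outputs=vgg.layers[-2].output)

def extract_features(image_path):
    image = load_img(image_path, target_size=(224, 224))
    image = img_to_array(image)
    image = np.expand_dims(image, axis=0)
    image = preprocess_input(image)
    return feature_extractor.predict(image, verbose=0)  # shape: (1, 4096)
```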


LSTM Model Summary
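
The summary corresponds to a merge-style captioning model: the image feature and the partial caption are encoded separately, added together, and fed to a softmax over the vocabulary. The sketch below is illustrative only; the layer sizes (256 units, 0.5 dropout) are assumptions and not necessarily those used in the notebook.

```python
from tensorflow.keras.layers import Input, Dense, LSTM, Embedding, Dropout, add
from tensorflow.keras.models import Model

def define_model(vocab_size, max_length):
    # Image feature branch: 4096-d VGG16 feature -> 256-d representation
    inputs1 = Input(shape=(4096,))
    fe1 = Dropout(0.5)(inputs1)
    fe2 = Dense(256, activation="relu")(fe1)

    # Text branch: partial caption -> embedding -> LSTM
    inputs2 = Input(shape=(max_length,))
    se1 = Embedding(vocab_size, 256, mask_zero=True)(inputs2)
    se2 = Dropout(0.5)(se1)
    se3 = LSTM(256)(se2)

    # Merge both branches and predict the next word
    decoder1 = add([fe2, se3])
    decoder2 = Dense(256, activation="relu")(decoder1)
    outputs = Dense(vocab_size, activation="softmax")(decoder2)

    model = Model(inputs=[inputs1, inputs2], outputs=outputs)
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model
```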


Predictions by the model
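
Captions are generated one word at a time. Below is a minimal greedy-decoding sketch; `startseq` and `endseq` are assumed start/end tokens, and `tokenizer` is assumed to be the Keras tokenizer fitted on the cleaned captions.

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

def generate_caption(model, tokenizer, photo_feature, max_length):
    """Greedy decoding: repeatedly predict the most likely next word."""
    in_text = "startseq"
    for _ in range(max_length):
        seq = tokenizer.texts_to_sequences([in_text])[0]
        seq = pad_sequences([seq], maxlen=max_length)
        yhat = model.predict([photo_feature, seq], verbose=0)
        word = tokenizer.index_word.get(int(np.argmax(yhat)))
        if word is None or word == "endseq":
            break
        in_text += " " + word
    return in_text.replace("startseq", "").strip()
```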


Evaluation using BLEU score
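
A sketch of the evaluation, assuming tokenized reference captions (five per image) and tokenized generated captions, using NLTK's `corpus_bleu`:

```python
from nltk.translate.bleu_score import corpus_bleu

def evaluate_bleu(references, candidates):
    # references: list of lists of tokenized ground-truth captions (5 per image)
    # candidates: list of tokenized generated captions, one per image
    print("BLEU-1: %f" % corpus_bleu(references, candidates, weights=(1.0, 0, 0, 0)))
    print("BLEU-2: %f" % corpus_bleu(references, candidates, weights=(0.5, 0.5, 0, 0)))
    print("BLEU-3: %f" % corpus_bleu(references, candidates, weights=(0.33, 0.33, 0.33, 0)))
    print("BLEU-4: %f" % corpus_bleu(references, candidates, weights=(0.25, 0.25, 0.25, 0.25)))
```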

Good Captions


Bad Captions

Conclusion

The model successfully generates captions for images. Its performance can be further improved through hyperparameter tuning.