
Use a CNN+LSTM-based model architecture to generate captions that aptly describe images.

Image captioning has the following applications:

  • Self-driving cars : caption the scene around the car
  • Aid to the blind : guide them while travelling on roads
  • CCTV cameras : raise an alarm on malicious activities
  • Google Image Search

Dataset Used:

This project uses the Flickr8k dataset (Flickr8K on Kaggle). The dataset contains 8,000 images, each with 5 captions. Along with the images, it includes some text files related to them. One of these files is “Flickr8k.token.txt”, which contains the name of each image along with its 5 captions.
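
As a rough illustration of how the caption file could be read, here is a minimal sketch that groups captions by image name. It assumes each line of “Flickr8k.token.txt” has the form `<image_name>#<caption_index>`, then a tab, then the caption text; check this against your copy of the file. The function names `parse_caption_lines` and `load_captions`, and the sample lines, are hypothetical and used only for illustration.

```python
from collections import defaultdict


def parse_caption_lines(lines):
    """Group captions by image name.

    Assumption: each line is '<image_name>#<index>', a tab, then the caption.
    Lines that are empty or contain no tab are skipped.
    """
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or "\t" not in line:
            continue
        image_id, caption = line.split("\t", 1)
        # Drop the '#<index>' suffix so all captions share one key per image.
        image_name = image_id.split("#", 1)[0]
        captions[image_name].append(caption.strip())
    return dict(captions)


def load_captions(path="Flickr8k.token.txt"):
    """Read the caption file from disk (the path is an assumption)."""
    with open(path, encoding="utf-8") as f:
        return parse_caption_lines(f)


# Made-up sample lines in the assumed format, not taken from the dataset.
sample = [
    "example_1.jpg#0\tA dog runs across the grass .",
    "example_1.jpg#1\tA brown dog is running outside .",
    "example_2.jpg#0\tTwo children play on a beach .",
]
sample_captions = parse_caption_lines(sample)
```

With the real file, `load_captions()` should return a dictionary with one entry per image, each holding a list of 5 captions, if the assumed line format holds.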
