/ImageCaptioning

About The Project

This is a deep learning model for image captioning: given an image, it returns a single-line description of that image.

Built With

  • Encoder-decoder architecture
  • Transfer learning
  • Beam search
  • Flickr8k dataset, chosen because its smaller size made it more practical to work with than the COCO dataset
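
To illustrate the beam search step listed above, here is a minimal sketch in plain Python. It is not the notebook's code: the function `beam_search`, the callback `next_log_probs`, and the toy probability table are all hypothetical names made up for this example. In the real model, the callback would presumably wrap the decoder's next-word prediction for a given image and partial caption.

```python
import math


def beam_search(next_log_probs, start_token, end_token, beam_width=3, max_len=20):
    """Return (sequence, log_probability) of the best caption found.

    next_log_probs(seq) is a hypothetical callback that returns a dict
    {token: log_probability} for the token following `seq`.
    """
    beams = [([start_token], 0.0)]  # (sequence, cumulative log-probability)
    finished = []

    for _ in range(max_len):
        # Expand every live beam by every possible next token.
        candidates = []
        for seq, score in beams:
            for token, logp in next_log_probs(seq).items():
                candidates.append((seq + [token], score + logp))
        if not candidates:
            break

        # Keep only the beam_width highest-scoring candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == end_token:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break

    # Beams still unfinished at max_len also count as results.
    finished.extend(beams)
    return max(finished, key=lambda c: c[1])


# Toy stand-in for a trained decoder (assumption, for illustration only).
TOY_MODEL = {
    ("<start>",): {"a": 0.6, "the": 0.4},
    ("<start>", "a"): {"dog": 0.5, "cat": 0.5},
    ("<start>", "the"): {"dog": 0.9, "cat": 0.1},
}


def toy_next_log_probs(seq):
    # Any prefix missing from the table ends the caption with certainty.
    probs = TOY_MODEL.get(tuple(seq), {"<end>": 1.0})
    return {token: math.log(p) for token, p in probs.items()}


if __name__ == "__main__":
    for width in (1, 2):
        seq, score = beam_search(toy_next_log_probs, "<start>", "<end>", beam_width=width)
        print(width, " ".join(seq), round(math.exp(score), 2))
```

With `beam_width=1` the search is greedy and commits to "a" because it is the likelier first word, ending with "a dog" at probability 0.3. With `beam_width=2` it also keeps "the" alive and finds "the dog" at probability 0.36, which is why a wider beam can give better captions at the cost of more decoder calls.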

The model reaches a loss of 4.8987, which gives passable results. Everything is implemented in a Jupyter notebook, which should make the code easier to follow.

Dependencies

  • Keras 1.2.2
  • TensorFlow 0.12.1
  • tqdm
  • numpy
  • pandas
  • matplotlib
  • pickle
  • PIL
  • glob