Image_Captioning_CV

Image captioning: different approaches using machine learning, deep learning, computer vision, and NLP.

Our implemented models for "Image Captioning":

  1. "Old_Model.ipynb": our first trial model. We could not obtain results from it because of its high RAM requirements.
  2. "Implemented_Model.ipynb": VGG16 + GRU, trained on the Flickr8k dataset. Training was fast, but the loss stayed high, so the results were only barely good.
  3. "Final_Model.ipynb": ResNet50 + LSTM + Dropout, our own new model. It was trained for 5 epochs and gave good results (refer to the report).
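
To give a rough idea of how a ResNet50 + LSTM + Dropout captioning model can be wired together, here is a minimal Keras sketch. It is not the code from "Final_Model.ipynb": the merge-style layout, the function name `build_caption_model`, the layer sizes, and the dropout rate of 0.5 are all assumptions chosen for illustration. The only fixed detail is that ResNet50 with global average pooling produces a 2048-dimensional feature vector per image.

```python
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model


def build_caption_model(vocab_size, max_len, feat_dim=2048, units=256):
    """Hypothetical merge model: image features + partial caption -> next word."""
    # Image branch: precomputed ResNet50 features (2048-d with average pooling).
    img_in = Input(shape=(feat_dim,))
    img = Dropout(0.5)(img_in)
    img = Dense(units, activation="relu")(img)

    # Text branch: the caption so far, as padded word indices (0 = padding).
    txt_in = Input(shape=(max_len,))
    txt = Embedding(vocab_size, units, mask_zero=True)(txt_in)
    txt = Dropout(0.5)(txt)
    txt = LSTM(units)(txt)

    # Merge both branches and predict a distribution over the vocabulary.
    merged = add([img, txt])
    merged = Dense(units, activation="relu")(merged)
    out = Dense(vocab_size, activation="softmax")(merged)

    model = Model(inputs=[img_in, txt_in], outputs=out)
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model


# Example with made-up sizes: 5000-word vocabulary, captions up to 34 tokens.
model = build_caption_model(vocab_size=5000, max_len=34)
```

Extracting the image features once and feeding them in as vectors keeps the CNN out of the training loop, which is one common way to reduce memory pressure.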

Links for the Datasets:

  1. https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_Dataset.zip
  2. https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_text.zip
  3. https://cocodataset.org/#home
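
After downloading the Flickr8k text archive, the captions need to be grouped by image before training. The sketch below assumes the token file format `image.jpg#index<TAB>caption` (as in `Flickr8k.token.txt`); the function name `parse_captions`, the sample file names, and the lowercasing step are illustrative choices, not taken from the notebooks.

```python
from collections import defaultdict


def parse_captions(lines):
    """Group captions by image file name from 'name.jpg#idx<TAB>caption' lines."""
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        image_tag, _, caption = line.partition("\t")
        if not caption:
            continue  # skip lines without a tab-separated caption
        image_name = image_tag.split("#")[0]  # drop the '#idx' suffix
        captions[image_name].append(caption.strip().lower())
    return dict(captions)


# Hypothetical sample lines in the assumed format.
sample = [
    "img_a.jpg#0\tA dog runs on the grass .",
    "img_a.jpg#1\tA brown dog is running .",
    "img_b.jpg#0\tTwo children play in the snow .",
]
by_image = parse_captions(sample)
```

With a real file, the same function can be called on an open file handle, for example `parse_captions(open(path, encoding="utf-8"))`, where `path` points to the extracted token file.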

Video Link: https://drive.google.com/drive/folders/1R3R6HiZGtAdhPfkxi2nkdE_T8JfhcIs7?usp=sharing