Image_Captioning_CV
Image captioning: different approaches using machine learning, deep learning, computer vision, and NLP.
Our implemented models for image captioning:
- "Old_Model.ipynb": our first trial model, which we could not run to completion due to its high RAM requirements.
- "Implemented_Model.ipynb": VGG16 + GRU. Trained quickly on the Flickr8k dataset, but the loss stayed high and the results were barely acceptable.
- "Final_Model.ipynb": ResNet50 + LSTM + Dropout, our own model, trained for 5 epochs with good results (refer to the report).
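The final model's general shape can be sketched as a standard "merge" captioning architecture: pre-extracted ResNet50 image features and an LSTM-encoded caption prefix are combined to predict the next word. This is only a minimal illustration of that pattern in Keras, not the exact notebook code; the layer sizes, `vocab_size`, and `max_len` below are assumptions.

```python
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model

vocab_size = 5000  # assumption: size of the tokenizer vocabulary
max_len = 34       # assumption: longest caption length in tokens

# Image branch: 2048-d pooled ResNet50 features, extracted offline.
img_in = Input(shape=(2048,))
x1 = Dropout(0.5)(img_in)
x1 = Dense(256, activation="relu")(x1)

# Text branch: caption prefix encoded by an LSTM.
txt_in = Input(shape=(max_len,))
x2 = Embedding(vocab_size, 256, mask_zero=True)(txt_in)
x2 = Dropout(0.5)(x2)
x2 = LSTM(256)(x2)

# Merge the two branches and predict the next word.
merged = add([x1, x2])
out = Dense(256, activation="relu")(merged)
out = Dense(vocab_size, activation="softmax")(out)

model = Model(inputs=[img_in, txt_in], outputs=out)
model.compile(loss="categorical_crossentropy", optimizer="adam")
```

Training then iterates over (image features, caption prefix, next word) triples; at inference time, captions are generated word by word starting from a start token.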
Links to the datasets:
- https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_Dataset.zip
- https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_text.zip
- https://cocodataset.org/#home
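The Flickr8k_text.zip archive includes `Flickr8k.token.txt`, where each line pairs an image filename (with a `#n` caption index) and one of its five captions, separated by a tab. A minimal sketch of parsing that format into a dictionary of captions per image (the sample lines are illustrative, not actual dataset entries):

```python
# Illustrative lines in the "<image>.jpg#<n>\t<caption>" format
# used by Flickr8k.token.txt (sample captions are made up).
sample = (
    "1000268201_693b08cb0e.jpg#0\tA child in a pink dress climbs stairs .\n"
    "1000268201_693b08cb0e.jpg#1\tA girl goes into a wooden building .\n"
)

def load_captions(text):
    """Map each image filename to its list of lowercased captions."""
    captions = {}
    for line in text.strip().split("\n"):
        image_id, caption = line.split("\t", 1)
        image_name = image_id.split("#")[0]  # drop the "#n" caption index
        captions.setdefault(image_name, []).append(caption.lower().strip())
    return captions

caps = load_captions(sample)
```

In the notebooks, the same parsing step would be applied to the full file read from disk before tokenization.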
Video Link: https://drive.google.com/drive/folders/1R3R6HiZGtAdhPfkxi2nkdE_T8JfhcIs7?usp=sharing