The aim of this project is to generate captions for images.
Every image has a story; Image Captioning narrates it.
This model is based on Show and Tell: A Neural Image Caption Generator.
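The Show and Tell approach encodes the image with a CNN and then has a decoder (an LSTM in the paper) emit one word at a time until an end token appears. A minimal sketch of that greedy decoding loop, with a toy scoring function standing in for the trained LSTM (`toy_next_word_scores` is a hypothetical stand-in, not part of this repository):

```python
# Greedy decoding loop in the style of Show and Tell:
# a CNN would encode the image into `features`, and the decoder
# emits one word at a time until it produces an end token.

START, END = "<start>", "<end>"

def toy_next_word_scores(features, partial_caption):
    # Stand-in for the trained LSTM: returns a word -> score dict.
    # A real decoder would condition on `features` and the words so far.
    canned = [START, "a", "dog", "runs", END]
    pos = len(partial_caption)
    return {w: (1.0 if i == pos else 0.0) for i, w in enumerate(canned)}

def greedy_caption(features, max_len=20):
    caption = [START]
    while len(caption) < max_len:
        scores = toy_next_word_scores(features, caption)
        word = max(scores, key=scores.get)  # pick the highest-scoring word
        caption.append(word)
        if word == END:
            break
    return caption[1:-1]  # strip the start and end tokens

print(" ".join(greedy_caption(features=None)))  # a dog runs
```

The paper also describes beam search, which keeps the top-k partial captions at each step instead of only the single best word.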
Install the requirements:

```shell
pip3 install -r requirements.txt
```
Running the Model

```shell
python3 model.py
```
The results are promising: many test captions are surprisingly realistic, though the model still needs more training.
This project is an implementation of Show and Tell, published in 2015.
- The dataset used is Flickr8k; each image has 5 captions.
- You can request the data here: [Flickr8k](https://forms.illinois.edu/sec/1713398).
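Since each image carries 5 reference captions, a preprocessing step typically groups them by image. A hedged sketch, assuming the common Flickr8k caption layout of one `<image>.jpg#<n>\t<caption>` pair per line (the exact file layout may differ in your download):

```python
# Group caption lines into a dict of image name -> list of captions.
# Assumes each line looks like: "<image>.jpg#<n>\t<caption>".
from collections import defaultdict

def load_captions(lines):
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        image_id, caption = line.split("\t", 1)
        image_name = image_id.split("#")[0]  # drop the "#n" caption index
        captions[image_name].append(caption)
    return dict(captions)

sample = [
    "img1.jpg#0\ta dog runs on the grass .",
    "img1.jpg#1\ta brown dog is running outside .",
]
caps = load_captions(sample)
print(len(caps["img1.jpg"]))  # 2
```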
Sample of the data used
Future Work
- Training, training, and more training
- Using ResNet instead of VGG16
- Creating an API for production
- Using Word2Vec embeddings
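The planned Word2Vec step usually means building an embedding matrix for the caption vocabulary from pretrained vectors. A minimal sketch, where `pretrained` is a toy stand-in for vectors you would load with a library such as gensim (names and dimensions here are illustrative assumptions):

```python
# Build one embedding row per vocabulary word from pretrained vectors,
# falling back to a zero vector for out-of-vocabulary words.

def build_embedding_matrix(vocab, pretrained, dim):
    matrix = []
    for word in vocab:
        vec = pretrained.get(word)
        matrix.append(vec if vec is not None else [0.0] * dim)
    return matrix

vocab = ["a", "dog", "runs"]
pretrained = {"dog": [0.1, 0.2], "runs": [0.3, 0.4]}  # toy 2-d vectors
matrix = build_embedding_matrix(vocab, pretrained, dim=2)
print(matrix[0])  # [0.0, 0.0] -- "a" is out of vocabulary here
```

In a real setup the matrix would initialize the decoder's embedding layer, optionally frozen during training.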