The aim of this project is to generate captions for images.
Every image has a story; Image Captioning narrates it.
This model is based on Show and Tell: A Neural Image Caption Generator.
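The Show and Tell approach encodes the image with a CNN and then has a decoder (an LSTM in the paper) emit one word at a time until an end token appears. A minimal sketch of that greedy decoding loop, with a toy scoring function standing in for the trained LSTM (`toy_next_word_scores` is a hypothetical stand-in, not part of this repository):

```python
# Greedy decoding loop in the style of Show and Tell:
# a CNN would encode the image into `features`, and the decoder
# emits one word at a time until it produces an end token.

START, END = "<start>", "<end>"

def toy_next_word_scores(features, partial_caption):
    # Stand-in for the trained LSTM: returns a word -> score dict.
    # A real decoder would condition on `features` and the words so far.
    canned = [START, "a", "dog", "runs", END]
    pos = len(partial_caption)
    return {w: (1.0 if i == pos else 0.0) for i, w in enumerate(canned)}

def greedy_caption(features, max_len=20):
    caption = [START]
    while len(caption) < max_len:
        scores = toy_next_word_scores(features, caption)
        word = max(scores, key=scores.get)  # pick the highest-scoring word
        caption.append(word)
        if word == END:
            break
    return caption[1:-1]  # strip the start and end tokens

print(" ".join(greedy_caption(features=None)))  # a dog runs
```

The paper also describes beam search, which keeps the top-k partial captions at each step instead of only the single best word.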
Install the requirements:

```shell
pip3 install -r requirements.txt
```
Running the Model

```shell
python3 model.py
```
The results are promising: many test captions are surprisingly realistic, though the model still needs more training.
This project is an implementation of Show and Tell, published in 2015.
- The dataset used is Flickr8k; each image has 5 captions.
- You can request the data here: [Flickr8k](https://forms.illinois.edu/sec/1713398).
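Since each image carries 5 reference captions, a preprocessing step typically groups them by image. A hedged sketch, assuming the common Flickr8k caption layout of one `<image>.jpg#<n>\t<caption>` pair per line (the exact file layout may differ in your download):

```python
# Group caption lines into a dict of image name -> list of captions.
# Assumes each line looks like: "<image>.jpg#<n>\t<caption>".
from collections import defaultdict

def load_captions(lines):
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        image_id, caption = line.split("\t", 1)
        image_name = image_id.split("#")[0]  # drop the "#n" caption index
        captions[image_name].append(caption)
    return dict(captions)

sample = [
    "img1.jpg#0\ta dog runs on the grass .",
    "img1.jpg#1\ta brown dog is running outside .",
]
caps = load_captions(sample)
print(len(caps["img1.jpg"]))  # 2
```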
Sample of the data used
Future Work
- Training, training, and more training
- Using ResNet instead of VGG16
- Creating an API for production
- Using Word2Vec embeddings
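The planned Word2Vec step usually means building an embedding matrix for the caption vocabulary from pretrained vectors. A minimal sketch, where `pretrained` is a toy stand-in for vectors you would load with a library such as gensim (names and dimensions here are illustrative assumptions):

```python
# Build one embedding row per vocabulary word from pretrained vectors,
# falling back to a zero vector for out-of-vocabulary words.

def build_embedding_matrix(vocab, pretrained, dim):
    matrix = []
    for word in vocab:
        vec = pretrained.get(word)
        matrix.append(vec if vec is not None else [0.0] * dim)
    return matrix

vocab = ["a", "dog", "runs"]
pretrained = {"dog": [0.1, 0.2], "runs": [0.3, 0.4]}  # toy 2-d vectors
matrix = build_embedding_matrix(vocab, pretrained, dim=2)
print(matrix[0])  # [0.0, 0.0] -- "a" is out of vocabulary here
```

In a real setup the matrix would initialize the decoder's embedding layer, optionally frozen during training.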