This is a Flask web application that generates captions for uploaded images using a pre-trained deep learning model. The model uses a convolutional neural network (CNN) to extract features from the image, and then uses a recurrent neural network (RNN) with long short-term memory (LSTM) cells to generate the caption.
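The word-by-word generation step can be illustrated with a minimal greedy decoding sketch. This is not the repository's code: the `startseq`/`endseq` marker tokens, the `greedy_caption` function, and the `toy_model` lookup table are all hypothetical stand-ins. In the real application, the CNN image features and the LSTM would supply the next-word probabilities.

```python
from typing import Callable, Dict, List

# Assumed start/end marker tokens; the actual tokens used by the model may differ.
START, END = "startseq", "endseq"


def greedy_caption(
    next_word_probs: Callable[[List[str]], Dict[str, float]],
    max_length: int = 20,
) -> str:
    """Build a caption one word at a time, always taking the most likely next word."""
    words = [START]
    for _ in range(max_length):
        probs = next_word_probs(words)
        best = max(probs, key=probs.get)
        if best == END:
            break
        words.append(best)
    # Drop the start marker before returning the caption.
    return " ".join(words[1:])


def toy_model(words: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for the CNN + LSTM: a fixed lookup on the previous word."""
    table = {
        "startseq": {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.8, "endseq": 0.2},
        "dog": {"runs": 0.6, "endseq": 0.4},
        "runs": {"endseq": 0.95, "a": 0.05},
    }
    return table[words[-1]]


print(greedy_caption(toy_model))  # prints "a dog runs"
```

Greedy decoding is only one possible strategy; beam search is a common alternative, and the source does not say which one the application uses.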
The pre-trained model was trained on the Flickr8k dataset, which consists of 8,000 images with 5 captions each. The model was trained for 30 epochs, achieving a validation loss of 2.6.
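For a rough sense of scale, the figures above can be worked through in a few lines. The perplexity conversion is an assumption: it only holds if the reported loss is a mean per-word categorical cross-entropy measured in nats, which the text does not state.

```python
import math

num_images = 8_000
captions_per_image = 5
total_captions = num_images * captions_per_image

val_loss = 2.6
# Valid only under the assumption that the loss is mean per-word cross-entropy in nats.
perplexity = math.exp(val_loss)

print(total_captions)        # 40000
print(round(perplexity, 2))  # 13.46
```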
- Python 3.x
- Flask
- TensorFlow
- Keras
- NumPy
- Pillow (PIL)
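One way to confirm that these packages are available is a small standard-library check like the sketch below. The `missing_modules` helper is hypothetical and not part of this repository. Note that Pillow is imported under the name `PIL`.

```python
import importlib.util

# Import names for the packages listed above; Pillow is imported as PIL.
REQUIRED_MODULES = ["flask", "tensorflow", "keras", "numpy", "PIL"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names from `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```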
- Clone this repository or download the files.
- Install the required packages using pip:
pip install flask tensorflow keras numpy pillow
- Download the pre-trained model from this Google Drive link and save it in the models folder.
- Run the application with the following command:
python app.py
- Open a web browser and navigate to http://localhost:5000
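Before starting the application, it can help to confirm that the downloaded model actually landed in the models folder. The sketch below is a hypothetical pre-flight check, not part of this repository. The model's file name is not given here, so it simply looks for files with the `.h5` or `.keras` extension, which is an assumption about how the model was saved.

```python
from pathlib import Path


def find_model_files(models_dir="models", suffixes=(".h5", ".keras")):
    """List files in models_dir whose extension looks like a saved Keras model."""
    folder = Path(models_dir)
    if not folder.is_dir():
        return []
    # The suffixes are assumed; adjust them to match the downloaded file.
    return sorted(p for p in folder.iterdir() if p.is_file() and p.suffix in suffixes)


if __name__ == "__main__":
    found = find_model_files()
    if found:
        print("Found model file(s):", ", ".join(p.name for p in found))
    else:
        print("No model file found in models/; download it before running app.py")
```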
A live demo of this application is available at https://image-caption-generator-app.herokuapp.com/.
This code is released under the MIT License. See the LICENSE file for more information.
This application was inspired by the following resources:
- Show and Tell: A Neural Image Caption Generator by Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan.
- Image Caption Generator with Keras by Jason Brownlee.
- Flask Web Development with Python Tutorial - Full Course by Tech With Tim.