Image Caption Generator

Description

In this project, we take a systematic approach to building a deep learning model that generates captions for images. An image is often said to be worth a thousand words because of the information it carries, but in many situations a photo alone falls short of expressing and communicating that information; a precise, relevant written description alongside the photo can add meaning the image cannot convey by itself. This observation motivated us to build a mechanism for generating captions for images and photos: a deep learning model, built from neural networks, that produces a caption for any image fed into it.

The project consists of two major parts. The first is the deep learning part, which handles understanding of the input image; it also covers preprocessing the dataset, cleaning it, and preparing it to be fed to the training model. The second is the language understanding and processing part, which applies natural language processing so that the model relates a meaningful description to each image rather than pairing random text with a random image. We trained the model on the Flickr8K dataset, which consists of 8,000 captioned images. The trained model outputs a generated caption for each input image.
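
The sketch below illustrates the kind of encoder-decoder architecture this two-part description implies, assuming a Keras/TensorFlow implementation: a pretrained CNN (VGG16 here, as an assumption) encodes the image into a feature vector, and an LSTM-based decoder merges those features with the partial caption to predict the next word. The layer sizes and the vocab_size and max_length values are illustrative placeholders, not the repository's actual configuration.

    from tensorflow.keras.applications.vgg16 import VGG16
    from tensorflow.keras.models import Model
    from tensorflow.keras.layers import Input, Dense, LSTM, Embedding, Dropout, add

    vocab_size = 7579   # placeholder: size of the cleaned caption vocabulary
    max_length = 34     # placeholder: longest caption after cleaning

    # Part 1: image understanding. VGG16 without its classifier head
    # yields a 4096-dimensional feature vector per image.
    cnn = VGG16()
    cnn = Model(inputs=cnn.inputs, outputs=cnn.layers[-2].output)

    # Part 2: language model. Embed the partial caption, encode it with
    # an LSTM, and merge it with the image features.
    inputs1 = Input(shape=(4096,))
    fe1 = Dropout(0.5)(inputs1)
    fe2 = Dense(256, activation='relu')(fe1)

    inputs2 = Input(shape=(max_length,))
    se1 = Embedding(vocab_size, 256, mask_zero=True)(inputs2)
    se2 = Dropout(0.5)(se1)
    se3 = LSTM(256)(se2)

    # Decoder: combine both streams and predict the next word.
    decoder1 = add([fe2, se3])
    decoder2 = Dense(256, activation='relu')(decoder1)
    outputs = Dense(vocab_size, activation='softmax')(decoder2)

    model = Model(inputs=[inputs1, inputs2], outputs=outputs)
    model.compile(loss='categorical_crossentropy', optimizer='adam')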

Installation
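
The repository does not pin exact dependency versions. As a sketch, the notebooks assume a Python 3 environment with TensorFlow/Keras, NumPy, Pillow, and Jupyter; this package list is an assumption inferred from the .h5 weight files and .ipynb notebooks, not a confirmed requirements file:

    pip install tensorflow numpy pillow jupyter

The Flickr8K dataset must be downloaded separately and placed where the training notebook expects it.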

Training the model

To train the model, run the Feature_extraction_and_training file. The trained weights are stored in DL_model.h5 and in the model-ep017-loss2.641-val_loss4.644 checkpoint file.
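
The checkpoint's name matches the pattern produced by a Keras ModelCheckpoint callback. The snippet below is a hedged reconstruction of how such a file would be written during training; the filepath template is an assumption inferred from the file name:

    from tensorflow.keras.callbacks import ModelCheckpoint

    # Filepath template inferred from the saved checkpoint's name (assumption).
    filepath = 'model-ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}'
    checkpoint = ModelCheckpoint(filepath, monitor='val_loss',
                                 save_best_only=True, mode='min', verbose=1)
    # Passing callbacks=[checkpoint] to model.fit() then saves a checkpoint
    # only for epochs that improve validation loss.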

Model Evaluation and Caption Generation

To evaluate the model and generate captions, run the Caption_generation.ipynb file. The required weight and pickle files are included in the repository.
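
At inference time, caption generation for this kind of model is typically a greedy word-by-word loop. The sketch below assumes the common Flickr8K convention of startseq/endseq boundary tokens and a pickled Keras tokenizer; the tokenizer file name, token names, and max_length value are assumptions, not confirmed repository details:

    from pickle import load
    from numpy import argmax
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    from tensorflow.keras.models import load_model

    tokenizer = load(open('tokenizer.pkl', 'rb'))  # assumed pickle file name
    model = load_model('DL_model.h5')
    max_length = 34                                # placeholder value

    def generate_caption(model, tokenizer, photo_features, max_length):
        # photo_features: a (1, 4096) CNN feature vector from the
        # feature-extraction step.
        text = 'startseq'
        for _ in range(max_length):
            seq = tokenizer.texts_to_sequences([text])[0]
            seq = pad_sequences([seq], maxlen=max_length)
            yhat = argmax(model.predict([photo_features, seq], verbose=0))
            word = tokenizer.index_word.get(int(yhat))
            if word is None or word == 'endseq':
                break
            text += ' ' + word
        return text.replace('startseq ', '')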

Output

(sample output image)