/image-captioning

Comparative analysis of image captioning models built on RNN, BiLSTM, and Transformer architectures, trained on the Flickr8K dataset with InceptionV3 used for image feature extraction.

Primary language: Python

Comparative analysis of RNNs, BiLSTMs, and Transformers in image captioning.

This repository contains the code and documentation for a team project completed for the course CS5100 at Northeastern University.
Title: Pictophrases - Teaching computers to describe images.
Members: Darshan Rajopadhye, Kevin Heleodoro, Poornima Jaykumar Dharamdasani, Saumya Gupta

Code Base:

We developed three custom model architectures to compare how each type of model performs on the task of image captioning.
The RNN, BiLSTM, and Transformer folders in the repository each contain the source code for training, testing, and evaluating the corresponding model.
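To illustrate what a side-by-side comparison of the three models could look like, here is a minimal, framework-free sketch. It is not the project's evaluation code: the function names (`unigram_precision`, `compare_models`), the captioner callables, and the choice of clipped unigram precision as a score are all assumptions made for this example, and the actual metrics used are described in the project report.

```python
from collections import Counter
from typing import Callable, Dict, List


def unigram_precision(candidate: str, references: List[str]) -> float:
    """Clipped unigram precision of a candidate caption against its references.

    Illustrative stand-in for a captioning metric (hypothetical helper).
    """
    cand_counts = Counter(candidate.lower().split())
    if not cand_counts:
        return 0.0
    # For each word, keep the highest count seen in any single reference.
    max_ref_counts: Counter = Counter()
    for ref in references:
        for word, count in Counter(ref.lower().split()).items():
            max_ref_counts[word] = max(max_ref_counts[word], count)
    # A candidate word only gets credit up to its reference count.
    matched = sum(min(count, max_ref_counts[word])
                  for word, count in cand_counts.items())
    return matched / sum(cand_counts.values())


def compare_models(
    captioners: Dict[str, Callable[[str], str]],
    dataset: Dict[str, List[str]],
) -> Dict[str, float]:
    """Average the score of each captioner over the same set of images.

    `captioners` maps a model name to a function from image id to caption;
    `dataset` maps an image id to its reference captions (both hypothetical).
    """
    scores = {}
    for name, caption_fn in captioners.items():
        per_image = [unigram_precision(caption_fn(image_id), refs)
                     for image_id, refs in dataset.items()]
        scores[name] = sum(per_image) / len(per_image) if per_image else 0.0
    return scores
```

In practice, each entry in `captioners` would wrap one of the trained models (RNN, BiLSTM, or Transformer), so that all three are scored on identical images and reference captions.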

Documentation:

The documentation folder contains the Proposal, Abstract, Presentation, and final project Report submitted in fulfilment of the project.

Contributions

Contributions to this project are welcome. Please feel free to fork the repository, make your changes, and create a pull request.