/Automated-Image-Captioning-with-Visual-Attention

A Python implementation of automated image captioning with visual attention. Captioning uses an encoder-decoder architecture in which the encoder is a deep CNN and the decoder is an RNN. At each time step, the RNN applies visual attention over the image features, in the manner of the attention mechanisms used in neural machine translation systems, to produce realistic captions. This work is inspired by the paper "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention" by Xu et al.
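To make the attention step concrete, here is a minimal sketch of one soft-attention computation of the kind the decoder could perform at each time step. It is not taken from this repository: the function names (`soft_attention`, `vec_mat`), the additive scoring form, the weight matrices `W_f`, `W_h`, `v`, and the toy dimensions are all assumptions chosen for illustration, and it uses only the standard library so it runs without a deep learning framework.

```python
import math
import random


def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def vec_mat(x, W):
    """Return x @ W for a vector x (length n) and a matrix W (n rows, m columns)."""
    return [dot(x, col) for col in zip(*W)]


def soft_attention(features, hidden, W_f, W_h, v):
    """Hypothetical additive soft attention.

    features: list of L image-region vectors (each of length D) from the CNN encoder
    hidden:   decoder RNN state (length H) at the current time step
    Returns the context vector (length D) and the attention weights (length L).
    """
    h_proj = vec_mat(hidden, W_h)
    scores = []
    for a in features:
        a_proj = vec_mat(a, W_f)
        # Score for this region: v . tanh(W_f a + W_h h)
        scores.append(dot(v, [math.tanh(p + q) for p, q in zip(a_proj, h_proj)]))
    alpha = softmax(scores)
    # Context vector: attention-weighted sum of the region features.
    depth = len(features[0])
    context = [sum(w * a[d] for w, a in zip(alpha, features)) for d in range(depth)]
    return context, alpha


# Toy dimensions (assumed): 4 regions, 3 feature channels, hidden size 2, attention size 5.
random.seed(0)
L, D, H, A = 4, 3, 2, 5
features = [[random.gauss(0, 1) for _ in range(D)] for _ in range(L)]
hidden = [random.gauss(0, 1) for _ in range(H)]
W_f = [[random.gauss(0, 0.1) for _ in range(A)] for _ in range(D)]
W_h = [[random.gauss(0, 0.1) for _ in range(A)] for _ in range(H)]
v = [random.gauss(0, 1) for _ in range(A)]

context, alpha = soft_attention(features, hidden, W_f, W_h, v)
```

In a full model, the context vector would be fed to the RNN together with the previous word embedding, and the weights `alpha` could be visualized to show which image regions the decoder attended to for each generated word.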

Primary Language: Python
