Model based on Kelvin Xu et al., Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (https://arxiv.org/pdf/1502.03044.pdf)

DSA-Model

Model based on the soft attention variant of Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. The MS-COCO model is based on coldmanck's implementation; the Flickr8k and Flickr30k models are based on fuqichen's implementation.
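The soft attention step from the paper can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used in this repository: the function names (`soft_attention`, `matvec`, `softmax`) and the weight arguments (`W_a`, `W_h`, `v`) are hypothetical. It scores each image annotation vector against the previous LSTM hidden state, normalizes the scores with a softmax, and returns the weighted average of the annotation vectors as the context vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def soft_attention(features, hidden, W_a, W_h, v):
    """Soft attention over image annotation vectors (illustrative only).

    features: L annotation vectors of size D (e.g. CNN feature map locations)
    hidden:   previous LSTM hidden state of size H
    W_a (K x D), W_h (K x H), v (K): hypothetical attention weights
    Returns (context, alpha): the context vector and the attention weights.
    """
    h_proj = matvec(W_h, hidden)
    scores = []
    for a in features:
        a_proj = matvec(W_a, a)
        # e_i = v . tanh(W_a a_i + W_h h)
        activation = [math.tanh(x + y) for x, y in zip(a_proj, h_proj)]
        scores.append(sum(vi * ai for vi, ai in zip(v, activation)))
    alpha = softmax(scores)
    dim = len(features[0])
    # Context vector: expectation of the annotation vectors under alpha.
    context = [
        sum(alpha[i] * features[i][d] for i in range(len(features)))
        for d in range(dim)
    ]
    return context, alpha


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    ctx, weights = soft_attention(
        feats, [0.5], W_a=[[1.0, 0.0], [0.0, 1.0]], W_h=[[1.0], [1.0]], v=[1.0, 1.0]
    )
    print(weights, ctx)
```

In the paper's setup the annotation vectors come from a convolutional feature map (196 locations of 512 channels for the VGG network), and the context vector is fed to the LSTM at each decoding step.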

  • CNN Layer Model: VGG16 (default)
  • RNN Layer Model: LSTM (default)
  • Datasets: MS-COCO, Flickr8k & Flickr30k
  • Scoring: BLEU_1, BLEU_2, BLEU_3, BLEU_4, METEOR, ROUGE_L, CIDEr
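
As a rough illustration of what the BLEU_1 to BLEU_4 scores measure, the sketch below computes a sentence-level BLEU from clipped n-gram precisions and a brevity penalty. It is an assumption-laden simplification (no smoothing, whitespace tokens, hypothetical function names `ngrams` and `bleu`), not the evaluation code referenced by this repository, so its numbers will not match the reported scores exactly.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        # For each n-gram, the largest count seen in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: any zero precision gives a zero score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty against the reference closest in length.
    c_len = len(candidate)
    r_len = min((abs(len(ref) - c_len), len(ref)) for ref in references)[1]
    penalty = 1.0 if c_len > r_len else math.exp(1.0 - r_len / c_len)
    return penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    caption = "a dog runs on the grass".split()
    reference = "a dog is running on the grass".split()
    for n in range(1, 5):
        print(f"BLEU_{n}", bleu(caption, [reference], max_n=n))
```

BLEU_n here means the score computed with n-grams up to length n, which is why the four values typically decrease from BLEU_1 to BLEU_4.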

Requirements

  • DATA zip file
  • See the README.md of each dataset's implementation

Installation

  • See the README.md of each dataset's implementation

References

  • Kelvin Xu, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (https://arxiv.org/pdf/1502.03044.pdf)
  • fuqichen. EECS442 Final Project Winter 2019 (repository)
  • coldmanck. Python 3 Version of Show, Attend and Tell using Tensorflow (repository)
  • Tsung-Yi Lin. Microsoft COCO Caption Evaluation (repository)