Flickr8K-CN is a bilingual (English and Chinese) extension of the popular Flickr8K dataset, used for evaluating image captioning in a cross-lingual setting.

| Chinese sentences | Flickr8k-train | Flickr8k-val | Flickr8k-test |
|---|---|---|---|
| human written | ✅ | ✅ | ✅ |
| human translation | ❌ | ❌ | ✅ |
| machine translation (Baidu) | ✅ | ✅ | ✅ |
| machine translation (Google) | ✅ | ✅ | ✅ |
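
The coverage in the table above can be encoded as a small lookup, for instance to check which caption sources exist for a split before assembling it. This is a minimal sketch: the names `COVERAGE` and `available_sources`, and the source and split keys, are illustrative and not part of the dataset release.

```python
# Hypothetical encoding of the coverage table; the keys are illustrative
# names, not file or directory names from the dataset.
COVERAGE = {
    "human_written": {"train", "val", "test"},
    "human_translation": {"test"},
    "machine_translation_baidu": {"train", "val", "test"},
    "machine_translation_google": {"train", "val", "test"},
}


def available_sources(split):
    """Return the caption sources that cover the given split, sorted by name."""
    if split not in {"train", "val", "test"}:
        raise ValueError(f"unknown split: {split!r}")
    return sorted(name for name, splits in COVERAGE.items() if split in splits)
```

Calling `available_sources("train")` omits the human translations, which the table lists for the test set only.
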
- Original English sentences
- Chinese sentences written by native Chinese speakers
- Chinese sentences generated by Baidu translation (icmr2016 version, version 20160815)
- Chinese sentences generated by Google translation (icmr2016 version, version 20160816)
- Chinese sentences generated by human translation (only the test set is covered)
- Image IDs of the 6K training, 1K validation, and 1K test images
- 1,024-dim GoogLeNet pool5 features, readable with bigfile.py
- Xirong Li, Weiyu Lan, Jianfeng Dong, and Hailong Liu. Adding Chinese Captions to Images. ACM ICMR 2016.
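
For the 1,024-dim pool5 features listed above, the following is a minimal sketch of reading fixed-length float vectors from a flat binary file. It does not reproduce bigfile.py. The layout it assumes (consecutive 4-byte float rows in the same order as a separate list of image IDs) and the names `read_features`, `feature_path`, and `ids` are assumptions for illustration, so check the actual files before relying on it.

```python
import os
from array import array


def read_features(feature_path, ids, dim=1024):
    """Map each image ID to its feature vector.

    Assumes (not confirmed by the dataset documentation) that the file
    holds len(ids) consecutive rows of `dim` 4-byte floats, stored in the
    same order as `ids`.
    """
    values = array("f")  # single-precision floats
    n_values = len(ids) * dim
    if os.path.getsize(feature_path) != n_values * values.itemsize:
        raise ValueError("file size does not match len(ids) * dim")
    with open(feature_path, "rb") as f:
        values.fromfile(f, n_values)
    # Slice the flat buffer into one row per image ID.
    return {
        image_id: values[i * dim:(i + 1) * dim]
        for i, image_id in enumerate(ids)
    }
```

The size check fails early when the ID list and the binary file disagree, which is the most likely symptom of a wrong layout assumption.
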