SemStyle

Code and models for learning to generate stylized image captions from unaligned text.

A full list of SemStyle results on MSCOCO images can be found here.

A PyTorch rewrite of SemStyle is included in "./code".

  • This software is written in Python 2.7. It assumes that you have installed torch, torchvision, and scipy (or have a conda environment containing them).

  • To set up the SemStyle models, go to "./code/models/" and run "download.sh". Then, from "./code", run: python img_to_text.py --test_folder <folder with your test images>

  • Training code is included.

  • Scripts to generate the training data from publicly available sources are coming.
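
Before running img_to_text.py, it can help to confirm that the dependencies listed above are importable in the active environment. The following is a minimal sketch and not part of the SemStyle code: the helper name missing_packages and the REQUIRED list are hypothetical, with the package names taken from the list above. It is written to run on Python 2.7 as well as Python 3.

```python
from __future__ import print_function

import importlib

# Package names taken from the dependency list above.
# REQUIRED and missing_packages are illustrative names, not part of SemStyle.
REQUIRED = ["torch", "torchvision", "scipy"]


def missing_packages(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The package is not installed in the active environment.
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If any packages are reported missing, install them into the same environment that will run the scripts in "./code" before downloading the models.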

Online demo

For a limited time (while CPU cycles last): Live demo, caption your own images

A blog post describing the SemStyle system is here: http://cm.cecs.anu.edu.au/post/semstyle/

Citing this work:

Alexander Mathews, Lexing Xie, Xuming He. SemStyle: Learning to Generate Stylised Image Captions Using Unaligned Text, in Conference on Computer Vision and Pattern Recognition (CVPR ’18), Salt Lake City, USA, 2018. https://arxiv.org/abs/1805.07030

@inproceedings{mathews2018semstyle,
  title={{SemStyle}:  Learning to Generate Stylised Image Captions using Unaligned Text},
  author={Mathews, Alexander and Xie, Lexing and He, Xuming},
  booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2018}
}