An implementation of the CNN-RNN architecture for image caption generation proposed in the Show and Tell paper.
See also the Google AI Blog post about this problem.
git clone https://github.com/mmilunovic/a-picture-is-a-thousand-words.git
cd a-picture-is-a-thousand-words
pip install -r requirements.txt
apply_model_to_image_raw_bytes(open("test-image.jpg", "rb").read())
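For intuition about what a call like this does conceptually, the sketch below shows greedy caption decoding in plain Python. It is a minimal illustration under stated assumptions, not this repository's code: `greedy_caption`, `toy_scores`, and the `#START#`/`#END#` tokens are hypothetical names, and the toy scoring function stands in for the CNN image encoder and RNN decoder.

```python
from typing import Callable, Dict, List

START, END = "#START#", "#END#"  # hypothetical special tokens


def greedy_caption(
    next_word_scores: Callable[[List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption by repeatedly picking the highest-scoring next word."""
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(words)
        best = max(scores, key=scores.get)
        if best == END:
            break  # the decoder signalled the end of the caption
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_scores(prefix: List[str]) -> Dict[str, float]:
    """Stand-in for the decoder; a real model would condition on image features."""
    script = ["a", "dog", "on", "grass", END]
    position = len(prefix) - 1  # number of words generated so far
    target = script[min(position, len(script) - 1)]
    return {word: (1.0 if word == target else 0.0) for word in script}


caption = greedy_caption(toy_scores)
print(caption)
```

Greedy decoding is only one option; sampling or beam search can be swapped in by changing how `best` is chosen.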
To train the model yourself, download the training and validation datasets and place them in the train_data and test_data directories.
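Before starting a training run, a quick check like the one below can confirm that both directories exist and contain files. This is a hedged sketch: `count_dataset_files` is a hypothetical helper that is not part of this repository, and it assumes the two directories sit in the repository root.

```python
from pathlib import Path
from typing import Dict, Iterable


def count_dataset_files(
    root: Path, dirs: Iterable[str] = ("train_data", "test_data")
) -> Dict[str, int]:
    """Return the number of files in each dataset directory, or -1 if it is missing."""
    counts = {}
    for name in dirs:
        folder = root / name
        if folder.is_dir():
            counts[name] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            counts[name] = -1  # directory does not exist
    return counts


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for name, count in count_dataset_files(Path(".")).items():
        status = "missing" if count < 0 else f"{count} file(s)"
        print(f"{name}: {status}")
```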
- Show And Tell Paper - Original paper
- Advanced Machine Learning Course - Final project for this course