CCA-images-text

Canonical Correlation Analysis for joint representations of images and text based on [1].

Finds a common representation space for images and tags. Uses the MS COCO dataset, in particular the captions provided with the training data.

main/preprocess.py : computes tags for the training images and computes the features using the VGG16 network
main/pca_cca.py : computes a PCA on the training data and then performs a CCA to find the projection matrices
main/image_to_tags.py : retrieves the matching tags for images in the validation data
main/tag_to_image.py : retrieves the matching images, using the COCO categories as query tags
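
The pipeline described by the scripts above (PCA, then CCA, then retrieval in the shared space) can be sketched roughly as follows. This is a minimal NumPy illustration, not the code in `main/`: the synthetic arrays stand in for VGG16 image features and tag features, the helper names (`pca_fit`, `inv_sqrt`, `cca_fit`, `l2_normalize`), the dimensions, the regularization value, applying PCA to both views, and the use of cosine similarity for retrieval are all assumptions made for the example.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the mean and the top principal directions of X (rows = samples)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def cca_fit(X, Y, n_components, reg=1e-4):
    """Regularized CCA on centered views X and Y; returns projections and correlations."""
    n = X.shape[0]
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    Wx = Kx @ U[:, :n_components]
    Wy = Ky @ Vt[:n_components].T
    return Wx, Wy, s[:n_components]

def l2_normalize(A):
    """Scale each row to unit length so dot products become cosine similarities."""
    return A / np.linalg.norm(A, axis=1, keepdims=True)

# Synthetic stand-ins (assumption): both views are noisy linear maps of a shared latent.
rng = np.random.default_rng(0)
n, latent = 200, 5
Z = rng.standard_normal((n, latent))
img_feats = Z @ rng.standard_normal((latent, 20)) + 0.01 * rng.standard_normal((n, 20))
tag_feats = Z @ rng.standard_normal((latent, 12)) + 0.01 * rng.standard_normal((n, 12))

# Step 1: PCA on each view (hypothetical choice of 8 components).
img_mean, img_pcs = pca_fit(img_feats, 8)
tag_mean, tag_pcs = pca_fit(tag_feats, 8)
X = (img_feats - img_mean) @ img_pcs
Y = (tag_feats - tag_mean) @ tag_pcs

# Step 2: CCA to find the projection matrices into the shared space.
Wx, Wy, corrs = cca_fit(X, Y, n_components=4)

# Step 3: retrieval in the shared space by cosine similarity.
sims = l2_normalize(X @ Wx) @ l2_normalize(Y @ Wy).T
image_to_tag = sims.argmax(axis=1)   # best tag vector for each image
tag_to_image = sims.argmax(axis=0)   # best image for each tag vector
accuracy = float(np.mean(image_to_tag == np.arange(n)))
print("canonical correlations:", corrs)
print("image-to-tag retrieval accuracy on the toy pairs:", accuracy)
```

With real data, `img_feats` would be replaced by the network features and `tag_feats` by whatever tag representation the preprocessing step produces; held-out images would be projected with the stored PCA mean, PCA directions, and `Wx` before computing similarities.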

[1] Yunchao Gong, Qifa Ke, Michael Isard, Svetlana Lazebnik. A Multi-View Embedding Space for Modeling Internet Images, Tags, and their Semantics. International Journal of Computer Vision, Volume 106 Issue 2, January 2014, Pages 210-233.
