This is the course project by Minghai, Sen and Paul for Multimodal Machine Learning (11-777).