Details of interesting multimodal architectures for vision and language