Organized based on Paul Liang's repo, Reading List for Topics in Multimodal Machine Learning. Any suggestions are welcome!
- Survey Papers
- Core Areas
- Representation Learning
- Multimodal Fusion
- Multimodal Alignment
- Multimodal Translation
- Missing or Imperfect Modalities
- Knowledge Graphs and Knowledge Bases
- Interpretable Learning
- Generative Learning
- Semi-supervised Learning
- Self-supervised Learning
- Language Models
- Adversarial Attacks
- Few-Shot Learning
- Applications
- Language and Visual QA
- Language Grounding in Vision
- Language Grounding in Navigation
- Multimodal Machine Translation
- Multi-agent Communication
- Commonsense Reasoning
- Multimodal Reinforcement Learning
- Multimodal Dialog
- Language and Audio
- Audio and Visual
- Media Description
- Video Generation from Text
- Affect Recognition and Multimodal Language
- Healthcare
- Robotics
- Workshops
- Tutorials
- Courses
- CMU --- MultiComp Lab
- MIT --- Synthetic Intelligence Laboratory
- NTU --- SenticNet Team
- SenticNet GitHub
- MultiMT
- CMU MultimodalSDK --- Affect Recognition and Multimodal Language
- AMHUSE --- Affect Recognition and Multimodal Language
- Multi30k Dataset --- Multimodal Machine Translation
- VATEX --- Multimodal Machine Translation
- MELD --- Multimodal Dialog
- CLEVR-Dialog --- Multimodal Dialog
- Charades-Ego --- Media Description
- MPII --- Media Description
- RecipeQA --- Language and Visual QA
- GQA --- Language and Visual QA
- CLEVR --- Language and Visual QA