This repository contains code and resources for multimodal research, which explores how to fuse and analyze multiple modalities, such as text, images, and audio, to solve tasks that a single modality cannot handle well on its own.
avi326/MultiModal_Research
Research on multimodal learning, focusing on image captioning, Visual Question Answering (VQA), and chatbots.
Jupyter Notebook