This work is published at TMLR. [Paper] [Citation]
EMMA stands for Extended Multimodal Alignment: the idea of mapping the same concepts from different input data sources to a shared latent space.
This repository contains three parts.
- Code: the source code, along with the results.
- Data: the raw text data and embeddings for all modalities (RGB, depth, text, and speech).
- Paper: the LaTeX source for the paper.
Install the requirements:
pip install -r requirements.txt
Go to the Code directory first:
cd Code
Make a directory called results:
mkdir results
You have two options to run the code.
- Run the code locally
- Run it on a cluster using slurm
To run locally:
python -u ML.py --arg1_name $arg1_value --arg2_name $arg2_value ...
All arguments have default values, which you can see in the ML.py file, so you can omit any argument when running ML.py, or specify only the ones you want to change.
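As a sketch of how such defaults typically work, ML.py presumably parses its arguments with argparse. The example below is illustrative only: --data_dir is the flag documented in this README, but the --method and --seed names, defaults, and choices are assumptions; check ML.py for the real argument list.

```python
import argparse

# Hypothetical sketch of ML.py's argument handling; the real names
# and defaults live in ML.py itself.
parser = argparse.ArgumentParser(description="EMMA training (illustrative sketch)")
parser.add_argument("--data_dir", default="../../../../data/gold/",
                    help="path to the GoLD dataset")
parser.add_argument("--method", default="supcon-emma",
                    choices=["supcon-emma", "full-emma", "supcon", "contrastive-org"],
                    help="which method to run (flag name is an assumption)")
parser.add_argument("--seed", type=int, default=0,
                    help="random seed (flag name is an assumption)")

# Passing an empty list uses every default, mirroring a bare `python -u ML.py`.
args = parser.parse_args([])
print(args.method)   # -> supcon-emma
print(args.data_dir) # -> ../../../../data/gold/
```

Because every argument carries a default, a bare invocation runs end to end, and specifying flags overrides only those values.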
Make a directory inside the Code/results directory and name it jobs. You only need to do this step once, not every time you run a job on the cluster:
mkdir -p results/jobs
Use the following command if you want to run only ONE job on the cluster:
bash run_cluster.sh
If you want to run more than one job, use the following command. This is useful when you want to run your code with multiple random seeds or with different values of other hyperparameters:
bash run_cluster_loop.sh
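The contents of run_cluster_loop.sh are not shown here, but a submission loop of this kind typically issues one Slurm job per configuration. The sketch below is a hypothetical illustration of that pattern (the `--seed` flag name is an assumption); it only prints the commands rather than executing them.

```python
# Hypothetical sketch of what a job-submission loop like
# run_cluster_loop.sh might do: one `sbatch` call per random seed.
# The real script may differ; here we build and print the commands
# instead of executing them.
seeds = [0, 1, 2]
commands = [f"sbatch run_cluster.sh --seed {seed}" for seed in seeds]
for cmd in commands:
    print(cmd)  # a real script would submit this to Slurm
```

Sweeping other hyperparameters works the same way: nest additional loops and append the corresponding flags to each command.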
The simplest way is to pass your dataset directory to the --data_dir argument when running ML.py.
The current setup assumes that the dataset (GoLD) is located in the following directory with respect to the Code directory:
../../../../data/gold/
You can run four different methods by specifying their names in the arguments:
- supcon-emma: Our method, which we call EMMA; essentially a combination of SupCon and our geometric method.
- full-emma: Our proposed method, which uses geometric alignment to learn concepts.
- supcon: State-of-the-art baseline.
- contrastive-org: Another SOTA baseline; since it is not as strong as SupCon, we only report results compared against SupCon.
Please use the following BibTeX entry to cite this work:
@article{darvish2023multimodal,
title={Multimodal Language Learning for Object Retrieval in Low Data Regimes in the Face of Missing Modalities},
author={Kasra Darvish and Edward Raff and Francis Ferraro and Cynthia Matuszek},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2023},
url={https://openreview.net/forum?id=cXa6Xdm0v7},
note={}
}