
Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation (VGAMT)

Read the paper (arXiv)

New! Trained VGAMT models can be downloaded here: download link

One of the major challenges of machine translation (MT) is ambiguity, which can in some cases be resolved by accompanying context such as an image. However, recent work in multimodal MT (MMT) has shown that obtaining improvements from images is challenging, limited not only by the difficulty of building effective cross-modal representations but also by the lack of specific evaluation and training data. We present a new MMT approach based on a strong text-only MT model, which uses neural adapters and a novel guided self-attention mechanism and which is jointly trained on both visual masking and MMT. We also release CoMMuTE, a Contrastive Multilingual Multimodal Translation Evaluation dataset, composed of ambiguous sentences and their possible translations, accompanied by disambiguating images corresponding to each translation. Our approach obtains competitive results compared to strong text-only models on standard English-to-French benchmarks and outperforms these baselines and state-of-the-art MMT systems by a large margin on our contrastive test set.

If you use our codebase, please cite:

@inproceedings{futeral-etal-2023-tackling,
    title = "Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation",
    author = "Futeral, Matthieu  and
      Schmid, Cordelia  and
      Laptev, Ivan  and
      Sagot, Beno{\^\i}t  and
      Bawden, Rachel",
    editor = "Rogers, Anna  and
      Boyd-Graber, Jordan  and
      Okazaki, Naoaki",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-long.295",
    doi = "10.18653/v1/2023.acl-long.295",
    pages = "5394--5413"
}

Clone repository with submodules

git clone --recurse-submodules https://github.com/MatthieuFP/VGAMT.git

Data preparation

In this work, we exploit text-only data from OPUS, multilingual text-image data from Multi30k, and English text-image data from Conceptual Captions. To download the data and extract the features used in our work, please follow the instructions here.

Training

Create a conda environment from the requirements.txt file (see the example below). This work was conducted using the SLURM job scheduler; please adapt the scripts to your local configuration.
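
A minimal sketch of the environment setup (the environment name vgamt matches the activation commands used below; the Python version is an assumption, check requirements.txt):

conda create -n vgamt python=3.8   # Python version is an assumption
conda activate vgamt
pip install -r requirements.txt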

Install adapter-transformers

cd adapter-transformers
pip install .

For all experiments, please fill in the following variables (see the example after the list):

  • CACHE_HUGGINGFACE
  • DATA_PATH
  • DUMP_PATH
  • FEAT_PATH (if MMT experiment)
  • EXP_NAME
  • seed
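
For instance, these can be exported before launching the scripts (all paths are placeholders):

export CACHE_HUGGINGFACE=/path/to/huggingface_cache
export DATA_PATH=/path/to/data
export DUMP_PATH=/path/to/experiment_dumps
export FEAT_PATH=/path/to/visual_features   # MMT experiments only
export EXP_NAME=my_experiment
export seed=42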

Text-only Machine Translation model

Before training VGAMT, you need a strong text-only MT model. To train one, please run the following commands:

source activate vgamt

# Resolve the address of the first node in the allocation and expose it
# as MASTER_ADDR for distributed training
echo "NODELIST="${SLURM_NODELIST}
echo "JOB_NODELIST="${SLURM_JOB_NODELIST}
master_addr=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
export MASTER_ADDR=$master_addr
echo "MASTER_ADDR="$MASTER_ADDR

# Fine-tune the text-only MT model from mBART
srun ./scripts/training/train_MT_from_MBART.sh
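
These commands are intended to run inside a SLURM job; a minimal, hypothetical submission wrapper (every resource value below is an assumption, adapt it to your cluster):

#!/bin/bash
#SBATCH --job-name=vgamt_mt       # hypothetical job name
#SBATCH --nodes=1                 # assumption: single node
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:4              # assumption: 4 GPUs per node
#SBATCH --time=48:00:00           # assumption: 48h wall time

# ...followed by the activation, MASTER_ADDR setup and srun command shown above.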

VGAMT

To train VGAMT from a strong MT model, please also fill in the following additional variables (see the example after the list):

  • DATA_MIX_PATH (if using the VMLM objective)
  • FEAT_PATH_MIX (if using the VMLM objective)
  • MT_MODEL_PATH
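
For instance (all paths are placeholders):

export DATA_MIX_PATH=/path/to/vmlm_text_image_data     # VMLM objective only
export FEAT_PATH_MIX=/path/to/vmlm_visual_features     # VMLM objective only
export MT_MODEL_PATH=/path/to/text_only_mt_checkpoint
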
source activate vgamt

echo "NODELIST="${SLURM_NODELIST}
echo "JOB_NODELIST="${SLURM_JOB_NODELIST}
master_addr=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
export MASTER_ADDR=$master_addr
echo "MASTER_ADDR="$MASTER_ADDR

srun ./scripts/training/finetune_mix_MMT_VMLM_from_MT.sh

Evaluation

  • BLEU scores:

First fill in the MODEL_PATH variable, then run ./scripts/eval/eval_mmt_bleu.sh to generate translations and compute BLEU scores.
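
If you want to sanity-check the scores independently, sacrebleu provides a standard CLI; a minimal sketch, assuming detokenized hypothesis and reference files (paths are placeholders; this is not part of the repository's scripts):

sacrebleu /path/to/reference.fr -i /path/to/hypothesis.fr -m bleu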

  • METEOR scores:

Fill in the METEOR_FILE, REFERENCE_PATH, HYPOTHESIS_PATH and TGT_LANG variables. To install METEOR, please have a look here. Then run ./scripts/eval/eval_meteor.sh
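
For reference, METEOR 1.5 is distributed as a Java jar and is typically invoked as follows; a minimal sketch with placeholder paths (the jar location depends on your installation):

java -Xmx2G -jar meteor-1.5.jar /path/to/hypothesis.fr /path/to/reference.fr -l fr -norm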

  • COMET scores:

Fill in the REFERENCE_SRC_LG, REFERENCE_TGT_LG, HYPOTHESIS_TGT_LG and PATH_TO_COMET_STORAGE variables. To install COMET, please have a look here. In our work, we use the wmt20-comet-da model. Then run ./scripts/eval/eval_comet.sh
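
For reference, the unbabel-comet package also ships a CLI; a minimal sketch with placeholder paths (not part of this repository's scripts):

comet-score -s /path/to/source.en -t /path/to/hypothesis.fr -r /path/to/reference.fr --model wmt20-comet-da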

  • CoMMuTE ranking accuracy:

To compute CoMMuTE accuracy for your model, run ./scripts/eval/eval_mmt_commute.sh after filling in the variables described in the VGAMT section.
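
CoMMuTE is contrastive: for each ambiguous source sentence and disambiguating image, the model should score the correct translation better than the incorrect one, and accuracy is the fraction of examples where it does. As an illustration of the metric only (not this repository's implementation; file names are hypothetical, and scores are assumed to be per-example losses, lower being better):

# hits / NR = fraction of examples where the correct translation wins
paste scores_correct.txt scores_incorrect.txt \
  | awk '$1 < $2 { hits++ } END { printf "CoMMuTE accuracy = %.3f\n", hits / NR }'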