M-Adapter

This repository contains the code for the Interspeech 2022 paper "M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation".

Environment and data preparation

This codebase builds on UPC's repository. Please follow their instructions to set up the environment, preprocess the data, and download the pretrained modules.

Model Training

To train a speech translation model with a 3-layer M-Adapter, run the following two commands in order:

# Step1
bash train_adapter_2steps.sh step1

# Step2
bash train_adapter_2steps.sh step2
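
The two steps can also be chained so that the second only starts if the first succeeds. The following is a minimal sketch, not part of this repository: the function name `run_two_steps` is hypothetical, and it assumes that `train_adapter_2steps.sh` sits in the current directory and returns a non-zero exit code when a step fails.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this codebase): runs step1 and step2
# in order and stops at the first step that exits with a non-zero status.
run_two_steps() {
    # Assumption: the training script is in the current directory by default.
    local script="${1:-train_adapter_2steps.sh}"
    local step
    for step in step1 step2; do
        echo "Running ${step}..."
        if ! bash "${script}" "${step}"; then
            echo "${step} failed; stopping." >&2
            return 1
        fi
    done
}
```

After sourcing this snippet, calling `run_two_steps` with no arguments runs both steps using the default script name; a different path can be passed as the first argument.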

Inference

Run the following command for inference:

bash adapt_generate.sh

Citation

If you use our code, please cite our paper:

@article{zhao2022m,
  title={M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation},
  author={Zhao, Jinming and Yang, Hao and Shareghi, Ehsan and Haffari, Gholamreza},
  journal={arXiv preprint arXiv:2207.00952},
  year={2022}
}