Tracking the progress in end-to-end speech translation

Creative Commons Zero v1.0 Universal (CC0-1.0)

End-to-End Speech Translation Progress

Tutorial

Data

| Corpus | Direction | Target | Duration | License |
| --- | --- | --- | --- | --- |
| CoVoST 2 | {Fr, De, Es, Ca, It, Ru, Zh, Pt, Fa, Et, Mn, Nl, Tr, Ar, Sv, Lv, Sl, Ta, Ja, Id, Cy} -> En and En -> {De, Ca, Zh, Fa, Et, Mn, Tr, Ar, Sv, Lv, Sl, Ta, Ja, Id, Cy} | Text | 2880h | CC0 |
| mTEDx | {Es, Fr, Pt, It, Ru, El} -> En, {Fr, Pt, It} -> Es, Es -> {Fr, It}, {Es, Fr} -> Pt | Text | 765h | CC BY-NC-ND 4.0 |
| CoVoST | {Fr, De, Nl, Ru, Es, It, Tr, Fa, Sv, Mn, Zh} -> En | Text | 700h | CC0 |
| MuST-C & MuST-Cinema | En -> {De, Es, Fr, It, Nl, Pt, Ro, Ru, Ar, Cs, Fa, Tr, Vi, Zh} | Text | 504h | CC BY-NC-ND 4.0 |
| How2 | En -> Pt | Text | 300h | YouTube & CC BY-SA 4.0 |
| Augmented LibriSpeech | En -> Fr | Text | 236h | CC BY 4.0 |
| Europarl-ST | {En, Fr, De, Es, It, Pt, Pl, Ro, Nl} -> {En, Fr, De, Es, It, Pt, Pl, Ro, Nl} | Text | 280h | CC BY-NC 4.0 |
| Kosp2e | Ko -> En | Text | 198h | Mixed CC |
| Fisher + Callhome | Es -> En | Text | 160h + 20h | LDC |
| MaSS | {En, Es, Eu, Fi, Fr, Hu, Ro, Ru} -> {En, Es, Eu, Fi, Fr, Hu, Ro, Ru} | Text & Speech | 172h | Bible.is |
| LibriVoxDeEn | De -> En | Text | 110h | CC BY-NC-SA 4.0 |
| BSTC | Zh -> En | Text | 68h | |

Toolkit

Paper

2021

2020

2019

2018

2017

2016

2013

Contact

Changhan Wang (wangchanghan@gmail.com)