This is the repository for the MultiModal Europarl Corpus.
NB. This is a git submodule of mmep-corpus-process.
audio/
: empty/git placeholder -- place to symlink audio files when working with them
metadata/
: various kinds of metadata
transcribed-audio/
: subdirs of ELAN projects
- .pfsx files are in the .gitignore!
written-records/
: written versions and translations of Europarl
- TODO: how to organize this in the best way?

Reasons for keeping the data in its own repository:
- keep data conceptually and practically distinct from the code that curates it and from issues relating to analyses
- allows reuse of the data without necessarily relying on the ecosystem developed in the mama-corpus -- mmep-corpus-process
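
Since audio/ is only a placeholder, the audio files have to be linked in before working with them. A minimal sketch of one way to do that follows; the function name `link_audio_files`, the source directory, and the `*.wav` pattern are all assumptions for illustration and are not part of this repository.

```python
from pathlib import Path


def link_audio_files(source_dir, audio_dir="audio", pattern="*.wav"):
    """Symlink files matching `pattern` from source_dir into audio_dir.

    Hypothetical helper: the "*.wav" default is an assumption, adjust it to
    whatever format the recordings actually use.
    Returns the list of newly created links; existing names are left alone.
    """
    source = Path(source_dir).resolve()
    target = Path(audio_dir)
    target.mkdir(parents=True, exist_ok=True)

    created = []
    for audio_file in sorted(source.glob(pattern)):
        link = target / audio_file.name
        # Skip names that already exist, including dangling symlinks.
        if link.exists() or link.is_symlink():
            continue
        link.symlink_to(audio_file)
        created.append(link)
    return created
```

Run from the repository root, a call such as `link_audio_files("/path/to/recordings")` would fill audio/ with links while leaving the original files where they are. Calling it again only adds links for files that are not linked yet.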