MT-Evaluation

Machine Translation (MT) Evaluation Scripts

Installation

All dependencies can be installed via:

pip3 install -r requirements.txt

To run the scripts and compute MT evaluation metrics for your machine translation output, you need two files:

Reference: The human translation (target) file of your test dataset.
System: The machine-translated output (prediction) generated by the MT model for the source side of the same test dataset used for “Reference”.
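Before computing any metric, it can help to confirm that the two files line up. The sketch below assumes both files are UTF-8 plain text with one segment per line and that line i of the system file translates the same source sentence as line i of the reference file. The README does not state this layout explicitly, and `load_parallel` is a hypothetical helper, not a function from this repository.

```python
from pathlib import Path


def load_parallel(reference_path, system_path):
    """Read two line-aligned files and return (reference, system) pairs.

    Assumption: one segment per line, same order in both files.
    """
    refs = Path(reference_path).read_text(encoding="utf-8").splitlines()
    hyps = Path(system_path).read_text(encoding="utf-8").splitlines()
    if len(refs) != len(hyps):
        # A mismatch usually means the files come from different test sets
        # or one of them lost or gained a line during preprocessing.
        raise ValueError(
            f"Line count mismatch: {len(refs)} reference vs {len(hyps)} system lines"
        )
    return list(zip(refs, hyps))
```

A line count mismatch is worth catching early, because sentence-level scores computed on misaligned files are meaningless.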

Corpus BLEU: Calculates the BLEU score for the whole corpus and prints the result.

python3 compute-bleu.py Reference.txt System.txt
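For readers who want to see what a corpus-level BLEU score involves, here is a minimal sketch in plain Python. It is not the code in compute-bleu.py. It assumes whitespace tokenization, a single reference per segment, n-grams up to order 4, and no smoothing, so scores will differ from tools that tokenize or smooth differently. The function names are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus BLEU (0-100) with one reference per segment and no smoothing."""
    matches = [0] * max_n   # clipped n-gram matches, pooled over the corpus
    totals = [0] * max_n    # hypothesis n-gram counts, pooled over the corpus
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tok, hyp_tok = ref.split(), hyp.split()  # assumption: whitespace tokens
        ref_len += len(ref_tok)
        hyp_len += len(hyp_tok)
        for n in range(1, max_n + 1):
            ref_counts = ngrams(ref_tok, n)
            hyp_counts = ngrams(hyp_tok, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the reference.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_precision)
```

The statistics are pooled over all segments before the precisions are computed, which is why corpus BLEU is generally not the same as the average of sentence-level BLEU scores.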

Sentence BLEU: Calculates the BLEU score sentence by sentence and saves the results to a file.

python3 compute-bleu-sentence.py Reference.txt System.txt

Sentence METEOR: Calculates the METEOR score for each sentence. Note that METEOR works at the sentence level only.

python3 sentence-meteor.py Reference.txt System.txt

Corpus WER: Calculates the WER score for the whole corpus and prints the result.

python3 corpus-wer.py Reference.txt System.txt
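As an illustration of the metric itself, the sketch below computes a corpus-level word error rate from a word-level Levenshtein distance. It assumes whitespace tokenization and defines corpus WER as total edits divided by total reference words. Whether corpus-wer.py pools errors this way or averages per-sentence scores is not stated here, so treat this as one common definition rather than a description of the script.

```python
def edit_distance(ref_tokens, hyp_tokens):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_tokens) + 1))
    for i, ref_word in enumerate(ref_tokens, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_tokens, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Total word edits divided by total reference words (assumed definition)."""
    errors = sum(
        edit_distance(ref.split(), hyp.split())
        for ref, hyp in zip(references, hypotheses)
    )
    ref_words = sum(len(ref.split()) for ref in references)
    if ref_words == 0:
        raise ValueError("References contain no words")
    return errors / ref_words
```

Because the denominator is the reference length, WER can exceed 1.0 when the system output contains many extra words.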

Sentence WER: Calculates the WER score sentence by sentence and saves the results to a file.

python3 sentence-wer.py Reference.txt System.txt

If you have questions or suggestions, please feel free to contact me.