This repo contains the evaluation scripts needed to replicate the scoring for the IWSLT 2024 Quechua-to-Spanish speech translation task.

IWSLT 2024 Dialectal and Low-Resource Speech Translation Task

The Python script in this repo evaluates speech translation systems with the BLEU and chrF metrics. It takes a reference text file and a folder of hypothesis text files as input, processes each hypothesis file, and writes the results to a tab-separated values (TSV) file.

IWSLT 2024 task homepage

Requirements

Python 3.x
pandas
bleu_scorer
chrF_scorer

Installation

Clone the repository
Navigate to the repository directory
Install the dependencies

git clone https://github.com/Llamacha/iwslt24_que_esp
cd iwslt24_que_esp
pip install -r requirements.txt

Usage

Ensure that your reference file and hypothesis files are named correctly (the naming format for hypothesis files is described under Hypotheses Folder).
Open a terminal and navigate to the repository directory.
Run the script with the following command:

python main.py --ref /path/to/reference/file --phyp /path/to/hypotheses/folder

Wait for the script to finish processing all the hypothesis files.
Find the results in a TSV file named results.tsv in the hypotheses folder.
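
The flow described in the steps above (read the reference, score every hypothesis file, write results.tsv into the same folder) can be sketched roughly as follows. This is not the code in main.py: the names evaluate_folder, read_lines, and exact_match_rate are hypothetical, the metric is a stand-in rather than BLEU or chrF, and the sketch assumes that team identifiers contain no dots.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, List

# A scorer takes (hypotheses, references) and returns one number.
Scorer = Callable[[List[str], List[str]], float]


def read_lines(path) -> List[str]:
    """Read a UTF-8 text file into a list of lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def exact_match_rate(hyps: List[str], refs: List[str]) -> float:
    """Placeholder metric (NOT BLEU or chrF): percent of lines equal to the reference."""
    pairs = list(zip(hyps, refs))
    return 100.0 * sum(h == r for h, r in pairs) / len(pairs) if pairs else 0.0


def evaluate_folder(ref_path: str, hyp_dir: str, scorers: Dict[str, Scorer]) -> Path:
    """Score every hypothesis file in hyp_dir and write results.tsv next to them."""
    references = read_lines(ref_path)
    out_path = Path(hyp_dir) / "results.tsv"
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter="\t")
        writer.writerow(["Participant", "Condition", "Type", *scorers])
        for hyp_path in sorted(Path(hyp_dir).glob("*.st.*.que-spa.txt")):
            # Assumed layout: {team_id}.st.{condition}.{type}.que-spa.txt
            team_id, _, condition, kind = hyp_path.name.split(".")[:4]
            hypotheses = read_lines(hyp_path)
            scores = [f"{fn(hypotheses, references):.2f}" for fn in scorers.values()]
            writer.writerow([team_id, condition, kind, *scores])
    return out_path
```

Passing the scorers in as a dictionary keeps the sketch independent of any particular metric library; real BLEU and chrF functions with the same (hypotheses, references) signature could be substituted for the placeholder.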

Reference File

The reference file is a plain text file containing the ground truth translation for each source sentence, one sentence per line.
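
Because the reference holds one sentence per line, a quick sanity check before scoring is to confirm that each hypothesis file has the same number of lines as the reference. The sketch below is a hypothetical helper, not part of main.py; it assumes that hypothesis files end in .txt and that a line-count mismatch indicates a misaligned submission.

```python
from pathlib import Path
from typing import Dict


def count_lines(path) -> int:
    """Count the lines in a UTF-8 text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def find_length_mismatches(ref_path: str, hyp_dir: str) -> Dict[str, int]:
    """Map each hypothesis file whose line count differs from the reference to its count."""
    expected = count_lines(ref_path)
    mismatches = {}
    for hyp_path in sorted(Path(hyp_dir).glob("*.txt")):
        found = count_lines(hyp_path)
        if found != expected:
            mismatches[hyp_path.name] = found
    return mismatches
```

An empty dictionary means every hypothesis file lines up with the reference; any entry names a file to inspect before running the evaluation.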

Hypotheses Folder

The hypotheses folder should contain one or more plain text files, each containing the translations generated by a speech translation system. Each file should be named in the following format: {team_id}.st.{condition}.primary.que-spa.txt. Here, {team_id} is a unique identifier for the team that generated the translations, {condition} is either constrained or unconstrained, and primary is the translation type. The Results section lists primary, contrastive1, and contrastive2 as possible type values.
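
To check file names before running the evaluation, the naming format can be expressed as a regular expression. This is a minimal sketch: FILENAME_RE and parse_hypothesis_name are hypothetical names, the pattern assumes that team identifiers contain no dots, and it accepts the three type values listed in the Results section. main.py may parse names differently.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern built from the documented naming format.
FILENAME_RE = re.compile(
    r"^(?P<team_id>[^.]+)"
    r"\.st\.(?P<condition>constrained|unconstrained)"
    r"\.(?P<type>primary|contrastive1|contrastive2)"
    r"\.que-spa\.txt$"
)


def parse_hypothesis_name(filename: str) -> Optional[Dict[str, str]]:
    """Return the name fields as a dict, or None if the name does not match."""
    match = FILENAME_RE.match(filename)
    return match.groupdict() if match else None


print(parse_hypothesis_name("teamA.st.constrained.primary.que-spa.txt"))
# {'team_id': 'teamA', 'condition': 'constrained', 'type': 'primary'}
```

A None result flags a file that does not follow the format, for example one with a misspelled condition.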

Results

The output file results.tsv contains the following columns:

Participant: the unique identifier for each team that generated the translations.
Condition: the condition under which the translations were generated (constrained or unconstrained).
Type: the name of the translation type (primary, contrastive1, or contrastive2).
BLEU: the BLEU score for each set of translations.
chrF: the chrF score for each set of translations.
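
Once results.tsv exists, it can be loaded and ranked with the standard library alone. The sketch below assumes that the file starts with a header row using exactly the column names listed above and that the BLEU and chrF cells are plain numbers; load_results and rank_by_bleu are hypothetical helper names.

```python
import csv
from typing import Dict, List


def load_results(path: str) -> List[Dict[str, object]]:
    """Read results.tsv into a list of rows, converting the scores to floats."""
    with open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    for row in rows:
        row["BLEU"] = float(row["BLEU"])
        row["chrF"] = float(row["chrF"])
    return rows


def rank_by_bleu(rows: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Return the rows sorted from highest to lowest BLEU."""
    return sorted(rows, key=lambda row: row["BLEU"], reverse=True)
```

For example, rank_by_bleu(load_results("results.tsv"))[0] would give the top-scoring submission under these assumptions.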

License

This project is licensed under the MIT License - see the LICENSE file for details.