This repo contains the evaluation scripts needed to replicate the IWSLT 2024 speech translation tasks for Quechua to Spanish.

IWSLT 2024 Dialectal and Low-Resource Speech Translation Task

This is a Python script for evaluating speech translation systems with the BLEU and chrF metrics. The script takes as input a reference text file and a folder containing the hypothesis text files. It scores each hypothesis file against the reference and writes the results to a tab-separated values (TSV) file.

IWSLT 2024 task homepage

Requirements

  • Python 3.x
  • pandas
  • bleu_scorer
  • chrF_scorer

Installation

  1. Clone the repository
  2. Navigate to the repository directory
  3. Install the dependencies
git clone https://github.com/Llamacha/iwslt24_que_esp
cd iwslt24_que_esp
pip install -r requirements.txt

Usage

  1. Ensure that your reference file has one sentence per line and that your hypothesis files follow the naming format described under Hypotheses Folder.
  2. Open a terminal and navigate to the repository directory.
  3. Run the script with the following command:
python main.py --ref /path/to/reference/file --phyp /path/to/hypotheses/folder
  4. Wait for the script to finish processing all the hypothesis files.
  5. Find the results in a TSV file named results.tsv in the hypothesis folder.

Reference File

The reference file is a plain text file containing the ground truth translation of each source sentence, one sentence per line.
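
Because the file holds one sentence per line, a hypothesis file can only be scored against it if both have the same number of lines. The following is a minimal sketch of that check, not the code used by main.py; the function names read_sentences and check_alignment are hypothetical.

```python
from typing import List


def read_sentences(path: str) -> List[str]:
    """Read a one-sentence-per-line text file, dropping the trailing newline of each line."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def check_alignment(reference: List[str], hypothesis: List[str]) -> None:
    """Raise ValueError if the two files do not have the same number of sentences."""
    if len(reference) != len(hypothesis):
        raise ValueError(
            f"line count mismatch: reference has {len(reference)} lines, "
            f"hypothesis has {len(hypothesis)}"
        )
```

Running such a check before scoring makes a truncated or misaligned hypothesis file fail early instead of producing a misleading score.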

Hypotheses Folder

The hypotheses folder should contain one or more plain text files, each containing the translations generated by a speech translation system. Each file should be named in the following format: {team_id}.st.{condition}.primary.que-spa.txt. Here, {team_id} is a unique identifier for the team that generated the translations, {condition} is either constrained or unconstrained, and primary is the translation type. The Type column of the results also lists contrastive1 and contrastive2 as translation types.
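
The naming format can be checked with a regular expression before running the evaluation. This is a sketch under stated assumptions, not the parser used by main.py: it assumes {team_id} contains no dots and that contrastive1 and contrastive2 (listed for the Type column in the Results section) are accepted in place of primary. The names FILENAME_RE, Submission, and parse_hypothesis_name are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Pattern for {team_id}.st.{condition}.{type}.que-spa.txt.
# Assumptions: team_id contains no dots; the type slot accepts the
# values listed for the Type column of results.tsv.
FILENAME_RE = re.compile(
    r"^(?P<team_id>[^.]+)\.st\."
    r"(?P<condition>constrained|unconstrained)\."
    r"(?P<kind>primary|contrastive1|contrastive2)\."
    r"que-spa\.txt$"
)


class Submission(NamedTuple):
    team_id: str
    condition: str
    kind: str


def parse_hypothesis_name(filename: str) -> Optional[Submission]:
    """Return the parsed fields, or None if the name does not match the format."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    return Submission(match["team_id"], match["condition"], match["kind"])
```

For example, parse_hypothesis_name("teamA.st.constrained.primary.que-spa.txt") yields the fields teamA, constrained, and primary, while a file such as results.tsv yields None and can be skipped.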

Results

The output file results.tsv contains the following columns:

  • Participant: the unique identifier for each team that generated the translations.
  • Condition: the condition under which the translations were generated (constrained or unconstrained).
  • Type: the name of the translation type (primary, contrastive1, or contrastive2).
  • BLEU: the BLEU score for each set of translations.
  • chrF: the chrF score for each set of translations.
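
The results file can be loaded with the standard csv module for further analysis. The sketch below assumes that results.tsv starts with a header row whose column names match the list above exactly; load_results and rank_by_bleu are hypothetical helper names.

```python
import csv
from typing import Dict, List


def load_results(path: str) -> List[Dict[str, str]]:
    """Load a tab-separated results file into a list of row dictionaries."""
    # Assumption: the first row is a header naming the columns.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def rank_by_bleu(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the rows sorted from highest to lowest BLEU score."""
    return sorted(rows, key=lambda row: float(row["BLEU"]), reverse=True)
```

Calling rank_by_bleu(load_results("results.tsv")) would list the submissions from best to worst BLEU; the same pattern works for the chrF column.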

License

This project is licensed under the MIT License - see the LICENSE file for details.