SemEval 2024 Task 1: Semantic Textual Relatedness

This repository contains the data and resources for the SemEval 2024 Task 1: Semantic Textual Relatedness (STR). For more information, please visit the shared task and competition websites.

Dataset | Languages | Shared Task Starter Kit | Citing This Work

Dataset

The STR dataset is available in the data folder. A Hugging Face download is coming soon.

For Track A: TrackA folder
For Track B: TrackB folder
For Track C: TrackC folder

Languages

The STR task covers the following 14 languages. Entries that show a language code followed by "released" already have data available:

Afrikaans (afr released)
Algerian Arabic (arq released)
Amharic (amh released)
English (eng released)
Hausa (hau released)
Hindi
Indonesian
Kinyarwanda
Marathi (mar released)
Modern Standard Arabic (arb released)
Moroccan Arabic (ary released)
Punjabi
Spanish (esp released)
Telugu (tel released)

Shared Task Starter Kit

A starter kit is available to help you create a baseline result. You can open the starter kit in a Colab Notebook and run the baseline system. You can then submit the resulting output to CodaLab to check that your submission format is correct.

To run the Colab Notebook, click the badge "Open in Colab".

Simple Co-occurrence Baseline for Semantic Relatedness:
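
To give a rough idea of what a co-occurrence baseline can look like, here is a minimal sketch that scores a sentence pair by token overlap using the Dice coefficient. It is an illustration under stated assumptions, not the code from the starter kit: the function names `tokenize` and `overlap_score`, the whitespace tokenization, and the example sentences are all hypothetical.

```python
def tokenize(text):
    """Lowercase the text and split it on whitespace.

    Assumption: whitespace tokenization is good enough for a rough baseline.
    Punctuation stays attached to words, so "cat." and "cat" differ.
    """
    return text.lower().split()


def overlap_score(sentence_a, sentence_b):
    """Return the Dice coefficient over the sets of unique tokens.

    The score lies in [0, 1]: 0 means no shared tokens, 1 means the two
    sentences contain exactly the same set of tokens.
    """
    tokens_a = set(tokenize(sentence_a))
    tokens_b = set(tokenize(sentence_b))
    total = len(tokens_a) + len(tokens_b)
    if total == 0:
        # Both sentences are empty; treat them as unrelated by convention.
        return 0.0
    shared = tokens_a & tokens_b
    return 2 * len(shared) / total


if __name__ == "__main__":
    # Hypothetical sentence pairs, not taken from the STR dataset.
    pairs = [
        ("the cat sat on the mat", "the cat lay on the rug"),
        ("stock prices fell sharply", "the weather is sunny today"),
    ]
    for first, second in pairs:
        print(f"{overlap_score(first, second):.3f}  {first!r} | {second!r}")
```

A score like this captures only surface word overlap, so it serves as a lower bound against which to compare stronger systems.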

Citing This Work

If you use our dataset or participate in the STR task, please cite the following papers:

STR dataset paper: coming soon
STR SemEval task description paper: coming soon
