This repository contains the dataset and code for our EMNLP 2023 paper "On Evaluation of Bangla Word Analogies".
We provide a Mikolov-style word-analogy evaluation set built specifically for Bangla, containing 16,678 samples, together with a translated and curated version of the original Mikolov dataset, containing 10,594 samples, for cross-lingual research.
The dataset can be found here.
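As a rough illustration of how a Mikolov-style analogy file can be read, the sketch below assumes the conventional layout: lines starting with `:` name a category, and every other line holds four whitespace-separated words `a b c d`, read as "a is to b as c is to d". That layout is an assumption based on the original Mikolov format and is not confirmed for the files in this repository. The function name `parse_analogy_lines` and the sample rows are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def parse_analogy_lines(lines):
    """Group 4-word analogy questions under the ': category' header before them.

    Assumes the conventional Mikolov layout; check the actual dataset
    files before relying on this.
    """
    sections = defaultdict(list)
    current = "unlabeled"  # used if questions appear before any header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(":"):
            current = line[1:].strip()
            continue
        words = line.split()
        if len(words) == 4:  # silently skip malformed rows
            sections[current].append(tuple(words))
    return dict(sections)


# Illustrative rows only (not from the released dataset).
sample = [
    ": capital-country",
    "ঢাকা বাংলাদেশ দিল্লি ভারত",
    "টোকিও জাপান প্যারিস ফ্রান্স",
    ": family",
    "বাবা মা ভাই বোন",
]
parsed = parse_analogy_lines(sample)
for name, questions in parsed.items():
    print(name, len(questions))
```

To read a real file, the same function could be called as `parse_analogy_lines(open(path, encoding="utf-8"))`, where `path` is wherever the downloaded dataset was saved.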
Go to the Code folder, then run:
For the accuracy check: .\accuracy.sh
For the top-k words: .\run.sh
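The scripts themselves are not reproduced here. As a minimal sketch of what an analogy accuracy check and a top-k lookup typically compute, the example below assumes the common 3CosAdd rule: for a question `a b c d`, rank all other words by cosine similarity to `b - a + c` and count a hit if `d` is among the top k. The scripts in this repository may use a different scoring rule or preprocessing. The names `top_k` and `accuracy` and the toy 4-dimensional vectors are hypothetical, chosen only for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def top_k(emb, a, b, c, k=5):
    """Return the k words most cosine-similar to b - a + c, excluding a, b, c."""
    va, vb, vc = (normalize(emb[w]) for w in (a, b, c))
    target = normalize([y - x + z for x, y, z in zip(va, vb, vc)])
    scores = []
    for word, vec in emb.items():
        if word in (a, b, c):
            continue  # standard practice: never return a question word
        cos = sum(p * q for p, q in zip(target, normalize(vec)))
        scores.append((cos, word))
    scores.sort(reverse=True)
    return [word for _, word in scores[:k]]


def accuracy(emb, questions, k=1):
    """Fraction of in-vocabulary questions whose answer d is in the top k."""
    hits = total = 0
    for a, b, c, d in questions:
        if not all(w in emb for w in (a, b, c, d)):
            continue  # skip questions with out-of-vocabulary words
        total += 1
        hits += d in top_k(emb, a, b, c, k)
    return hits / total if total else 0.0


# Toy embeddings (hypothetical values, not from any trained model).
emb = {
    "পুরুষ": [1.0, 0.0, 0.0, 0.0],  # man
    "নারী": [1.0, 0.0, 1.0, 0.0],   # woman
    "রাজা": [1.0, 1.0, 0.0, 0.0],   # king
    "রানি": [1.0, 1.0, 1.0, 0.0],   # queen
    "আপেল": [0.0, 0.0, 0.0, 1.0],   # apple
}
questions = [
    ("পুরুষ", "নারী", "রাজা", "রানি"),
    ("রাজা", "রানি", "পুরুষ", "নারী"),
]
print(top_k(emb, "পুরুষ", "নারী", "রাজা", k=2))
print(accuracy(emb, questions, k=1))
```

With real embeddings, `emb` would be loaded from the downloaded model files, and a vectorized library would be preferable to the pure-Python loops used here for clarity.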
Download the Model folder from here.
@inproceedings{akter-etal-2023-evaluation,
title = "On Evaluation of {B}angla Word Analogies",
author = "Akter, Mousumi and
Sarkar, Souvika and
Karmaker Santu, Shubhra Kanti",
editor = "Bouamor, Houda and
Pino, Juan and
Bali, Kalika",
booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.emnlp-main.811",
doi = "10.18653/v1/2023.emnlp-main.811",
pages = "13121--13127"
}