Word-Analogy-Bangla

This repository contains the dataset and code for our EMNLP 2023 paper "On Evaluation of Bangla Word Analogies".

We provide a Mikolov-style word-analogy evaluation set built specifically for Bangla, containing 16,678 samples, along with a translated and curated version of the Mikolov dataset containing 10,594 samples for cross-lingual research.

The dataset can be found here

Paper Link: https://aclanthology.org/2023.emnlp-main.811

Data

Code

Go to the Code folder, then run the relevant script:

For accuracy check: .\accuracy.sh

For top-k words: .\run.sh

Download the Model folder from here
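
For readers who want to see what an accuracy check and a top-k lookup compute in a Mikolov-style evaluation, here is a minimal sketch. It is not the code behind accuracy.sh or run.sh. It assumes the common vector-offset method (the nearest neighbours of vec(b) - vec(a) + vec(c) by cosine similarity, with the three question words excluded). The function names, the toy vocabulary, and the hand-made 3-dimensional vectors are all hypothetical and stand in for a real embedding model.

```python
import numpy as np


def normalize(matrix):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.clip(norms, 1e-12, None)


def top_k_analogy(vocab, vectors, a, b, c, k=5):
    """Return the k words closest to vec(b) - vec(a) + vec(c), excluding a, b, c."""
    index = {word: i for i, word in enumerate(vocab)}
    unit = normalize(np.asarray(vectors, dtype=float))
    target = unit[index[b]] - unit[index[a]] + unit[index[c]]
    target = target / max(np.linalg.norm(target), 1e-12)
    scores = unit @ target
    for word in (a, b, c):
        scores[index[word]] = -np.inf  # never return a question word
    best = np.argsort(-scores)[:k]
    return [vocab[i] for i in best]


def analogy_accuracy(vocab, vectors, questions, k=1):
    """Fraction of questions (a, b, c, d) whose answer d is in the top k.

    Questions containing out-of-vocabulary words are skipped (an assumption;
    other evaluation protocols count them as errors).
    """
    known = set(vocab)
    hits = total = 0
    for a, b, c, d in questions:
        if not {a, b, c, d} <= known:
            continue
        total += 1
        if d in top_k_analogy(vocab, vectors, a, b, c, k):
            hits += 1
    return hits / total if total else 0.0


# Toy data (hypothetical): king, queen, man, woman, book.
VOCAB = ["রাজা", "রানী", "পুরুষ", "নারী", "বই"]
VECTORS = np.array([
    [1.0, 1.0, 0.0],    # রাজা (king)
    [1.0, -1.0, 0.0],   # রানী (queen)
    [0.0, 1.0, 1.0],    # পুরুষ (man)
    [0.0, -1.0, 1.0],   # নারী (woman)
    [0.0, 0.0, -1.0],   # বই (book), a distractor
])
QUESTIONS = [
    ("পুরুষ", "নারী", "রাজা", "রানী"),  # man : woman :: king : queen
    ("রাজা", "রানী", "পুরুষ", "নারী"),  # king : queen :: man : woman
    ("পুরুষ", "নারী", "রাজা", "অজানা"),  # contains an unknown word, skipped
]

if __name__ == "__main__":
    print(top_k_analogy(VOCAB, VECTORS, "পুরুষ", "নারী", "রাজা", k=2))
    print(analogy_accuracy(VOCAB, VECTORS, QUESTIONS, k=1))
```

With a real model, `VOCAB` and `VECTORS` would come from the downloaded embeddings and `QUESTIONS` from the analogy dataset. Raising `k` gives a more lenient top-k accuracy.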

Cite

@inproceedings{akter-etal-2023-evaluation,
    title = "On Evaluation of {B}angla Word Analogies",
    author = "Akter, Mousumi  and
      Sarkar, Souvika  and
      Karmaker Santu, Shubhra Kanti",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.811",
    doi = "10.18653/v1/2023.emnlp-main.811",
    pages = "13121--13127"
}