/KALComp

A comparable corpus of web-crawled Kalaallisut and Danish sentences, along with some noisy aligned texts and code for machine translation (MT) fine-tuning experiments between Kalaallisut and English. Current work focuses on improving the quality of the pseudo-parallel data. Final project for LING28/Computational Linguistics, Dartmouth College, Winter 2022.

Primary language: Jupyter Notebook
