DeepMistake

The solution of the DeepMistake team for the RuShiftEval-2021 competition


Lexical Semantic Change Detection (LSCD) for the Russian language by the DeepMistake team.


This repository contains code to reproduce the best results from the paper:

Arefyev Nikolay, Maksim Fedoseev, Vitaly Protasov, Daniil Homskiy, Adis Davletov, Alexander Panchenko. "DeepMistake: Which Senses are Hard to Distinguish for a Word-in-Context Model" in Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference “Dialogue 2021”.

DeepMistake was the 2nd best system in the RuShiftEval-2021 competition.

After the competition, we improved the system and outperformed the winning system (see the table below).

Citation

If you use any part of this system, please cite the paper above.

Reproduction of the best results

Installation

Clone repositories:

git clone https://github.com/Daniil153/DeepMistake
cd DeepMistake
git clone https://github.com/davletov-aa/mcl-wic

Install requirements

pip install -r mcl-wic/requirements.txt
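
If you prefer an isolated environment, the dependencies can be installed into a fresh virtual environment first. A minimal sketch (the venv directory name is arbitrary):

python3 -m venv venv        # create an isolated Python environment
source venv/bin/activate    # activate it for the current shell
pip install -r mcl-wic/requirements.txt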

Solution for the RuShiftEval-2021 shared task on LSCD

Download the data. You can also download it from the command line:

bash download_files.sh

Download models:

bash download_models.sh first_concat mean_dist_l1ndotn_MSE mean_dist_l1ndotn_CE
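
If download_models.sh is unavailable or fails, the same three checkpoints can be fetched directly from the Zenodo links listed in the results table below. A sketch assuming wget and unzip are installed (check the evaluation scripts for the directory layout they expect):

# download the three checkpoints referenced in the results table
wget https://zenodo.org/record/4981585/files/first_concat.zip
wget https://zenodo.org/record/4992633/files/mean_dist_l1ndotn_MSE.zip
wget https://zenodo.org/record/4992613/files/mean_dist_l1ndotn_CE.zip
# unpack them in the working directory
unzip first_concat.zip && unzip mean_dist_l1ndotn_MSE.zip && unzip mean_dist_l1ndotn_CE.zip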

To reproduce the best result from the evaluation phase, run:

bash eval_best_eval_model.sh

To reproduce the best result from the post-evaluation phase, run:

bash eval_best_post-eval_model.sh

To reproduce the second-best result from the post-evaluation phase, run:

bash eval_2best_post-eval_model.sh
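
To run all three evaluations in one pass, the scripts can simply be chained. A sketch (assumes the data and all three models have already been downloaded as described above):

# run the evaluation-phase and both post-evaluation configurations in sequence
for script in eval_best_eval_model.sh eval_best_post-eval_model.sh eval_2best_post-eval_model.sh; do
    bash "$script"
done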

Results

Results on the RuShiftEval LSCD task are shown in the table below. To reproduce them, follow the installation and evaluation instructions above.

| Model | RuShiftEval avg | RuShiftEval1 | RuShiftEval2 | RuShiftEval3 | Script |
|---|---|---|---|---|---|
| first+concat on MCLen-accCE → RSSdev2-sentSpearMSE, LinReg (https://zenodo.org/record/4981585/files/first_concat.zip) | 0.795 | 0.812 | 0.780 | 0.795 | eval_best_eval_model.sh |
| mean+dist_l1ndotn-hs0 on MCLnen-accCE → RSSdev2-sentSpearMSE, Mean (https://zenodo.org/record/4992633/files/mean_dist_l1ndotn_MSE.zip) | 0.833 | 0.839 | 0.834 | 0.826 | eval_2best_post-eval_model.sh |
| mean+dist_l1ndotn-hs0 on MCLnen-accCE → RSSdev2-sentSpearCE, Mean (https://zenodo.org/record/4992613/files/mean_dist_l1ndotn_CE.zip) | 0.850 | 0.863 | 0.854 | 0.834 | eval_best_post-eval_model.sh |

Solution for SemEval-2020 Task 1

Work in progress.

Train models

You can also train the three best models with the following scripts:

train_best_eval_model.sh
train_best2_post-eval_model.sh
train_best_post-eval_model.sh