Lexical semantic change detection
This repository contains code to reproduce the best results from the paper:
Nikolay Arefyev, Maksim Fedoseev, Vitaly Protasov, Daniil Homskiy, Adis Davletov, Alexander Panchenko. "DeepMistake: Which Senses are Hard to Distinguish for a Word-in-Context Model". In Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference “Dialogue 2021”.
DeepMistake was the second-best system in the RuShiftEval-2021 competition.
After the competition, we improved the system and outperformed the competition winner (see the table below).
If you use any part of the system, please cite the paper above.
Clone repositories:
git clone https://github.com/Daniil153/DeepMistake
cd DeepMistake
git clone https://github.com/davletov-aa/mcl-wic
Install requirements:
pip install -r mcl-wic/requirements.txt
Download the data. You can also do this from the command line:
bash download_files.sh
Download models:
bash download_models.sh first_concat mean_dist_l1ndotn_MSE mean_dist_l1ndotn_CE
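The three model names passed to download_models.sh appear to match the Zenodo archives linked in the results table below. If you want to see which archive a name refers to, or fetch a single archive by hand, the following is a minimal sketch and not part of the repository: `model_url` is a hypothetical helper, and since this README does not state where the evaluation scripts expect the unpacked models, the sketch only resolves URLs.

```shell
# Sketch: map a model name to the Zenodo archive URL listed in the results table.
# model_url is a hypothetical helper, not a script shipped with the repository.
model_url() {
    case "$1" in
        first_concat)
            echo "https://zenodo.org/record/4981585/files/first_concat.zip" ;;
        mean_dist_l1ndotn_MSE)
            echo "https://zenodo.org/record/4992633/files/mean_dist_l1ndotn_MSE.zip" ;;
        mean_dist_l1ndotn_CE)
            echo "https://zenodo.org/record/4992613/files/mean_dist_l1ndotn_CE.zip" ;;
        *)
            # Unknown names are reported on stderr with a non-zero status.
            echo "unknown model: $1" >&2
            return 1 ;;
    esac
}
```

For example, `curl -L -O "$(model_url first_concat)"` would save first_concat.zip in the current directory. download_models.sh remains the documented route.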
To reproduce the best result from the evaluation phase, run:
bash eval_best_eval_model.sh
To reproduce the best result from the post-evaluation phase, run:
bash eval_best_post-eval_model.sh
To reproduce the second-best result from the post-evaluation phase, run:
bash eval_2best_post-eval_model.sh
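To run all three evaluations in one go, the commands above can be wrapped in a small loop. This is a sketch under the assumption that it is run from the DeepMistake directory. The function `run_eval_scripts` is hypothetical and not part of the repository. It only calls the scripts named above and reports any that are missing.

```shell
# Sketch: run the given evaluation scripts in order.
# run_eval_scripts is a hypothetical wrapper, not part of the repository.
run_eval_scripts() {
    local missing=0
    local script
    for script in "$@"; do
        if [ -f "$script" ]; then
            echo "== running $script ==" >&2
            bash "$script" || return 1   # stop at the first failing script
        else
            echo "skipping $script: not found in $(pwd)" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only if every requested script was found and ran.
    [ "$missing" -eq 0 ]
}
```

A possible invocation is `run_eval_scripts eval_best_eval_model.sh eval_best_post-eval_model.sh eval_2best_post-eval_model.sh`. Progress messages go to stderr, so the scripts' own output can still be redirected to a file.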
Results on the lexical semantic change detection (LSCD) task are shown in the table below. To reproduce them, first follow the instructions above to install the dependencies and download the data and models.
Model | RuShiftEval avg | RuShiftEval1 | RuShiftEval2 | RuShiftEval3 | Script |
---|---|---|---|---|---|
[first+concat on MCLen-accCE → RSSdev2-sentSpearMSE, LinReg](https://zenodo.org/record/4981585/files/first_concat.zip) | 0.795 | 0.812 | 0.78 | 0.795 | eval_best_eval_model.sh |
[mean+dist_l1ndotn-hs0 on MCLnen-accCE → RSSdev2-sentSpearMSE, Mean](https://zenodo.org/record/4992633/files/mean_dist_l1ndotn_MSE.zip) | 0.833 | 0.839 | 0.834 | 0.826 | eval_2best_post-eval_model.sh |
[mean+dist_l1ndotn-hs0 on MCLnen-accCE → RSSdev2-sentSpearCE, Mean](https://zenodo.org/record/4992613/files/mean_dist_l1ndotn_CE.zip) | 0.85 | 0.863 | 0.854 | 0.834 | eval_best_post-eval_model.sh |
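The "RuShiftEval avg" column appears to be the arithmetic mean of the three per-dataset scores. This is an inference from the numbers in the table, not something stated by the scripts. A minimal sketch for checking a row by hand, where `mean3` is a hypothetical helper that is not part of the repository:

```shell
# Sketch: arithmetic mean of three scores, printed with three decimals.
# mean3 is a hypothetical helper, not part of the repository.
mean3() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v a="$1" -v b="$2" -v c="$3" \
        'BEGIN { printf "%.3f\n", (a + b + c) / 3 }'
}

mean3 0.863 0.854 0.834   # scores from the last table row
```

For the last row this prints 0.850, in line with the reported 0.85. For the first row the same calculation gives 0.796 rather than 0.795, presumably because the per-dataset scores shown in the table are already rounded.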
In the process
You can also train the three best models with:
train_best_eval_model.sh
train_best2_post-eval_model.sh
train_best_post-eval_model.sh