Hierarchical-Word-Sense-Disambiguation-using-WordNet-Senses

Word Sense Disambiguation using word-specific models, all-words models, and hierarchical models in TensorFlow


Word Sense Disambiguation

Word sense disambiguation (WSD) is the task of identifying the intended meaning of a word in context. We address this problem with a series of end-to-end neural architectures based on bidirectional Long Short-Term Memory (LSTM) networks, and propose two variants: a word-specific neural model and an all-words neural model. The word-specific approach requires training a separate model for every disambiguation target word; the all-words model avoids this by casting WSD as sequence learning. We also use POS tags to improve performance and experiment with different attention mechanisms for the all-words model. Performance is further boosted by convolutional neural networks (CNNs), which capture local features around each word, much as humans rely on nearby context when inferring senses. Finally, we improve performance with hierarchical models that use POS tags as the hierarchy, in two variants: soft masking and hard masking.
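The POS-based masking idea can be sketched as follows. This is a minimal NumPy illustration, not the repository's actual implementation: the sense inventory for "run" and its POS labels are invented for the example, and the classifier logits are hard-coded where the real models would produce them from a BiLSTM.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical sense inventory for the word "run"; each sense carries a POS tag.
senses = ["run%verb_move", "run%verb_operate", "run%noun_jog", "run%noun_streak"]
sense_pos = np.array(["VERB", "VERB", "NOUN", "NOUN"])

logits = np.array([2.0, 1.0, 1.5, 0.5])  # raw sense scores (stand-in for model output)
predicted_pos = "NOUN"                   # from the POS-tagging level of the hierarchy

# Hard masking: zero out senses whose POS disagrees with the predicted tag,
# then renormalize, so only NOUN senses remain possible.
mask = (sense_pos == predicted_pos).astype(float)
hard = softmax(logits) * mask
hard /= hard.sum()

# Soft masking: down-weight (rather than eliminate) disagreeing senses
# by subtracting a penalty from their logits before the softmax.
penalty = 2.0
soft = softmax(logits - penalty * (1.0 - mask))

print(senses[np.argmax(hard)], senses[np.argmax(soft)])
```

Hard masking makes the POS decision binding, while soft masking lets strong sense evidence override an uncertain tagger; the penalty strength trades off between the two behaviors.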

Methods

Best Models

Details

For detailed information about models and results:

All-words Models

Word-specific Models

Files named Model-1-multigpu-1.ipynb through Model-1-multigpu-4.ipynb are the basic models.