dict-definition

Preprocessing scripts to read definitions and other information from dictionaries

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

This repository accompanies the AAAI 2017 paper "Definition Modeling: Learning to define word embeddings in natural language".

Dependencies

Data

  • Wordnik provides an API for retrieving word definitions and other information from multiple dictionaries. You will need an API key to access it (see the Developer site).
  • GCIDE, the GNU Collaborative International Dictionary of English, contains entries mostly from Webster. This project uses a pre-processed version of the original release, which can be found here.
  • WordNet contains about 150,000 words and phrases. This project uses NLTK to read data from WordNet.
  • HillF_TACL2016 provides more than 800k definitions from the Wordnik API along with word embeddings. This data accompanies this paper.

For details of the data, see Data.
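
As an illustration of the WordNet item above, here is a minimal sketch of reading definitions through NLTK. It is not the repository's actual script: the function names `synsets_to_definitions` and `wordnet_definitions` are hypothetical, and the (word, part of speech, definition) row layout is an assumption made for the example. It also assumes NLTK is installed and the WordNet corpus has been fetched once with `nltk.download("wordnet")`.

```python
from typing import Iterable, List, Tuple


def synsets_to_definitions(word: str, synsets: Iterable) -> List[Tuple[str, str, str]]:
    """Turn WordNet synsets into (word, part_of_speech, definition) rows.

    Hypothetical helper: each synset only needs .pos() and .definition(),
    which NLTK's Synset objects provide.
    """
    return [(word, s.pos(), s.definition()) for s in synsets]


def wordnet_definitions(word: str) -> List[Tuple[str, str, str]]:
    """Look up every WordNet sense of `word` and return its definitions."""
    # Imported lazily so the helper above can be used without NLTK installed.
    # Assumes the corpus was downloaded once via nltk.download("wordnet").
    from nltk.corpus import wordnet as wn

    return synsets_to_definitions(word, wn.synsets(word))


if __name__ == "__main__":
    # Print one tab-separated row per sense of the word.
    for row in wordnet_definitions("dictionary"):
        print("\t".join(row))
```

Keeping the row-building step separate from the corpus lookup means the same helper could be reused for entries that come from another source with a similar interface.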