Type-Filtered Entity Linker

An entity linker that uses a BERT-based type filter to narrow the set of candidate entities and thereby speed up entity disambiguation.

Installation

  1. Create a virtual environment and install the dependencies

python -m venv ve
source ./ve/bin/activate
pip install -r requirements.txt

  2. Download the spaCy small and transformer models

python -m spacy download en_core_web_sm
python -m spacy download en_core_web_trf

  3. Download the Wikidata-Disamb dataset from GitHub and copy it to ./data/wikidata_disamb

  4. Download the Wikidata-TypeRec dataset from GitHub and copy it to ./data/wikidata_typerec

  5. Download GloVe and prepend a header line (number of vectors and vector dimension)

mkdir -p data/glove
cd data/glove
wget http://nlp.stanford.edu/data/glove.840B.300d.zip
unzip glove.840B.300d.zip
echo "2196017 300" | cat - glove.840B.300d.txt > glove_2.2M.txt
cd ../..

  6. Download the PyTorch-BigGraph embeddings

mkdir -p data/pbg
cd data/pbg
wget https://dl.fbaipublicfiles.com/torchbiggraph/wikidata_translation_v1_names.json.gz
gunzip wikidata_translation_v1_names.json.gz
wget https://dl.fbaipublicfiles.com/torchbiggraph/wikidata_translation_v1_vectors.npy.gz
gunzip wikidata_translation_v1_vectors.npy.gz
cd ../..

  7. Train the ED model "BERT+PBG" and copy it to ./data/models/bert_pbg

See https://github.com/samprintz/ed-with-kg-structure#entity-disambiguation-with-knowledge-graph-structure

  8. Train the type classifier TypeRec-BERT and copy it to ./data/models/typerec

See https://github.com/samprintz/type-filtered-entity-linker#type-classifier-typerec-bert
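
After the installation steps it can help to confirm that every file and directory they mention is in place before starting a run. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` and the list `EXPECTED_PATHS` are hypothetical, and the paths are simply the ones named in the steps above, relative to the repository root.

```python
from pathlib import Path

# Paths named in the installation steps, relative to the repository root.
# This list is an assumption derived from the README, not from the code.
EXPECTED_PATHS = [
    "data/wikidata_disamb",
    "data/wikidata_typerec",
    "data/glove/glove_2.2M.txt",
    "data/pbg/wikidata_translation_v1_names.json",
    "data/pbg/wikidata_translation_v1_vectors.npy",
    "data/models/bert_pbg",
    "data/models/typerec",
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:")
        for p in missing:
            print("  " + p)
    else:
        print("All expected data paths are present.")
```

Run from the repository root, the script lists whichever paths are still missing. It only checks that the paths exist, not that the downloads or trained models are complete.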

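The `echo "2196017 300" | cat - ...` command in the GloVe step prepends a header line with the number of vectors and their dimension. That is the header of the word2vec text format, which loaders such as gensim's `KeyedVectors.load_word2vec_format` expect by default. Whether this project loads the file with gensim is an assumption. The sketch below does the same thing in Python but derives both numbers from the file instead of hard-coding them; the function name `prepend_word2vec_header` is hypothetical.

```python
from pathlib import Path


def prepend_word2vec_header(src, dst):
    """Write dst as src preceded by a '<rows> <dimension>' header line."""
    src, dst = Path(src), Path(dst)
    rows, dim = 0, None
    # First pass: count the vectors and read the dimension from the first line.
    # newline="\n" makes only "\n" terminate a line, matching `wc -l`.
    with src.open(encoding="utf-8", newline="\n") as f:
        for line in f:
            if dim is None:
                # One token followed by `dim` space-separated values.
                dim = len(line.rstrip("\n").split(" ")) - 1
            rows += 1
    # Second pass: write the header, then stream the original file unchanged.
    with src.open(encoding="utf-8", newline="\n") as f, \
            dst.open("w", encoding="utf-8", newline="\n") as out:
        out.write(f"{rows} {dim or 0}\n")
        for line in f:
            out.write(line)
    return rows, dim
```

Called as `prepend_word2vec_header("glove.840B.300d.txt", "glove_2.2M.txt")` inside `data/glove`, it should produce the same file as the shell command, provided the download matches the counts given in the step.
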
Run Type-Filtered Entity Linker

To execute a test run of the type-filtered entity linker

python -m run

To run the NIF service for the GERBIL evaluation of the type-filtered entity linker

python -m gerbil

To run the NIF service for the GERBIL evaluation on the D2KB task

python -m gerbil d2kb

Type Classifier TypeRec-BERT

To train the type classifier TypeRec-BERT

python -m typerec.train

To evaluate it (you may need to update model_name in typerec/test.py first)

python -m typerec.test