This is the code of KnowBERT, rewritten using Huggingface. We only rewrote the code for pretraining the wiki linker, which uses knowbert_wiki_linker.jsonnet as its config.
It contains the data processing part and the KnowBERT model part; you can run each file (data_utils.py or modeling.py) for a better understanding. Note that we did not implement the running part.
1.1 Preparing data (download the files referenced in knowbert_wiki_linker.jsonnet).
aida_train.txt and aida_dev.txt are used for training and evaluating the pre-trained entity linker.
prob_yago_crosswikis_wikipedia_p_e_m.txt and wiki_id_to_string.json
2.1 AllenNLP documentation
2.2 An example
2.3 The framework of AllenNLP
3.1 How is a class instantiated, and how are parameters passed from the XXX.jsonnet file?
A class is instantiated according to its registration decorator, and its parameters are passed by a default function, _from_parameters; refer to the AllenNLP documentation for details.
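The decorator-plus-config idea can be sketched in a few lines of plain Python. This is a toy illustration only, not AllenNLP's actual code: the names Registrable, from_params, ToyLinker, and the "toy_linker" key are assumptions made up for the example. The "type" key in the config picks the registered class, and the remaining keys become constructor arguments.

```python
from typing import Any, Callable, Dict, Type


class Registrable:
    """Toy base class mapping a string name to a subclass (hypothetical)."""

    _registry: Dict[str, Type["Registrable"]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[type], type]:
        # The decorator records the class under `name` and returns it unchanged.
        def decorator(subclass: type) -> type:
            Registrable._registry[name] = subclass
            return subclass
        return decorator

    @classmethod
    def from_params(cls, params: Dict[str, Any]) -> "Registrable":
        params = dict(params)            # copy so the caller's config is untouched
        type_name = params.pop("type")   # "type" selects the registered class
        subclass = Registrable._registry[type_name]
        return subclass(**params)        # remaining keys become constructor args


@Registrable.register("toy_linker")
class ToyLinker(Registrable):
    def __init__(self, hidden_dim: int, dropout: float = 0.1) -> None:
        self.hidden_dim = hidden_dim
        self.dropout = dropout


# Stands in for a parsed XXX.jsonnet section (assumed shape).
config = {"type": "toy_linker", "hidden_dim": 300}
linker = Registrable.from_params(config)
print(type(linker).__name__, linker.hidden_dim, linker.dropout)
```

Keys missing from the config fall back to the constructor defaults, which is why dropout does not need to appear in the config above.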
3.2 How are the functions in a class executed?
After Trainer is instantiated, calling Trainer.train() runs the whole pipeline (e.g., loading the data and training the model).
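As a rough illustration of how a single train() call can drive everything else, here is a minimal sketch in plain Python. ToyModel and ToyTrainer are hypothetical stand-ins, not AllenNLP's Trainer; the model has one weight and is fitted to y = 2x with plain gradient steps.

```python
from typing import List, Sequence, Tuple


class ToyModel:
    """Hypothetical one-weight model predicting y = w * x."""

    def __init__(self) -> None:
        self.w = 0.0

    def loss_and_grad(self, x: float, y: float) -> Tuple[float, float]:
        error = self.w * x - y
        return error ** 2, 2 * error * x   # squared error and d(loss)/dw


class ToyTrainer:
    """Hypothetical trainer: train() iterates over the data and updates the model."""

    def __init__(self, model: ToyModel, data: Sequence[Tuple[float, float]],
                 num_epochs: int = 20, lr: float = 0.05) -> None:
        self.model = model
        self.data = data
        self.num_epochs = num_epochs
        self.lr = lr

    def train(self) -> List[float]:
        epoch_losses = []
        for _ in range(self.num_epochs):
            total = 0.0
            for x, y in self.data:                  # "loading data"
                loss, grad = self.model.loss_and_grad(x, y)
                self.model.w -= self.lr * grad      # "training model"
                total += loss
            epoch_losses.append(total)
        return epoch_losses


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
trainer = ToyTrainer(ToyModel(), data)
losses = trainer.train()
print(losses[0], losses[-1], trainer.model.w)
```

The caller only constructs the trainer and calls train(); the data iteration and the model updates all happen inside that one method.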
3.3 How is the training dataset read?
training file -> XXXReader -> text_to_read -> Instances -> _read -> Iterable object -> batch -> model.
See sketches.pdf.
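The reading chain can be sketched with a toy reader in plain Python. The method names _read and text_to_read mirror the chain above; ToyReader, batch, toy_model, and the sample sentences are assumptions made up for this example, and an in-memory list of lines stands in for the training file.

```python
from typing import Dict, Iterable, Iterator, List

Instance = Dict[str, List[str]]   # toy stand-in for an instance


class ToyReader:
    """Hypothetical reader: turns raw lines into instances lazily."""

    def text_to_read(self, line: str) -> Instance:
        # Build one instance from one line of text (whitespace tokenization).
        return {"tokens": line.split()}

    def _read(self, lines: Iterable[str]) -> Iterator[Instance]:
        # Yield instances one at a time, skipping blank lines.
        for line in lines:
            line = line.strip()
            if line:
                yield self.text_to_read(line)


def batch(instances: Iterable[Instance], batch_size: int) -> Iterator[List[Instance]]:
    # Group the instance stream into lists of at most `batch_size` items.
    current: List[Instance] = []
    for instance in instances:
        current.append(instance)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current


def toy_model(batch_instances: List[Instance]) -> List[int]:
    # Placeholder "model": returns the token count of each instance.
    return [len(inst["tokens"]) for inst in batch_instances]


lines = ["Paris is in France", "", "Berlin is the capital of Germany", "Rome"]
reader = ToyReader()
outputs = [toy_model(b) for b in batch(reader._read(lines), batch_size=2)]
print(outputs)  # [[4, 6], [1]]
```

Because _read is a generator, instances are produced only as the batching step asks for them, so the whole file never has to sit in memory at once.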