Implementation of Document-level Relation Extraction with Knowledge Distillation and Adaptive Focal Loss - Findings of ACL 2022
Requirements:
- Python (tested on 3.7.4)
- CUDA (tested on 10.2)
- PyTorch (tested on 1.10.2)
- Transformers (tested on 4.8.2)
- numpy (tested on 1.19.4)
- apex (tested on 0.1)
- opt-einsum (tested on 3.3.0)
- axial-attention (tested on 0.6.1)
- ujson
- tqdm
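A minimal environment setup might look like the following (package versions follow the tested list above; the exact PyTorch/CUDA pairing depends on your machine, and apex must be built from source per https://github.com/NVIDIA/apex):
>> pip install torch==1.10.2 transformers==4.8.2 numpy==1.19.4 opt-einsum==3.3.0 axial-attention==0.6.1 ujson tqdm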
The DocRED dataset can be downloaded following the instructions at https://github.com/thunlp/DocRED. The expected file structure is:
root
|-- dataset
| |-- docred
| | |-- train_annotated.json
| | |-- train_distant.json
| | |-- dev.json
| | |-- test.json
| | |-- wikidata-properties.csv
|-- meta
| |-- rel2id.json
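Once the files are in place, they can be read with ujson. Below is a minimal sketch of the data layout (key names follow the public DocRED format; this is an illustration, not project code):

```python
import ujson

# Relation-to-index map used by the models ("Na" is the no-relation class).
with open("meta/rel2id.json") as f:
    rel2id = ujson.load(f)

# Each DocRED file is a list of documents.
with open("dataset/docred/dev.json") as f:
    dev = ujson.load(f)

doc = dev[0]
print(doc["title"])                  # document title
print(len(doc["sents"]))             # sentences, each a list of tokens
print(len(doc["vertexSet"]))         # entities, each a list of mentions
for label in doc.get("labels", []):  # gold triples (absent in test.json)
    head = doc["vertexSet"][label["h"]][0]["name"]
    tail = doc["vertexSet"][label["t"]][0]["name"]
    print(head, label["r"], tail)    # label["r"] is a Wikidata property ID
```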
Train the model on DocRED with the following commands:
Step 1: Train the teacher model
>> bash scripts/batch_roberta.sh # for RoBERTa
Step 2: Run the teacher model to infer logits for the distantly supervised data
>> bash scripts/inference_logits_roberta.sh
Step 3: Pre-train the student model with knowledge distillation (see the sketch after Step 4)
>> bash scripts/knowledge_distill_roberta.sh
Step 4: Continue fine-tuning the student model on the human-annotated dataset
>> bash scripts/continue_roberta.sh
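For intuition, the distillation objective in Step 3 can be read as logit matching: the student's relation logits on the distant data are pushed toward the teacher logits saved in Step 2. The following is a minimal PyTorch sketch, not the project's exact loss; the MSE choice is an assumption based on the paper, and afl_loss is a hypothetical stand-in for the adaptive focal loss on the distant labels:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor) -> torch.Tensor:
    """Logit-matching KD: mean squared error between student and teacher logits.

    Both tensors have shape [num_entity_pairs, num_relation_classes].
    """
    # The teacher is frozen at this stage, so its logits carry no gradient.
    return F.mse_loss(student_logits, teacher_logits.detach())

# Hypothetical use inside a training step on the distantly supervised data:
#   loss = afl_loss(student_logits, distant_labels) \
#        + distillation_loss(student_logits, teacher_logits)
```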
The program will generate a test file (named by the --output_name argument) in the official evaluation format. You can compress it and submit it to CodaLab for the official test score.
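For example, assuming the output file is named result.json (check the leaderboard instructions for the exact file name it expects):
>> zip submission.zip result.json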
Our pre-trained models for each stage can be found at https://drive.google.com/drive/folders/1Qia0lDXykU4WPoR16eUtEVeUFiTgEAjQ?usp=sharing. You can download the models and use the weights for inference or further training.
Evaluate the trained models with:
>> bash scripts/eval_roberta.sh
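The reported metric is micro F1 over predicted (title, h_idx, t_idx, r) tuples. As a rough sanity check on dev predictions, something like the following could be used (field names follow the official evaluation format; this is a simplification, not the official scorer, and it ignores the Ign F1 variant):

```python
import ujson

def micro_f1(pred_path: str, gold_path: str) -> float:
    """Micro F1 over (title, h_idx, t_idx, r) tuples, as in the DocRED metric."""
    with open(pred_path) as f:
        preds = {(p["title"], p["h_idx"], p["t_idx"], p["r"]) for p in ujson.load(f)}
    gold = set()
    with open(gold_path) as f:
        for doc in ujson.load(f):
            for lab in doc.get("labels", []):
                gold.add((doc["title"], lab["h"], lab["t"], lab["r"]))
    tp = len(preds & gold)
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(micro_f1("result.json", "dataset/docred/dev.json"))
```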
Part of the code is adapted from ATLOP: https://github.com/wzhouad/ATLOP.
If you find our work useful, please cite it as:
@inproceedings{tan-etal-2022-document,
    title = "Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation",
    author = "Tan, Qingyu and
      He, Ruidan and
      Bing, Lidong and
      Ng, Hwee Tou",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2022",
    year = "2022",
    url = "https://aclanthology.org/2022.findings-acl.132",
}