Code for the paper "Adaptive Hinge Balance Loss for Document-Level Relation Extraction", Findings of EMNLP 2023.
- Python (tested on 3.7.4)
- CUDA (tested on 11.3)
- PyTorch (tested on 1.12.1)
- Transformers (tested on 4.20.1)
- numpy (tested on 1.21.6)
- spacy (tested on 3.3.3)
- opt-einsum (tested on 3.3.0)
- ujson
- tqdm
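If you are setting up the environment from scratch, the pinned versions above can be installed with pip. This is an illustrative command, not one shipped with the repository; pick the PyTorch wheel that matches your CUDA 11.3 installation:
>> pip install torch==1.12.1 transformers==4.20.1 numpy==1.21.6 spacy==3.3.3 opt-einsum==3.3.0 ujson tqdm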
The Re-DocRED dataset can be downloaded by following the instructions at link.
The expected structure of files is:
HingeABL
|-- dataset
| |-- docred
| | |-- train_revised.json
| | |-- dev_revised.json
| | |-- test_revised.json
| | |-- train_annotated.json
|-- meta
| |-- rel2id.json
|-- scripts
|-- checkpoint
|-- log
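All four JSON files follow the public DocRED document schema. As a sanity check after downloading, you can load one split with ujson; this is a minimal sketch whose field names come from the DocRED format, not from this repository's code:

import ujson

# Load the Re-DocRED training split.
with open("dataset/docred/train_revised.json") as f:
    data = ujson.load(f)

doc = data[0]
print(doc["title"])           # document title
print(len(doc["sents"]))      # tokenized sentences
print(len(doc["vertexSet"]))  # entity clusters (one list of mentions per entity)
print(doc["labels"][0])       # one relation fact: {"h": ..., "t": ..., "r": ..., "evidence": [...]}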
Note: train_annotated.json and rel2id.json can be obtained from the DocRED dataset, which can be downloaded by following the instructions at link.
Train and evaluate the BERT model with HingeABL using the following commands:
>> sh scripts/train_HingeABL.sh # training
>> sh scripts/test_HingeABL.sh # evaluation
You can select a different loss function by setting the --loss_type argument before training. Available loss types are: ATL, balance_softmax, AFL, SAT, MeanSAT, HingeABL, and AML.
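For example, to train with the AFL loss instead of HingeABL (this assumes the shell script forwards extra command-line arguments to the Python training entry point; if it does not, edit the --loss_type value inside scripts/train_HingeABL.sh):
>> sh scripts/train_HingeABL.sh --loss_type AFL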
Note: This code is based in part on the code of ATLOP.