This repository contains the source code for the paper "MORE: A METRIC LEARNING BASED FRAMEWORK FOR OPEN-DOMAIN RELATION EXTRACTION". Follow the steps below to use the code.
To run the code, you need:
torch>=1.3.0
tensorflow>=1.9.0
keras>=2.2.5
transformers==3.2.0
You can run the following commands to set up and activate a new Anaconda environment:
conda create -n more python=3.6
conda activate more
pip install -r ./requirements.txt
We suggest that you use the same environment as ours to avoid any problems.
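To confirm that an environment matches the versions listed above, a small check along these lines could be used. This is a minimal sketch and not part of the repository: the helper names `parse_version` and `satisfies` are hypothetical, and the comparison only understands plain numeric versions with the `>=` and `==` operators.

```python
import re

# Version constraints copied from the requirements list above.
REQUIREMENTS = {
    "torch": ">=1.3.0",
    "tensorflow": ">=1.9.0",
    "keras": ">=2.2.5",
    "transformers": "==3.2.0",
}


def parse_version(text):
    """Turn '1.3.0' (or '1.3.0+cu100') into a tuple of ints such as (1, 3, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies(installed, spec):
    """Check an installed version string against a '>=X' or '==X' constraint."""
    operator, required = spec[:2], spec[2:]
    have, need = parse_version(installed), parse_version(required)
    # Pad the shorter tuple with zeros so that '1.3' compares equal to '1.3.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    if operator == ">=":
        return have >= need
    if operator == "==":
        return have == need
    raise ValueError("unsupported operator in constraint: %r" % spec)
```

Inside the activated environment, a call such as `satisfies(torch.__version__, REQUIREMENTS["torch"])` would then report whether the installed torch version is acceptable.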
In this code, we use two real-world RE datasets:
- FewRel: We follow the setup of RSNs. The processed dataset is already in ./data/datasets/fewrel_ori/.
- NYT+FB-sup: We use the original NYT+FB and process it into NYT+FB-sup. The dataset is not open source, but you can get a sample if you need one.
To process nyt_ori.txt (assuming you already have it and have stored it in ./data/datasets/nyt_su/), run the following commands:
python ./data/datasets/nyt_su/process2json.py
python ./data/datasets/nyt_su/nyt_divide_supervision.py
The original .txt file is then converted to .json format and divided into train/dev/test sets at a 6:2:2 ratio.
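The 6:2:2 division can be illustrated with a short example. This is a sketch under stated assumptions, not the logic of nyt_divide_supervision.py: the function name `split_622`, the shuffling step, and the fixed seed are illustrative choices, and the actual script may divide the data differently (for example, by relation type).

```python
import random


def split_622(items, seed=0):
    """Shuffle a copy of `items` and split it into train/dev/test at 6:2:2."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding on the boundaries.
    n_train = len(shuffled) * 6 // 10
    n_dev = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # the remainder goes to the test set
    return train, dev, test
```

With 100 instances, this yields 60 training, 20 development, and 20 test instances, and every instance lands in exactly one set.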
In our experiments, we use CNN and BERT as the extractor. The CNN architecture is the same as the one used in RSNs, and the pre-trained language model comes from Hugging Face transformers.
- On FewRel:
  - MORE(CNN)
    python main_cmd.py --dataset fewrel
  - MORE(CNN)+VAT
    python main_cmd.py --dataset fewrel --VAT 1 --epoch_num 4 --warm_up 3 --power_iterations 1 --p_mult 0.03 --lambda_V 1
  - MORE(BERT)
    python main_cmd.py --dataset fewrel --learning_rate 0.00001 --batch_num 1000 --BERT 1
- On NYT+FB-sup:
  - MORE(CNN)
    python main_cmd.py --dataset nyt
  - MORE(CNN)+VAT
    python main_cmd.py --dataset nyt --VAT 1 --epoch_num 6 --warm_up 4 --power_iterations 1 --p_mult 0.5 --lambda_V 1.5
  - MORE(BERT)
    python main_cmd.py --dataset nyt --learning_rate 0.00001 --batch_num 1000 --BERT 1
Note that if you have enough computing resources, you can try MORE(BERT)+VAT (we did not report this result in the paper because of GPU memory limitations):
python main_cmd.py --dataset fewrel --VAT 1 --epoch_num 4 --warm_up 0 --power_iterations 1 --p_mult 0.03 --lambda_V 1 --learning_rate 0.00001 --batch_num 1000 --BERT 1
python main_cmd.py --dataset nyt --VAT 1 --epoch_num 4 --warm_up 0 --power_iterations 1 --p_mult 0.5 --lambda_V 1.5 --learning_rate 0.00001 --batch_num 1000 --BERT 1
- Optimize virtual adversarial training.
- Complete MORE(BERT)+VAT.