/FL-MSRE

A Few-Shot Learning based Approach to Multimodal Social Relation Extraction

Primary LanguagePython

FL-MSRE

Code and data for our paper FL-MSRE: A Few-Shot Learning based Approach to Multimodal Social Relation Extraction (AAAI 2021)

Our code is based on FewRel and FaceNet-pytorch

Prerequisites

  • Pytorch 1.0 or higher, with NVIDIA CUDA Support
  • Python 3.6
  • pillow 7.1.2
  • transformers 3.0.2
  • scikit-learn 0.23.1
  • scipy 1.2.1
  • torchvision 0.4.2

Dataset preparation

We constructed three datasets for multimodal social relation extraction. To replicate the experiments, you need to prepare your dataset as the following. Taking the FC-TF dataset as an example.

FC-TF
├── imgs
│   ├── hlm_xxxx_xxxx.jpg
│   ├── ……
│   └── xyj_xxxx_xxxx.jpg
├── entity_pair_in_img.json
├── img_info.json
├── train_data.json
├── val_data.json
└── test_data.json

You can download all images, and put them under the corresponding datasets for the website.

Pre-trained Checkpoint

Due to the large size, FaceNet and Bert pre-trained checkpoint are not included.

Please download the FaceNet pre-trained checkpoint and Bert pre-trained checkpoint here, and put them in the pretrained under the root.

Here we provide all checkpoints of FL-MSRE.

Code Structure

FL-MSRE
├── datasets
│   ├── FC-TF
│   ├── OM-TF
│   └── DRC-TF
├── fewshot_re_kit
├── models
├── pretrained
│   ├── vggface2
│   	├── 20180402-114759-vggface2-features.pt
│   	└──20180402-114759-vggface2-logits.pt
│   └──  bert_base_chinese
│   	├── config.json
│   	├── pytorch_model.bin
│   	└── vocab.txt
└── train.py
  • train.py: Program Runner
  • fewshot_re_kit
    • face_encoder.py: Face encoder, based on FaceNet
    • sentence_encoder.py: Sentence encoder, based on Bert
    • framework.py: Framework model of FL-MSRE model
  • models
    • proto.py: Prototypcial Network

Training

To train the baseline model with 3-way-1-shot:

python train.py --trainN 3 --N 3 --K 1 --Q 1 --model proto --encoder bert --hidden_size 768 --val_step 10000 --batch_size 2 --pretrain_ckpt {PRETRAINED_CKPT_PATH} --train_iter 20000 --grad_iter 4 --root_data {DATASET_PATH} --multi_choose 5

To train the FL-MSRE model with 3-way-1-shot, using the face images of entity pair from the same image:

python train.py --trainN 3 --N 3 --K 1 --Q 1 --model proto --encoder bert --hidden_size 768 --val_step 10000 --batch_size 2 --pretrain_ckpt {PRETRAINED_CKPT_PATH} --train_iter 20000 --grad_iter 4 --root_data {DATASET_PATH} --multi_choose 5 --use_img

The --use_img specifies that combing face images information for social relation extraction.

To train the FL-MSRE model with 3-way-1-shot, using the face images of the entity pair from different images:

python train.py --trainN 3 --N 3 --K 1 --Q 1 --model proto --encoder bert --hidden_size 768 --val_step 10000 --batch_size 2 --pretrain_ckpt {PRETRAINED_CKPT_PATH} --train_iter 20000 --grad_iter 4 --root_data {DATASET_PATH} --multi_choose 5 --use_img --differ_scene

The --differ_scene specifies that using the face images of the entity pair from different images.

Inference

After training, you can evaluate the baseline model with 3-way-1-shot:

python train.py --trainN 3 --N 3  --K 3 --Q 1 --model proto --encoder bert --hidden_size 768 --batch_size 2 --pretrain_ckpt {PRETRAINED_CKPT_PATH} --test_iter 10000 --load_ckpt {CKPT_PATH} --root_data {DATASET_PATH} --multi_choose 5 --only_test --use_img --differ_scene

The --only_test specifies that only evaluating the model.

You can also evaluate the FL-MSRE model with 3-way-1-shot:

python train.py --trainN 3 --N 3  --K 3 --Q 1 --model proto --encoder bert --hidden_size 768 --batch_size 2 --pretrain_ckpt {PRETRAINED_CKPT_PATH} --test_iter 10000 --load_ckpt {CKPT_PATH} --root_data {DATASET_PATH} --multi_choose 5 --use_img --only_test
python train.py --trainN 3 --N 3  --K 3 --Q 1 --model proto --encoder bert --hidden_size 768 --batch_size 2 --pretrain_ckpt {PRETRAINED_CKPT_PATH} --test_iter 10000 --load_ckpt {CKPT_PATH} --root_data {DATASET_PATH} --multi_choose 5 --use_img --differ_scene --only_test