[ACCV 2022] An official implementation of the paper MGTR: End-to-end Mutual Gaze Detection with Transformer.
- Python >= 3.6 (Anaconda is recommended)
- PyTorch >= 1.7.1
- TorchVision >= 0.8.2
- NVIDIA GPU + CUDA
- opencv-python >= 4.5.1
| Model | AVA-LAEO | UCO-LAEO |
|-------|----------|----------|
| MGTR  | 66.2     | 64.8     |
- Clone this GitHub repo: `git clone git@github.com:Gmbition/MGTR.git`, then `cd MGTR`.
- Download the mutual gaze datasets from Baidu Drive (coming soon~).
- Download our trained model from here and move it to `./data/mgtr_pretrained` (you need to create this new `mgtr_pretrained` directory first).
- Run testing for MGTR: `python3 test.py --backbone=resnet50 --batch_size=1 --log_dir=./ --model_path=your_model_path`
- The visualization results (if `save_image = True` is set) will be stored in `./log`.
We annotate each mutual gaze instance in an image as a dict, and the annotations are stored in `./data`. There are four annotation JSON files, covering AVA-LAEO and UCO-LAEO training and testing respectively. The format of one mutual gaze instance annotation is as follows:
{
"file_name": "scence_name/image.jpg",
"width": width of the image,
"height": height of the image,
"gt_bboxes": [{"tag": 1,
"box": a list containing the [x,y,w,h] of the box},
...],
"laeo": [{"person_1": the idx of person1,
"person_2": the idx of person2,
"interaction": whether looking at each other}]
}
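
For reference, here is a minimal sketch of iterating over such annotations in Python. The file name, the assumption that each JSON file holds a list of these per-image dicts, and the assumption that `person_1`/`person_2` index into `gt_bboxes` are illustrative only, not guaranteed by this repo:

```python
import json

# Hypothetical file name -- the four JSON files under ./data may be named differently.
ANN_PATH = "./data/uco_laeo_test.json"

with open(ANN_PATH, "r") as f:
    # Assumed top-level layout: a list of per-image dicts in the format shown above.
    annotations = json.load(f)

for ann in annotations:
    boxes = [b["box"] for b in ann["gt_bboxes"]]  # each box is [x, y, w, h]
    for pair in ann["laeo"]:
        i, j = pair["person_1"], pair["person_2"]  # assumed to index into gt_bboxes
        mutual = pair["interaction"]               # whether the two people look at each other
        print(f'{ann["file_name"]}: persons {i} & {j} -> mutual gaze = {mutual}')
```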
We sincerely thank the cool work by very cool people 😎: DETR and HoiTransformer.