
MGTR


[ACCV 2022] An official implementation of the paper MGTR: End-to-end Mutual Gaze Detection with Transformer.

📑 Dependencies

💕 Performance (mAP)

| Model | AVA-LAEO | UCO-LAEO |
|:-----:|:--------:|:--------:|
| MGTR  |   66.2   |   64.8   |

👀 Visualization

😀 Quick Start

  1. Clone this GitHub repo.

    git clone git@github.com:Gmbition/MGTR.git
    cd MGTR
    
  2. Download the mutual gaze datasets from Baidu Drive and put the annotation JSON files into ./data.

  3. Download our trained model from here and move it to ./data/mgtr_pretrained (you need to create this mgtr_pretrained directory first).

  4. Run testing for MGTR.

    python3 test.py --backbone=resnet50 --batch_size=1 --log_dir=./ --model_path=your_model_path
    
  5. The visualization results (if save_image is set to True) will be saved in ./log.

📖 Annotations

We annotate each mutual gaze instance in an image as a dict, and the annotations are stored in ./data. There are four annotation JSON files, covering AVA-LAEO and UCO-LAEO training and testing respectively. The format of one mutual gaze instance annotation is as follows:

{
"file_name": "scence_name/image.jpg",
"width": width of the image,
"height": height of the image, 
"gt_bboxes": [{"tag": 1, 
               "box": a list containing the [x,y,w,h] of the box},
               ...],
"laeo": [{"person_1": the idx of person1, 
          "person_2": the idx of person2, 
          "interaction": whether looking at each other}]
}
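
For reference, below is a minimal Python sketch of how such an annotation file could be loaded and iterated. The file name ava_laeo_test.json and the assumption that each JSON file holds a list of per-image dicts are illustrative only; adjust them to the actual files in ./data.

    import json

    # Hypothetical file name; use one of the actual annotation files in ./data.
    with open("./data/ava_laeo_test.json") as f:
        annotations = json.load(f)  # assumed to be a list of per-image dicts

    for ann in annotations:
        boxes = ann["gt_bboxes"]  # each entry: {"tag": ..., "box": [x, y, w, h]}
        for pair in ann["laeo"]:
            box_1 = boxes[pair["person_1"]]["box"]
            box_2 = boxes[pair["person_2"]]["box"]
            looking = pair["interaction"]  # whether the two people look at each other
            print(ann["file_name"], box_1, box_2, looking)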

📘 Citation

Please cite us if this work is helpful to you.

@inproceedings{guo2022mgtr,
  title={MGTR: End-to-End Mutual Gaze Detection with Transformer},
  author={Guo, Hang and Hu, Zhengxi and Liu, Jingtai},
  booktitle={Proceedings of the Asian Conference on Computer Vision},
  pages={1590--1605},
  year={2022}
}

😊 Acknowledgement

We sincerely thank the authors of DETR and HoiTransformer for their cool work 😎.