Appearance-based Driver 3D Gaze Estimation Using GRM and Mixed Loss Strategies

Primary language: Python. License: Apache License 2.0 (Apache-2.0).



Inference results on the DMD dataset:


Inference results on the driving simulator:



Two projects are provided: one for leave-one-person-out evaluation and one for evaluation on a common train-test split. They share the same architecture but have different train.py and test.py files.

Each project contains the following files and folders.

  • model.py, the model code.
  • train.py, the entry point for training.
  • test.py, the entry point for testing.
  • config/, the experiment configs for each dataset. To run our code, you should write your own config.yaml.
  • reader/, the data loader code. You can use the provided reader or write your own.

Getting Started

For training, you should change the following config entries:

  1. train.save.save_path, the output directory. The model is saved in $save_path$/checkpoint/.
  2. train.data.image, the path of the images. Please use the provided data processing code to generate them.
  3. train.data.label, the path of the labels.
  4. reader, the reader to use. It is the filename in the reader folder without the extension, e.g., reader/reader_mpii.py ==> reader: reader_mpii.
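As a rough illustration of how these entries fit together, the sketch below models the config as a nested Python dictionary. The key names come from the list above; the paths are placeholders, the placement of `reader` at the top level is an assumption, and the helper functions are hypothetical rather than part of the repository.

```python
import os
from pathlib import Path

# Hypothetical config contents. Key names follow the list above;
# all paths are placeholders that you would replace with your own.
config = {
    "train": {
        "save": {"save_path": "/path/to/experiment"},
        "data": {
            "image": "/path/to/processed/Image",
            "label": "/path/to/processed/Label",
        },
    },
    # Assumed to be a top-level key, since the list names it without a prefix.
    "reader": "reader_mpii",
}


def checkpoint_dir(cfg):
    """Return the directory where the model is saved: <save_path>/checkpoint."""
    return os.path.join(cfg["train"]["save"]["save_path"], "checkpoint")


def reader_name(reader_file):
    """Map a file in the reader folder to the value expected by the `reader` key."""
    return Path(reader_file).stem


print(checkpoint_dir(config))
print(reader_name("reader/reader_mpii.py"))
```

The same structure would normally be written in config.yaml; the dictionary form is used here only to keep the example self-contained.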

For testing, you should change the following config entries:

  1. test.load.load_path, usually the same as train.save.save_path. The test result is saved in $load_path$/evaluation/.
  2. test.data.image, usually the same as train.data.image.
  3. test.data.label, usually the same as train.data.label.
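Since the test entries usually mirror the train entries, a small consistency check can catch copy-paste mistakes before a long run. The following is a minimal sketch with a hypothetical helper and placeholder paths; it is not part of the repository, and a mismatch is not necessarily an error.

```python
import os

# Placeholder config; only the key names are taken from the lists above.
config = {
    "train": {
        "save": {"save_path": "/path/to/experiment"},
        "data": {"image": "/data/Image", "label": "/data/Label"},
    },
    "test": {
        "load": {"load_path": "/path/to/experiment"},
        "data": {"image": "/data/Image", "label": "/data/Label"},
    },
}


def evaluation_dir(cfg):
    """Return the directory where test results are saved: <load_path>/evaluation."""
    return os.path.join(cfg["test"]["load"]["load_path"], "evaluation")


def config_mismatches(cfg):
    """List the test entries that differ from their usual train counterparts."""
    pairs = [
        ("test.load.load_path", cfg["test"]["load"]["load_path"],
         "train.save.save_path", cfg["train"]["save"]["save_path"]),
        ("test.data.image", cfg["test"]["data"]["image"],
         "train.data.image", cfg["train"]["data"]["image"]),
        ("test.data.label", cfg["test"]["data"]["label"],
         "train.data.label", cfg["train"]["data"]["label"]),
    ]
    return [f"{t_key} differs from {s_key}"
            for t_key, t_val, s_key, s_val in pairs if t_val != s_val]


print(evaluation_dir(config))
print(config_mismatches(config))
```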

Data preprocessing

Data preprocessing prepares the data referenced in step 2 of the training setup in Getting Started.

1. Preprocessing the MPIIFaceGaze dataset. Download the MPIIFaceGaze dataset first. The code contains the following parameters:

root = "/home/cyh/dataset/Original/MPIIFaceGaze"
sample_root = "/home/cyh/dataset/Original/MPIIGaze/Origin/Evaluation Subset/sample list for eye image"
out_root = "/home/cyh/dataset/EyeBased/MPIIGaze"

The root is the path of MPIIFaceGaze. The sample_root is the sample list in MPIIGaze. Note that this sample list is not included in MPIIFaceGaze; you need to download MPIIGaze to obtain it. The out_root is the path for saving the result. To use the code, set these three parameters first, then run:

cd data processing
python data_processing_mpii.py
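Because the script depends on three hand-edited paths, it can help to confirm that the two input locations exist before launching it. This is a minimal sketch with a hypothetical helper; the paths are placeholders and the check is not part of the provided preprocessing code.

```python
import os


def missing_inputs(root, sample_root):
    """Return the input paths that do not exist on disk."""
    return [p for p in (root, sample_root) if not os.path.exists(p)]


# Placeholder values; use the same paths you set in data_processing_mpii.py.
root = "/path/to/MPIIFaceGaze"
sample_root = "/path/to/MPIIGaze/sample list for eye image"

for path in missing_inputs(root, sample_root):
    print(f"Missing input path: {path}")
```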

2. Preprocessing the Gaze360 dataset. Download the Gaze360 dataset first.

The code contains the following parameters:

root = "/home/cyh/dataset/Original/Gaze360/"
out_root = "/home/cyh/dataset/FaceBased/Gaze360"

The root is the path of the original Gaze360 dataset. The out_root is the path for saving the result file. To use the code, set these two parameters first, then run:

cd data processing
python data_processing_gaze360.py

We are grateful to GazeHub@Phi-ai Lab for their contributions to the data preprocessing.


The requirements are listed in the requirements.txt file. To set up your own environment, for example, run:

pip install -r requirements.txt


To train on the MPIIFaceGaze dataset, run one of the following in the Leaveout folder:

cd mp2/Leaveout
python train.py config/config_mpii.yaml 0


cd mp2/Leaveout
bash run.sh train.py config/config_mpii.yaml
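For readers unfamiliar with leave-one-person-out evaluation, the sketch below shows the general idea of one fold: one subject's labels are held out for testing and the rest are used for training. It assumes the trailing `0` in the command above is the index of the held-out subject and that there is one label file per subject; the file names and the helper are hypothetical, and the repository's reader may implement the split differently.

```python
def lopo_split(label_files, test_index):
    """Split per-subject label files into (train, test) for one fold."""
    files = sorted(label_files)
    test = [files[test_index]]
    train = files[:test_index] + files[test_index + 1:]
    return train, test


# Hypothetical per-subject label files for the 15 MPIIFaceGaze subjects.
labels = [f"p{i:02d}.label" for i in range(15)]

train, test = lopo_split(labels, 0)
print(len(train), test)

# A full leave-one-person-out run repeats this for every subject index.
folds = [lopo_split(labels, i) for i in range(len(labels))]
```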

To train on the Gaze360 dataset, run in the Traintest folder:

cd Gaze360/Traintest
python train.py config/config_mpii.yaml


To test on the MPIIFaceGaze dataset, run one of the following in the Leaveout folder. Because training and testing on MPIIFaceGaze follow the leave-one-person-out (LOPO) strategy, I suggest saving a checkpoint at every epoch for each of the 15 labels (held-out subjects) and testing all of them to get better test results.

cd mp2/Leaveout
python test.py config/config_mpii.yaml 0


bash run.sh test.py config/config_mpii.yaml
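If you test several checkpoints per fold as suggested above, the per-fold results need to be combined somehow. The sketch below shows one way to do this, assuming you have collected the test error of each checkpoint into a nested dictionary yourself; the helpers are hypothetical and the numbers are made up purely for illustration, not results from this work.

```python
def best_epochs(errors):
    """For each fold, return the epoch whose checkpoint has the lowest error."""
    return {fold: min(per_epoch, key=per_epoch.get)
            for fold, per_epoch in errors.items()}


def mean_best_error(errors):
    """Average the lowest per-fold errors across all folds."""
    best = best_epochs(errors)
    return sum(errors[fold][epoch] for fold, epoch in best.items()) / len(best)


# Illustrative placeholder values: errors[fold][epoch] = error in degrees.
errors = {
    0: {10: 5.0, 20: 4.0},
    1: {10: 3.0, 20: 6.0},
}

print(best_epochs(errors))
print(mean_best_error(errors))
```

Note that choosing the checkpoint by its test error makes the reported number optimistic, so it is worth stating that the best epoch was selected this way when you report results.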

To test on the Gaze360 dataset, run in the Traintest folder:

python test.py config/config_mpii.yaml


One of the models from this work, Iter_60_mp2, is publicly available and can be used directly for simple inference.

For inference on videos or images, please run:

cd inference
python inference-1.py

For inference using the local camera, please run:

cd inference
python inference-2.py


After training or testing, you can find the results under the save_path set in config_mpii.yaml.
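The format of the saved results is not described here. Assuming they contain predicted and ground-truth 3D gaze vectors, the angular error commonly used to evaluate 3D gaze estimation can be computed as in the sketch below. The function is a hypothetical helper, not code from this repository.

```python
import math


def angular_error_deg(pred, gt):
    """Angle in degrees between a predicted and a ground-truth 3D gaze vector."""
    dot = sum(p * g for p, g in zip(pred, gt))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(g * g for g in gt))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos))


def mean_angular_error_deg(preds, gts):
    """Mean angular error in degrees over paired lists of 3D vectors."""
    errs = [angular_error_deg(p, g) for p, g in zip(preds, gts)]
    return sum(errs) / len(errs)


print(angular_error_deg((0.0, 0.0, -1.0), (0.0, 0.0, -1.0)))
```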


Our work builds on Swin Transformer, Gaze360 and Fullface. We thank the authors of these open-source repositories.

Please follow their outstanding work:

  • Liu, Ze; Lin, Yutong; Cao, Yue; Hu, Han; Wei, Yixuan; Zhang, Zheng; Lin, Stephen; Guo, Baining. "Swin transformer: Hierarchical vision transformer using shifted windows." Proceedings of the IEEE/CVF International Conference on Computer Vision.
  • Cheng, Yihua; Wang, Haofei; Bao, Yiwei; Lu, Feng. "Appearance-based Gaze Estimation With Deep Learning: A Review and Benchmark." arXiv preprint arXiv:2104.12668.
  • Kellnhofer, Petr; Recasens, Adria; Stent, Simon; Matusik, Wojciech; Torralba, Antonio. "Gaze360: Physically Unconstrained Gaze Estimation in the Wild." The IEEE International Conference on Computer Vision (ICCV), October 2019.
  • Zhang, Xucong; Sugano, Yusuke; Fritz, Mario; Bulling, Andreas. "It’s written all over your face: Full-face appearance-based gaze estimation." The IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).