/xR-EgoPose

New egocentric synthetic dataset for egocentric 3D human pose estimation

Primary LanguagePythonOtherNOASSERTION

xR-EgoPose

The xR-EgoPose Dataset has been introduced in the paper "xR-EgoPose: Egocentric 3D Human Pose from an HMD Camera" (ICCV 2019, oral). It is a dataset of ~380 thousand photo-realistic egocentric camera images in a variety of indoor and outdoor spaces.

img

The code contained in this repository is a PyTorch implementation of the data loader with additional evaluation functions for comparison.

Citation

@inproceedings{tome2019xr,
  title={xR-EgoPose: Egocentric 3D Human Pose from an HMD Camera},
  author={Tome, Denis and Peluse, Patrick and Agapito, Lourdes and Badino, Hernan},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={7728--7738},
  year={2019}
}
@ARTICLE{tome2020self,
  author={D. {Tome} and T. {Alldieck} and P. {Peluse} and G. {Pons-Moll} and L. {Agapito} and H. {Badino} and F. {De la Torre}},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={SelfPose: 3D Egocentric Pose Estimation from a Headset Mounted Camera},
  year={2020},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TPAMI.2020.3029700}
}

The license agreement for the data usage implies citation of the paper. Please notice that citing the dataset URL instead of the publication would not be compliant with this license agreement.

Download on Mac OS and Linux

Make sure pigz and wget are installed:

# on Mac OS
brew install wget pigz
# on Ubuntu
sudo apt-get install pigz

To download and decompress the dataset use the download.sh script:

./download.sh

which will dowload and set-up the dataset folder for training and testing the model. Make sure to have ~1TB free space for storing the data. After that, run demo.py. This shows how to load and evaluate the model.

xR-EgoPose Dataset

Character names in the dataset follow the convention gender_id_body-type_height

  • gender: male/female
  • id: integer
  • body-type: a/f (average/full)
  • height: a/s (average/short)
Train-set Test-set Val-set
female_001_a_a female_004_a_a male_008_a_a
female_002_a_a female_008_a_a
female_002_f_s female_010_a_a
female_003_a_a female_012_a_a
female_005_a_a female_012_f_s
female_006_a_a male_001_a_a
female_007_a_a male_002_a_a
female_009_a_a male_004_f_s
female_011_a_a male_006_a_a
female_014_a_a male_007_f_s
female_015_a_a male_010_a_a
male_003_f_s male_014_f_s
male_004_a_a
male_005_a_a
male_006_f_s
male_007_a_a
male_008_f_s
male_009_a_a
male_010_f_s
male_011_f_s
male_014_a_a

Structure

For each set and for each character the structure is identical, and structured as follows

TrainSet
├── female_001_a_a
│   ├── env 01
│   │   └── cam_down
│   │   	├── depth
│   │   	├── json
│   │   	├── objectId
│   │   	├── rgba
│   │   	├── rot
│   │   	└── worldp
│   ├── ...
│   └── env 03
└── ...

Frame information is organized in different folders, each containing one file per frame

  • depth: 8-bit png per frame
  • json: json file with camera and pose information
  • objectId: semantic segmentation
  • rgba: 8-bit png per frame
  • rot: json file with joint rotations
  • worldp: world position per pixel

Actions

A set of nine broad action categories have been included in the dataset

Action Name
Gaming
Gesticulating
Greeting
Lower Stretching
Patting
Reacting
Talking
Upper Stretching
Walking

where each of those categories is the collection of many different and specific actions.

E.g. Gaming includes Boxing, Shooting Gun, Playing Golf, Playing Baseball just to cite a few.

Results

Action Martinez [1] Ours - single branch Ours - dual branch
Gaming 109.6 138.3 56.0
Gesticulating 105.4 108.5 50.2
Greeting 119.3 100.3 44.6
Lower Stretching 125.8 133.3 51.1
Patting 93.0 117.8 59.4
Reacting 119.7 175.6 60.8
Talking 111.1 93.5 43.9
Upper Stretching 124.5 129.0 53.9
Walking 130.5 131.9 57.7
All (mm) 122.1 130.4 58.2

[1] Julieta Martinez, Rayat Hossain, Javier Romero, and James JLittle. A simple yet effective baseline for 3d human pose estimation. In Proceedings of the International Conference on Computer Vision (ICCV), 2017

License

See the LICENSE file for details.