Awesome Egocentric Human Pose

A curated list of egocentric human body pose estimation works and related resources. While some repositories, such as awesome-egocentric-vision, compile studies across the wide field of egocentric vision, none focuses specifically on the niche area of egocentric human body pose estimation.

We organize the topic by capture setup:

Egocentric Inside-In Pose Estimation
Egocentric Inside-Out Pose Estimation
IMU-Based Egocentric Pose Estimation
Headset-Based Egocentric Pose Estimation
Third-Person View Egocentric Pose Estimation
Mixed Setup

Egocentric Inside-In Pose Estimation

The inside-in vision setup uses body-worn cameras or sensors directed toward the person or object of interest, so data is captured from the motion capture subject's own position rather than from an external viewpoint. This setup can be seen on the Oculus Quest 2 and Apple Vision Pro.

Training Datasets (bold means recommended to use)

Setup	Dataset	Number of Frames	Synthetic or Real	Number of Actors	Scene Annotation	FPS	Link
Monocular Fisheye	Mo2Cap2[2019-2]	530K	Synthetic	-	No	-	Link
	xR-egopose[2019-3]	252K Train + 16K Val	Synthetic	34	No	30	Link
	EgoPW[2022-1]	318K	Real (pseudo gt)	10	No	25	Link
	EgoPW-Scene[2023-1]	92K	Real (pseudo gt)	10	Pseudo Annotations	25	Link
	EgoWholeBody[2023-5]	700K	Synthetic	14	No	30	-
Stereo Perspective	EgoGlass[2021-3]	172K	Real	10	No	30	-
Stereo Fisheye	UnrealEgo[2022-2]	450K * 2 views	Synthetic	17	No	25	Link
	UnrealEgo2[2024-2]	1.25M * 2 views	Synthetic	17	Yes	25	-

Evaluation Datasets (bold means recommended to use)

Setup	Dataset	Number of Frames	Synthetic or Real	Scene Annotation	FPS	Dataset Link	Leader Board
Monocular Fisheye	Mo2Cap2[2019-2]	5K	Real	No	25	Link	-
	xR-egopose[2019-3]	115K	Synthetic	No	30	Link	-
	GlobalEgoMocap[2021-2]	318K	Real	No	25	Link	Paper With Code
	SceneEgo[2023-1]	28K	Real	Yes	25	Link	Paper With Code
	EgoWholeBody[2023-5]	133K	Synthetic	No	30	-	-
Stereo Fisheye	UnrealEgo[2022-2]	48K * 2 views	Synthetic	No	25	Link	Paper With Code
	UnrealEgo2[2024-2]	123K * 2 views	Synthetic	Yes	25	-	-
	UnrealEgo2-RW[2024-2]	130K * 2 views	Real	Yes	25	-	-
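
Benchmarks in this area commonly report the mean per-joint position error (MPJPE), usually in millimetres. The sketch below shows the basic computation only. The function name and the plain-tuple joint format are assumptions made for illustration, and the alignment steps that individual benchmarks apply before measuring (for example Procrustes alignment) are deliberately left out, so check each dataset's own evaluation code for the exact protocol.

```python
import math
from typing import Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, z), assumed here to be in millimetres


def mpjpe(pred: Sequence[Joint], gt: Sequence[Joint]) -> float:
    """Mean Euclidean distance between predicted and ground-truth joints.

    Hypothetical helper: both inputs list the same joints in the same order.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    # Per-joint 3D distance, averaged over all joints of one pose.
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Two joints that are off by 3 mm and 4 mm respectively.
print(mpjpe([(0, 0, 0), (1, 0, 0)], [(0, 0, 3), (1, 4, 0)]))  # 3.5
```

For a whole sequence, the per-pose value is typically averaged again over all frames.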

Papers

2019 and Before

Rhodin, Helge, et al. "EgoCap: Egocentric marker-less motion capture with two fisheye cameras." ACM Transactions on Graphics (TOG) 35.6 (2016): 1-11. [project page]
Xu, Weipeng, et al. "Mo2Cap2: Real-time mobile 3D motion capture with a cap-mounted fisheye camera." IEEE Transactions on Visualization and Computer Graphics 25.5 (2019): 2093-2101. [project page]
Tome, Denis, et al. "xR-EgoPose: Egocentric 3D human pose from an HMD camera." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019. [dataset]

2021

Zhang, Yahui, Shaodi You, and Theo Gevers. "Automatic calibration of the fisheye camera for egocentric 3D human pose estimation from a single image." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2021.
Wang, Jian, et al. "Estimating egocentric 3D human pose in global space." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021. [project page] [dataset] [demo]
Zhao, Dongxu, et al. "EgoGlass: Egocentric-view human pose estimation from an eyeglass frame." 2021 International Conference on 3D Vision (3DV). IEEE, 2021.

2022

Wang, Jian, et al. "Estimating egocentric 3D human pose in the wild with external weak supervision." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022. [project page] [dataset] [demo]
Akada, Hiroyasu, et al. "UnrealEgo: A new dataset for robust egocentric 3D human motion capture." European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022. [project page] [code] [dataset] [demo]
Park, Jinman, et al. "Building Spatio-temporal Transformers for Egocentric 3D Pose Estimation." arXiv preprint arXiv:2206.04785 (2022).
Liu, Yuxuan, et al. "Ego+X: An Egocentric Vision System for Global 3D Human Pose Estimation and Social Interaction Characterization." 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2022.

2023

Wang, Jian, et al. "Scene-aware Egocentric 3D Human Pose Estimation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023. [project page] [dataset] [code]
Liu, Yuxuan, et al. "EgoHMR: Egocentric Human Mesh Recovery via Hierarchical Latent Diffusion Model." 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023.
Liu, Yuxuan, et al. "EgoFish3D: Egocentric 3D Pose Estimation from a Fisheye Camera via Self-Supervised Learning." IEEE Transactions on Multimedia (2023).
Kang, Taeho, et al. "Ego3DPose: Capturing 3D Cues from Binocular Egocentric Views." SIGGRAPH Asia 2023 Conference Papers. 2023.
Wang, Jian, et al. "Egocentric Whole-Body Motion Capture with FisheyeViT and Diffusion-Based Motion Refinement." arXiv preprint arXiv:2311.16495 (2023).

2024

Cuevas-Velasquez, Hanz, et al. "SimpleEgo: Predicting Probabilistic Body Pose from Egocentric Cameras." arXiv preprint arXiv:2401.14785 (2024).
Akada, Hiroyasu, et al. "3D Human Pose Perception from Egocentric Stereo Videos." arXiv preprint arXiv:2401.00889 (2024).

Egocentric Inside-Out Pose Estimation

The inside-out vision setup employs cameras or sensors positioned on the person or device, looking outward at the environment. This approach is used in most virtual reality (VR) headsets and augmented reality (AR) systems, where cameras attached to the headset capture the user's surroundings and interpret motion relative to them.

Datasets

Papers

Ego-Body Pose Estimation via Ego-Head Pose Estimation - Jiaman Li, Karen Liu, and Jiajun Wu. In CVPR 2023.
You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions - Evonne Ng, Donglai Xiang, Hanbyul Joo, and Kristen Grauman. In CVPR 2020. [demo] [project page] [dataset] [code]
Ego-Pose Estimation and Forecasting as Real-Time PD Control - Ye Yuan and Kris Kitani. In ICCV 2019. [code] [project page] [demo]
Seeing Invisible Poses: Estimating 3D Body Pose from Egocentric Video - Hao Jiang and Kristen Grauman. In CVPR 2017.

IMU-Based Egocentric Pose Estimation

The Inertial Measurement Unit (IMU) setup utilizes sensors typically composed of accelerometers, gyroscopes, and sometimes magnetometers. In egocentric motion capture, IMUs are placed on the human body to capture dynamic motion and limb orientation changes.
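
As a rough illustration of how a body-worn IMU yields limb orientation changes, the sketch below integrates gyroscope readings into an orientation quaternion. The `ImuSample` type, the function names, and the units are assumptions made for this example and do not come from any particular dataset or paper. Gyroscope-only integration drifts over time, which is why practical systems also use the accelerometer and, where available, the magnetometer to correct it.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # unit quaternion (w, x, y, z)


@dataclass
class ImuSample:
    """One reading from a single IMU (hypothetical layout)."""
    accel: Vec3                 # linear acceleration, m/s^2
    gyro: Vec3                  # angular velocity in the sensor frame, rad/s
    mag: Optional[Vec3] = None  # magnetic field, only on some devices


def quat_mul(a: Quat, b: Quat) -> Quat:
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def integrate_gyro(q: Quat, gyro: Vec3, dt: float) -> Quat:
    """Advance orientation q by one time step, assuming constant angular velocity."""
    wx, wy, wz = gyro
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    angle = rate * dt
    if angle < 1e-12:
        return q  # no measurable rotation in this step
    s = math.sin(angle / 2.0) / rate  # scales gyro to (unit axis * sin(angle / 2))
    dq = (math.cos(angle / 2.0), wx * s, wy * s, wz * s)
    return quat_mul(q, dq)  # sensor-frame rate, so the increment is applied on the right


# Rotate at 90 degrees per second about the z axis for one second.
sample = ImuSample(accel=(0.0, 0.0, 9.81), gyro=(0.0, 0.0, math.pi / 2))
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(100):
    q = integrate_gyro(q, sample.gyro, dt=0.01)
```

After the loop, `q` represents a rotation of roughly 90 degrees about the z axis.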

Datasets

Papers

Headset-Based Egocentric Pose Estimation

Some methods use the headset 6-DoF pose (head pose) and the VR controller 6-DoF poses (hand poses) to estimate the full body pose. Because the head and hand poses come from headset SLAM and VR controller tracking, the input signal is much less noisy than in the IMU setup.
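
To make the input of this setup concrete, the sketch below packs one head pose and two hand poses into a single feature vector for one frame. The `Pose6DoF` type, the position-plus-quaternion encoding, and the resulting 21 values are assumptions chosen for illustration. Published methods use their own encodings and often add velocities or a window of past frames.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Pose6DoF:
    """Position and orientation of one tracked device (hypothetical layout)."""
    position: Tuple[float, float, float]         # metres, tracking-space frame
    rotation: Tuple[float, float, float, float]  # unit quaternion (w, x, y, z)

    def as_list(self) -> List[float]:
        return [*self.position, *self.rotation]


def sparse_tracker_input(head: Pose6DoF, left_hand: Pose6DoF,
                         right_hand: Pose6DoF) -> List[float]:
    """Concatenate the three tracked poses into one 21-value vector."""
    return head.as_list() + left_hand.as_list() + right_hand.as_list()


identity = (1.0, 0.0, 0.0, 0.0)
features = sparse_tracker_input(
    head=Pose6DoF((0.0, 1.7, 0.0), identity),
    left_hand=Pose6DoF((-0.3, 1.0, 0.2), identity),
    right_hand=Pose6DoF((0.3, 1.0, 0.2), identity),
)
print(len(features))  # 21
```

A full-body estimator in this setup would take such a vector, or a sequence of them, and regress the joints that no device tracks directly.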

Datasets

Papers

Third-Person View Egocentric Pose Estimation

The third-person setup refers to motion capture techniques in which another person carries moving cameras that observe the motion capture subject.

Datasets

Papers

EgoBody: Human Body Shape and Motion of Interacting People from Head-Mounted Devices - Siwei Zhang, Qianli Ma, Yan Zhang, Zhiyin Qian, Taein Kwon, Marc Pollefeys, Federica Bogo, Siyu Tang. In ECCV 2022. [project page] [dataset] [code]

Mixed Setup

Combinations of the aforementioned setups.
