
Scene-aware Egocentric 3D Human Pose Estimation

Official implementation of the paper "Scene-aware Egocentric 3D Human Pose Estimation":

Jian Wang, Diogo Luvizon, Weipeng Xu, Lingjie Liu, Kripasindhu Sarkar, Christian Theobalt

CVPR 2023

[Project Page] [SceneEgo Datasets (Test split)] [SceneEgo Datasets (Train split)] [EgoGTA] [EgoPW-Scene]

Demo image

Annotation format in Test dataset

The annotations of each sequence are saved in that sequence's "annotation.pkl". Load the pickle file with:

import pickle

with open('annotation.pkl', 'rb') as f:
    data = pickle.load(f)

The data is a Python list; each item is a Python dict containing the following annotations:

  • ext_id: the annotation id in the external multi-view mocap system;
  • calib_board_pose: the 6D pose of the calibration board on the head;
  • ego_pose_gt: the ground-truth human body pose in the egocentric camera coordinate system. The joint order is: Neck, Right Shoulder, Right Elbow, Right Wrist, Left Shoulder, Left Elbow, Left Wrist, Right Hip, Right Knee, Right Ankle, Right Toe, Left Hip, Left Knee, Left Ankle, Left Toe;
  • ext_pose_gt: the ground-truth human pose in the mocap system coordinate system;
  • image_name: the name of the image under the directory "imgs";
  • ego_camera_matrix: the 6D pose of the egocentric camera on the head.
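
The joint order listed for ego_pose_gt can be turned into a name lookup. The following is a minimal sketch, not part of the official code: the helper joints_by_name is hypothetical, and it assumes that ego_pose_gt is a sequence of 15 per-joint coordinates (for example a 15 x 3 array), which the README does not state explicitly.

```python
# Joint order as listed for ego_pose_gt in the annotation format.
JOINT_NAMES = [
    "Neck",
    "Right Shoulder", "Right Elbow", "Right Wrist",
    "Left Shoulder", "Left Elbow", "Left Wrist",
    "Right Hip", "Right Knee", "Right Ankle", "Right Toe",
    "Left Hip", "Left Knee", "Left Ankle", "Left Toe",
]


def joints_by_name(ego_pose_gt):
    """Map each joint name to its coordinate (hypothetical helper).

    Assumption: ego_pose_gt holds one entry per joint, in the order
    of JOINT_NAMES. Works for lists and for arrays indexed by row.
    """
    joints = list(ego_pose_gt)
    if len(joints) != len(JOINT_NAMES):
        raise ValueError(
            "expected %d joints, got %d" % (len(JOINT_NAMES), len(joints))
        )
    return dict(zip(JOINT_NAMES, joints))
```

With the annotation list loaded as data, joints_by_name(data[0]['ego_pose_gt'])['Left Wrist'] would then return the left wrist position of the first item, provided the shape assumption holds.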

The id of the egocentric frame can also be computed from ext_id using the synchronization file:

import json

with open('syn.json', 'r') as f:
    syn_data = json.load(f)

ego_start_frame = syn_data['ego']
ext_start_frame = syn_data['ext']
ext_id = data[0]['ext_id']  # 'ext_id' of any annotation item loaded above
ego_id = ext_id - ext_start_frame + ego_start_frame
egocentric_image_name = "img_%06d.jpg" % ego_id
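
The same conversion can be wrapped in a small function so it can be applied to every annotation item. This is only a sketch: the function name ego_image_name is hypothetical, and the start-frame values in the example are made up for illustration.

```python
def ego_image_name(ext_id, syn_data):
    """Return the egocentric image name for an external mocap frame id.

    syn_data is the dict loaded from syn.json, with the start frames
    stored under the keys 'ego' and 'ext'.
    """
    ego_id = ext_id - syn_data["ext"] + syn_data["ego"]
    return "img_%06d.jpg" % ego_id


# Illustrative values only, not taken from the dataset.
example_syn = {"ego": 150, "ext": 350}
print(ego_image_name(1200, example_syn))  # img_001000.jpg
```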


Installation

  1. Create a new anaconda environment:

conda create -n sceneego python=3.9
conda activate sceneego

  2. Install PyTorch 1.13.1 from https://pytorch.org/get-started/previous-versions/

  3. Install the other dependencies:

pip install -r requirements.txt

Run the demo

  1. Download the pre-trained pose estimation model and put it under models/sceneego/checkpoints

  2. Run:

python demo.py --config experiments/sceneego/test/sceneego.yaml --img_dir data/demo/imgs --depth_dir data/demo/depths --output_dir data/demo/out --vis True

The result is shown in the open3d visualizer, and the predicted pose is saved in data/demo/out.

  3. The predicted pose is saved as a pkl file (e.g. img_001000.jpg.pkl). To visualize the predicted result, run:

python visualize.py --img_path data/demo/imgs/img_001000.jpg --depth_path data/demo/depths/img_001000.jpg.exr --pose_path data/demo/out/img_001000.jpg.pkl

The result will be shown with the open3d visualizer.
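
To visualize several frames, the command above can be assembled programmatically. The sketch below is not part of the repository: build_visualize_command is a hypothetical helper, and it assumes the naming convention seen in the demo paths, where the depth map is named after the image with ".exr" appended and the pose file with ".pkl" appended.

```python
def build_visualize_command(
    image_name,
    img_dir="data/demo/imgs",
    depth_dir="data/demo/depths",
    pose_dir="data/demo/out",
):
    """Build the visualize.py argument list for one frame (hypothetical helper).

    Assumption: depth maps are stored as <image_name>.exr and predicted
    poses as <image_name>.pkl, as in the demo command.
    """
    return [
        "python", "visualize.py",
        "--img_path", f"{img_dir}/{image_name}",
        "--depth_path", f"{depth_dir}/{image_name}.exr",
        "--pose_path", f"{pose_dir}/{image_name}.pkl",
    ]


print(" ".join(build_visualize_command("img_001000.jpg")))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root to launch the visualizer for that frame.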

Test on your own dataset

If you want to test on your own dataset, after obtaining egocentric frames, you need to:

  1. Run the egocentric human body segmentation network to get the human body segmentation for each frame:

    See repo: Egocentric Human Body Segmentation

  2. Run the depth estimator to get the scene depth map for each frame:

    See repo: Egocentric Depth Estimator
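
Before running the demo on your own frames, it can help to check that every frame has a depth map. The following is a minimal sketch, assuming your dataset follows the demo's naming convention (a depth map named after the image with ".exr" appended); the helper frames_missing_depth and the directory names in the usage note are hypothetical.

```python
def frames_missing_depth(image_names, depth_names):
    """Return the sorted image names that have no matching depth map.

    Assumption: the depth map for "img_000001.jpg" is named
    "img_000001.jpg.exr", as in the demo data.
    """
    available = set(depth_names)
    return [
        name for name in sorted(image_names)
        if name + ".exr" not in available
    ]


# Illustrative file names only.
print(frames_missing_depth(
    ["img_000002.jpg", "img_000001.jpg"],
    ["img_000001.jpg.exr"],
))  # ['img_000002.jpg']
```

In practice the two lists could come from os.listdir on your image and depth directories, for example os.listdir("my_data/imgs") and os.listdir("my_data/depths").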


