
HARP: Personalized Hand Reconstruction from a Monocular RGB Video

CVPR 2023

Korrawe Karunratanakul, Sergey Prokudin, Otmar Hilliges, Siyu Tang
ETH Zurich



Updates

  • June 20, 2023: Initial release with sample preprocessed data.
  • July 3, 2023: Added instructions for processing new videos and all preprocessed data from subject one.

Running the code

Dependencies

The easiest way to run the code is to use conda.

The code is tested on Ubuntu 20.04 with Python 3.8 and 3.9.

Installation with python 3.9
  1. Create a conda env with python 3.9:

    conda create -n harp python=3.9 && conda activate harp
    
  2. Install requirements for pytorch3d version 0.6.2:

    conda install pytorch==1.11.0 torchvision==0.12.0 cudatoolkit=11.3 -c pytorch
    conda install -c fvcore -c iopath -c conda-forge fvcore iopath
    
  3. Install pytorch3d version 0.6.2:

    conda install pytorch3d=0.6.2 -c pytorch3d
    
  4. Install other packages:

    pip install -r requirements_reduce.txt
    

For other versions of Python and PyTorch, check the summary from mJones00 here.

NOTE: As the Python requirements for MeshTransformer and PyTorch3D differ, they need to be installed in separate conda environments.

Hand models

  • Download smplx and put it in ./hand_models/.
  • Replace ./hand_models/smplx/smplx/body_models.py and ./hand_models/smplx/smplx/__init__.py with our version in ./hand_models_harp/.
  • Download MANO and place it in the root directory as ./mano/ and ./manopth/.
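
To confirm that the MANO files are found, a short check such as the following can be used. This is a sketch, not part of the HARP code; in particular, the mano_root path (./mano/models/) is an assumption, so adjust it to wherever the MANO pickle files actually live in your layout.

    # Sketch: load the MANO layer from manopth and run it once with neutral parameters.
    # Assumption: the MANO pickles (e.g. MANO_RIGHT.pkl) are under ./mano/models/.
    import torch
    from manopth.manolayer import ManoLayer

    mano_layer = ManoLayer(mano_root='mano/models', side='right', use_pca=True, ncomps=45)
    pose = torch.zeros(1, 45 + 3)   # 3 global rotation values + 45 pose coefficients
    shape = torch.zeros(1, 10)      # MANO shape coefficients
    verts, joints = mano_layer(pose, shape)
    print(verts.shape, joints.shape)  # expected: (1, 778, 3) and (1, 21, 3)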

Avatar from preprocessed video

Preprocessed sequence

  • Download the sample preprocessed sequence from here.
  • Put the data in ../data/. The path can be changed in utils/config_utils.py.
  • The released data (from one subject, with different appearance variations) can be found here.

Running the optimization

To start optimizing the sequence from the coarse initialization, run:

python optmize_sequence.py

The output images are in the exp folder as set in config_utils.py.
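
For reference, the relevant settings look roughly like the sketch below. All variable names here are hypothetical placeholders, not the actual identifiers in utils/config_utils.py; check that file for the real configuration.

    # Hypothetical sketch of the configuration values to adjust (placeholder names,
    # not the real identifiers in utils/config_utils.py).
    data_root = "../data/"        # where the preprocessed sequences are expected
    sequence_name = "sample_seq"  # hypothetical name of the sequence folder to optimize
    exp_dir = "./exp/"            # output images and checkpoints are written here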

Processing new video

MeshTransformer Installation
  1. Install MeshTransformer following their repo
  2. Copy the following files from ./metro_modifications to replace the corresponding files in ./MeshTransformer/metro:
    ./MeshTransformer/metro/tools/end2end_inference_handmesh.py
    ./MeshTransformer/metro/hand_utils/hand_utils.py
    ./MeshTransformer/metro/utils/renderer.py
    
  3. Set the SMPLX_PATH in end2end_inference_handmesh.py
  4. Set the new sequence path at L150 of end2end_inference_handmesh.py.

Segmentation and fitting

  • Get the hand segmentation masks using Unscreen, RVM, or any other tool. We used RVM for the sample sequence from InterHand2.6M and Unscreen for the other cases.
  • Put them in the same structure as in the sample sequence.
  • For output from Unscreen, you can download the .gif file and then use ffmpeg to split both the original video and the .gif into frames:
    SEQ=name
    mkdir ${SEQ}
    mkdir ${SEQ}/image
    mkdir ${SEQ}/unscreen
    ffmpeg -i ${SEQ}.mp4 -vf fps=30 ${SEQ}/image/%04d.png
    ffmpeg -i ${SEQ}.gif -vsync 0 ${SEQ}/unscreen/%04d.png
    
    • The end2end_inference_handmesh script has an option to convert the empty background from Unscreen into a white background with the --do_crop flag.
  • Run METRO to get the initial hand mesh
    python ./metro/tools/end2end_inference_handmesh.py --resume_checkpoint ./models/metro_release/metro_hand_state_dict.bin --image_file_or_path PATH --do_crop
    
  • (Run by default) Fit the hand model to the METRO output. This step is needed because METRO only predicts vertex locations; see the fitting sketch after this list.
  • Change the path in utils/config_utils.py and run python optmize_sequence.py.
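
The fitting step recovers MANO pose and shape parameters from the 778 predicted vertices. The following is only an illustrative sketch of that idea, not the fitting code used in this repo: it assumes the manopth ManoLayer, a plain L2 vertex loss, and Adam, whereas real fitting typically adds regularizers and a better initialization.

    # Sketch: fit MANO parameters to predicted hand vertices of shape (1, 778, 3).
    # Illustration only; the actual fitting in this repo may use different losses and priors.
    import torch
    from manopth.manolayer import ManoLayer

    def fit_mano_to_vertices(target_verts, n_iters=500, lr=1e-2):
        mano_layer = ManoLayer(mano_root='mano/models', side='right', use_pca=True, ncomps=45)
        pose = torch.zeros(1, 48, requires_grad=True)    # 3 global rotation + 45 pose coefficients
        shape = torch.zeros(1, 10, requires_grad=True)   # MANO shape coefficients
        trans = torch.zeros(1, 3, requires_grad=True)    # global translation
        optimizer = torch.optim.Adam([pose, shape, trans], lr=lr)
        for _ in range(n_iters):
            optimizer.zero_grad()
            verts, _ = mano_layer(pose, shape)
            verts = verts / 1000.0 + trans               # manopth outputs are in millimeters
            loss = ((verts - target_verts) ** 2).mean()  # plain L2 vertex loss
            loss.backward()
            optimizer.step()
        return pose.detach(), shape.detach(), trans.detach()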

BibTeX

@inproceedings{karunratanakul2023harp,
  author    = {Karunratanakul, Korrawe and Prokudin, Sergey and Hilliges, Otmar and Tang, Siyu},
  title     = {HARP: Personalized Hand Reconstruction from a Monocular RGB Video},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2023},
}

Code References

Parts of the code are based on the following repositories. Please consider citing them in the relevant context:

  • The body_models module is built on top of the base class from SMPLX, which falls under their license.
  • The renderer is based on the renderer from PyTorch3D.
  • The MANO model implementation from Yana Hasson.
  • The hand fitting code from Grasping Field.