This is not an officially supported Google product.
Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery from Sparse Image Ensemble (CVPR 2023)
Project Page | Video | Paper
Implementation for Hi-LASSIE. A novel method which discovers 3D skeleton and detailed shapes of articulated animal bodies from sparse unannotated images in-the-wild.
Chun-Han Yao1, Wei-Chih Hung2, Yuanzhen Li3, Michael Rubinstein3, Ming-Hsuan Yang134
, Varun Jampani3
1UC Merced, 2Waymo, 3Google Research, 4Yonsei University
This repo is largely based on LASSIE. A python virtual environment is used for dependency management. The code is tested with Python 3.7, PyTorch 1.11.0, CUDA 11.3. First, to install PyTorch in the virtual environment, run:
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
Then, install other required packages by running:
pip install -r requirements.txt
- Download LASSIE images following here and place them in
data/lassie/images/
. - Download LASSIE annotations following here and place them in
data/lassie/annotations/
. - Preprocess LASSIE images and extract DINO features of an animal class (e.g. zebra) by running:
python preprocess_lassie.py --cls zebra
To accelerate feature clustering, try setting number of threads lower (e.g. OMP_NUM_THREADS=4
).
- Download Pascal images here and place them in
data/pascal_part/JPEGImages/
. - Download Pascal-part annotations here and place them in
data/pascal_part/Annotations_Part/
. - Download Pascal-part image sets following here and place them in
data/pascal_part/image-sets/
. - Preprocess Pascal-part images and extract DINO features of an animal class (e.g. horse) by running:
python preprocess_pascal.py --cls horse
After preprocessing the input data, we extract a 3D skeleton from a specified reference image in the ensemble. For instance, run the following to use the 5-th instance as reference:
python extract_skeleton.py --cls zebra --idx 5
To obtain better 3D outputs, we recommend selecting an instance where most body parts are visible (e.g. clear side-view). One can also see the DINO feature clustering results in results/zebra/
to select a good reference or try running the optimization with different skeletons.
To run Hi-LASSIE optimization on all images in an ensemble jointly, first run:
python train.py --cls zebra
After the joint optimization, we perform instance-specific fine-tuning on a particular instance (e.g. 0) by:
python train.py --cls zebra --inst True --idx 0
The qualitative results can be found in results/zebra/
. The optimization settings can be changed in main/config.py
. For instance, one can reduce the rendering resolution by setting input_size
if out of memory.
Once optimization on all instances is completed, quantitative evaluation can be done by running:
python eval.py --cls zebra
For the animal classes in LASSIE dataset, we report the keypoint transfer accuracy (PCK). For Pascal-part animals, we further calculate the 2D IoU against ground-truth masks.
@inproceedings{yao2023hi-lassie,
title = {Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery from Sparse Image Ensemble},
author = {Yao, Chun-Han and Hung, Wei-Chih and Li, Yuanzhen and Rubinstein, Michael and Yang, Ming-Hsuan and Jampani, Varun},
booktitle = {CVPR},
year = {2023},
}