For better and more robust reconstruction of quadruped animals and humans, please check out BANMo.
- 07/31/22: Use a larger laplacian smoothness loss by default.
- 05/22/22: Fix bug in flow rendering that causes self-intersection.
- 05/16/22: Fix flip bug in flow pre-computation.
```shell
conda env create -f viser.yml
conda activate viser-release
# install SoftRas
cd third_party/softras; python setup.py install; cd -;
# install Manifold for remeshing
git clone --recursive git://github.com/hjwdzh/Manifold; cd Manifold; mkdir build; cd build; cmake .. -DCMAKE_BUILD_TYPE=Release; make -j8; cd ../../
```
Create folders to store intermediate data and training logs:
```shell
mkdir log; mkdir tmp;
```
Download the pre-processed data (RGB, mask, flow) following the link here and unzip it under `./database/DAVIS/`. The dataset is organized as:
```
DAVIS/
    Annotations/
        Full-Resolution/
            sequence-name/
                {%05d}.png
    JPEGImages/
        Full-Resolution/
            sequence-name/
                {%05d}.jpg
    FlowBW/ and FlowFW/
        Full-Resolution/
            sequence-name/ and optionally sequence-name_{%02d}/ (frame interval)
                flo-{%05d}.pfm
                occ-{%05d}.pfm
                visflo-{%05d}.jpg
                warp-{%05d}.jpg
```
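The flow and occlusion files above use the PFM format. As a minimal sketch of how such files can be read (this is not the repo's own loader; the helper name `read_pfm` is ours):

```python
import numpy as np

def read_pfm(path):
    """Read a PFM file ('Pf' grayscale or 'PF' color) into a numpy array."""
    with open(path, "rb") as f:
        header = f.readline().decode("ascii").rstrip()
        if header == "PF":
            channels = 3
        elif header == "Pf":
            channels = 1
        else:
            raise ValueError("Not a PFM file: %r" % header)
        width, height = map(int, f.readline().decode("ascii").split())
        scale = float(f.readline().decode("ascii").rstrip())
        endian = "<" if scale < 0 else ">"  # negative scale => little-endian
        data = np.frombuffer(f.read(), dtype=endian + "f4")
        data = data.reshape(height, width, channels)
        # PFM stores rows bottom-to-top; flip to the usual top-to-bottom order
        return np.flipud(data).copy()
```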
To run preprocessing scripts on other videos, see here.
Run:
```shell
bash scripts/breakdance-flare.sh
```
To monitor optimization, run:
```shell
tensorboard --logdir log/
```
To render the optimized breakdance-flare:
```shell
bash scripts/render_result.sh breakdance-flare log/breakdance-flare-1003-ft2/pred_net_20.pth 36
```
Example outputs:
To optimize dance-twirl, check out `scripts/dance-twirl.sh`.
Run:
```shell
bash scripts/elephants.sh
```
To monitor optimization, run:
```shell
tensorboard --logdir log/
```
To render the optimized shapes:
```shell
bash scripts/render_elephants.sh log/elephant-walk-1003-6/pred_net_10.pth 36
```
Example outputs:
elephant-walk-all.mp4
elephant0009-all.mp4
elephant0058-all.mp4
Download sample results:
```shell
wget https://www.dropbox.com/s/4bne43yxp89aleu/breakdance-results.zip
unzip breakdance-results.zip
```
Run evaluation:
```shell
python eval_pck.py --testdir log/breakdance-flare-viser/ --seqname breakdance-flare --type mesh
```
This should report a PCK of 70.52% (Tab. 1 of the paper, break-1).
To evaluate on other sequences, change `$seqname` to {breakdance, dance-twirl, parkour}, etc.
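The PCK metric reported above counts a predicted keypoint as correct if it lies within a threshold of the ground truth. As an illustration of the metric itself (not `eval_pck.py`'s exact thresholding, which may differ — e.g. in how the threshold is derived from the bounding box):

```python
import numpy as np

def pck(pred, gt, bbox, alpha=0.1):
    """Percentage of Correct Keypoints.

    A prediction counts as correct if it lies within
    alpha * max(bbox_w, bbox_h) pixels of the ground-truth keypoint.
    pred, gt: (N, 2) arrays of pixel coordinates; bbox: (w, h).
    """
    thresh = alpha * max(bbox)
    dist = np.linalg.norm(pred - gt, axis=1)
    return 100.0 * np.mean(dist < thresh)
```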
The annotated keypoints are stored at `database/joint_annotations`.
The results to be evaluated should be stored under `$testdir` and contain meshes and camera parameters in the following format:
```
# $seqname-pred%d.ply  # mesh (V,F)
# $seqname-cam%d.txt   # camera
#   [R_3x3|T_3x1]      # V'=RV+T should be in the view space
#   [fx,fy,px,py]      # in pixels
```
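A minimal sketch of parsing this camera format and projecting mesh vertices into the image, assuming the `.txt` file holds the three `[R|T]` rows followed by the intrinsics row (the helper names `load_camera` and `project` are ours, not part of the codebase):

```python
import numpy as np

def load_camera(cam_path):
    """Parse a $seqname-cam%d.txt file: 3 rows of [R|T], then [fx,fy,px,py]."""
    vals = np.loadtxt(cam_path)          # shape (4, 4)
    R, T = vals[:3, :3], vals[:3, 3]
    fx, fy, px, py = vals[3]
    return R, T, (fx, fy, px, py)

def project(verts, R, T, intrinsics):
    """Project view-space vertices (N,3) to pixel coordinates (N,2)."""
    fx, fy, px, py = intrinsics
    cam = verts @ R.T + T                # V' = R V + T (view space)
    x = fx * cam[:, 0] / cam[:, 2] + px
    y = fy * cam[:, 1] / cam[:, 2] + py
    return np.stack([x, y], axis=1)
```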
Multi-GPU training
By default we use 1 GPU. The codebase also supports single-node multi-GPU training with PyTorch distributed data-parallel.
Please modify `dev` and `ngpu` in `scripts/xxx.sh` to select devices.
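For example (illustrative values only; the variable names `dev` and `ngpu` come from the scripts, but their exact usage inside `scripts/xxx.sh` should be checked against the script itself):

```shell
# train on GPUs 0 and 1 with two distributed processes
dev=0,1   # comma-separated GPU ids
ngpu=2    # number of GPUs / processes to launch
```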
Potential bugs
- When `batch_size` is set to 3, the rendered flow may collapse to constant values.
The code borrows the skeleton of CMR.
External repos:
To cite our paper:
```bibtex
@inproceedings{yang2021viser,
  title={ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction},
  author={Yang, Gengshan and Sun, Deqing and Jampani, Varun and Vlasic, Daniel and Cole, Forrester and Liu, Ce and Ramanan, Deva},
  booktitle={NeurIPS},
  year={2021}
}

@inproceedings{yang2021lasr,
  title={LASR: Learning Articulated Shape Reconstruction from a Monocular Video},
  author={Yang, Gengshan and Sun, Deqing and Jampani, Varun and Vlasic, Daniel and Cole, Forrester and Chang, Huiwen and Ramanan, Deva and Freeman, William T and Liu, Ce},
  booktitle={CVPR},
  year={2021}
}
```
- code cleanup