This repository contains the demo, training, and evaluation code and data for ZeroShape.
If your GPU supports CUDA 10.2, please install the dependencies by running:
```
conda env create --file requirements.yaml
```
If you need newer CUDA versions, please install the dependencies manually, changing the pinned versions below to match your setup:
```
conda create -n zeroshape python=3 pytorch::pytorch=1.11 pytorch::torchvision=0.12 cudatoolkit=10.2
conda install -c conda-forge tqdm pyyaml pip matplotlib trimesh tensorboard
pip install pyrender opencv-python pymcubes ninja timm
```
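To confirm the environment is set up correctly before moving on, a quick sanity check like the following can help (a minimal sketch, nothing ZeroShape-specific):
```python
# Verify the installed versions and that CUDA is visible to PyTorch.
import torch
import torchvision

print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
```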
To use the segmentation preprocessing tool, please run:
```
pip install rembg
```
Please download the pretrained weights for shape reconstruction at this url and place it under the `weights` folder. We have prepared some images and masks under the `examples` folder. To reconstruct their shape, please run:
```
python demo.py --yaml=options/shape.yaml --task=shape --datadir=examples --eval.vox_res=128 --ckpt=weights/shape.ckpt
```
The results will be saved under the `examples/preds` folder.
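To inspect a reconstructed mesh programmatically, `trimesh` (installed above) can load and display it. A minimal sketch; the exact output filename under `examples/preds` depends on the input image, so the path below is hypothetical:
```python
# Load a reconstructed mesh and print basic statistics; adjust the
# filename to match the actual output under examples/preds.
import trimesh

mesh = trimesh.load("examples/preds/example.obj", force="mesh")  # hypothetical filename
print("vertices:", mesh.vertices.shape, "faces:", mesh.faces.shape)
mesh.show()  # opens an interactive viewer window
```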
To run the demo on your own images and masks, feel free to drop them in the `examples` folder. If you do not have masks, please run:
```
python preprocess.py path-to-your-image
```
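The preprocessing tool relies on rembg for background removal. If you prefer to script the mask extraction yourself, here is a minimal sketch (the file paths are illustrative, not what preprocess.py actually uses):
```python
# Foreground segmentation with rembg: the alpha channel of the
# RGBA output serves as the object mask.
from PIL import Image
from rembg import remove

img = Image.open("my_image.png")   # illustrative input path
rgba = remove(img)                 # RGBA image with matted foreground
mask = rgba.split()[-1]            # alpha channel as the mask
mask.save("my_image_mask.png")     # illustrative output path
```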
The preprocessed image and mask will be saved in the `my_examples` folder. To reconstruct their shape, please run:
```
python demo.py --yaml=options/shape.yaml --task=shape --datadir=my_examples --eval.vox_res=128 --ckpt=weights/shape.ckpt
```
If you want to estimate the visible surface (depth and intrinsics), please download the pretrained weights for visible surface estimation at this url and place it under the `weights` folder. Then run:
```
python demo.py --yaml=options/depth.yaml --task=depth --datadir=examples --ckpt=weights/depth.ckpt
```
The results will be saved under the `examples/preds` folder.
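For a quick look at the predicted depth, matplotlib (installed above) is enough. A minimal sketch, assuming the demo writes the depth map as a `.npy` array under `examples/preds`; the actual filename and storage format may differ, so adjust accordingly:
```python
# Visualize a predicted depth map; filename and .npy format are assumptions.
import numpy as np
import matplotlib.pyplot as plt

depth = np.load("examples/preds/example_depth.npy")  # hypothetical filename
plt.imshow(depth, cmap="viridis")
plt.colorbar(label="depth")
plt.savefig("depth_vis.png", dpi=150)
```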
Please download our curated training and evaluation data at the following links:

| Data | Link |
|---|---|
| Training Data | this url |
| OmniObject3D | this url |
| Ocrtoc | this url |
| Pix3D | this url |
After extracting the data, organize your `data` folder as follows:
```
data
├── train_data/
|   ├── objaverse_LVIS/
|   |   ├── images_processed/
|   |   ├── lists/
|   |   ├── ...
|   ├── ShapeNet55/
|   |   ├── images_processed/
|   |   ├── lists/
|   |   ├── ...
├── OmniObject3D/
|   ├── images_processed/
|   ├── lists/
|   ├── ...
├── Ocrtoc/
|   ├── images_processed/
|   ├── lists/
|   ├── ...
├── Pix3D/
|   ├── img_processed/
|   ├── lists/
|   ├── ...
├── ...
```
Note that you do not have to download all the data. For example, if you only want to perform evaluation on one of the data sources, feel free to only download and organize that specific one. A quick layout check is sketched below.
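This small script checks which of the expected data sources are present (paths taken from the tree above), so a missing download is caught before training or evaluation:
```python
# Report which expected data directories exist under data/.
from pathlib import Path

root = Path("data")
expected = [
    "train_data/objaverse_LVIS/images_processed",
    "train_data/ShapeNet55/images_processed",
    "OmniObject3D/images_processed",
    "Ocrtoc/images_processed",
    "Pix3D/img_processed",
]
for rel in expected:
    status = "ok" if (root / rel).is_dir() else "missing"
    print(f"{rel}: {status}")
```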
The first step of training ZeroShape is to pretrain the depth and intrinsics estimator. If you have downloaded the weights already (see demo), you can skip this step and use our pretrained weights at `weights/depth.ckpt`. If you want to train everything from scratch yourself, please run:
```
python train.py --yaml=options/depth.yaml --name=run-depth
```
The visualization and results will be saved at `output/depth/run-depth`. Once the training is finished, copy the weights from `output/depth/run-depth/best.ckpt` to `weights/depth.ckpt`.
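The copy step can be scripted together with a quick validity check. A minimal sketch; the internal layout of `best.ckpt` is an assumption, so adapt the key inspection to whatever train.py actually saves:
```python
# Verify the checkpoint loads, then copy it into place for the
# shape-training stage.
import shutil
import torch

ckpt = torch.load("output/depth/run-depth/best.ckpt", map_location="cpu")
if isinstance(ckpt, dict):
    print("checkpoint keys:", list(ckpt.keys()))  # layout is an assumption
shutil.copy("output/depth/run-depth/best.ckpt", "weights/depth.ckpt")
```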
To train the full reconstruction model, please run:
```
python train.py --yaml=options/shape.yaml --name=run-shape
```
The visualization and results will be saved at `output/shape/run-shape`.
To evaluate the model on a specific test set (`omniobj3d|ocrtoc|pix3d`), please run:
```
python evaluate.py --yaml=options/shape.yaml --name=run-shape --data.dataset_test=name_of_test_set --eval.vox_res=128 --eval.brute_force --eval.batch_size=1 --resume
```
The evaluation results will be printed and saved at `output/shape/run-shape`. If you want to evaluate the checkpoint we provided instead, feel free to create an empty folder `output/shape/run-shape` and move `weights/shape.ckpt` to `output/shape/run-shape/best.ckpt`.
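If you want to sweep all three test sets in one go, a small driver like the following works; it simply replays the command above once per dataset:
```python
# Run evaluation sequentially on every test set.
import subprocess

for name in ["omniobj3d", "ocrtoc", "pix3d"]:
    subprocess.run([
        "python", "evaluate.py",
        "--yaml=options/shape.yaml", "--name=run-shape",
        f"--data.dataset_test={name}",
        "--eval.vox_res=128", "--eval.brute_force",
        "--eval.batch_size=1", "--resume",
    ], check=True)
```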
If you find our work helpful, please consider citing our paper:
```
@inproceedings{huang2024zeroshape,
    author    = {Huang, Zixuan and Stojanov, Stefan and Thai, Anh and Jampani, Varun and Rehg, James M.},
    title     = {ZeroShape: Regression-based Zero-shot Shape Reconstruction},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    pages     = {10061--10071},
    year      = {2024},
}
```