/IM-NET

The improved code for paper "Learning Implicit Fields for Generative Shape Modeling".

Primary LanguagePythonOtherNOASSERTION

IM-NET

The improved tensorflow code for paper "Learning Implicit Fields for Generative Shape Modeling", Zhiqin Chen, Hao (Richard) Zhang.

Improvements

Encoder:

  • In IM-AE (autoencoder), changed batch normalization to instance normalization.

Decoder (=generator):

  • Changed the first layer from 2048-1024 to 1024-1024-1024.
  • Changed latent code size from 128 to 256.
  • Removed all skip connections.
  • Changed the last activation function from sigmoid to clip ( max(min(h, 1), 0) ).

Training:

  • Trained one model on the 13 ShapeNet categories as most Single-View Reconstruction networks do.
  • For each category, sort the object names and use the first 80% as training set, the rest as testing set, same as AtlasNet.
  • Reduced the number of sampled points by half in the training set. Points were sampled on 2563 voxels.
  • Removed data augmentation (image crops), same as Occupancy Networks.
  • Added coarse-to-fine sampling for inference to speed up testing.
  • Added post-processing to make the output mesh smoother. To enable, find and uncomment all "self.optimize_mesh(vertices,model_z)".

Citation

If you find our work useful in your research, please consider citing:

@article{chen2018implicit_decoder,
  title={Learning Implicit Fields for Generative Shape Modeling},
  author={Chen, Zhiqin and Zhang, Hao},
  journal={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2019}
}

Dependencies

Requirements:

Our code has been tested with Python 3.5, TensorFlow 1.8.0, CUDA 9.1 and cuDNN 7.0 on Ubuntu 16.04 and Windows 10.

Datasets and pre-trained weights

The original voxel models are from HSP.

The rendered views are from 3D-R2N2.

Since our network takes point-value pairs, the voxel models require further sampling.

For data preparation, please see directory point_sampling.

We provide the ready-to-use datasets in hdf5 format, together with our pre-trained network weights.

Backup links:

We also provide pointcloud files with normal for the shapes in ShapeNet. The points are sampled from only the surface of the shapes. We use floodfilled 2563 voxel files in HSP to determine whether a point is inside a shape or on its surface.

Backup links:

Evaluation results

The quantitative evaluation results comparing to AtlasNet and Occupancy Networks are as follows. We report Chamfer Distance, Normal Consistency on the pointclouds mentioned above, and the Light Field Distance (LFD) produced by LightField descriptor.

(Note that the code for LightField descriptor is written in C and the executable only does closest shape retrieval according to the Light Field Distance. If you want to use it in your own experiments, you might need to change some lines to get the actual distance and recompile the code.)

Usage

To train an autoencoder, go to IMGAN and use the following commands for progressive training. You may want to copy the commands in a .bat or .sh file.

python main.py --ae --train --epoch 100 --real_size 16 --batch_size_input 4096
python main.py --ae --train --epoch 200 --real_size 32 --batch_size_input 4096
python main.py --ae --train --epoch 400 --real_size 64 --batch_size_input 16384

The above commands will train the AE model 100 epochs in 163 resolution (each shape has 4096 sampled points), then 100 epochs in 323 resolution, and finally 200 epochs in 643 resolution. Training on the 13 ShapeNet categories takes about 4 days on one GeForce RTX 2080 Ti GPU.

To train a latent-gan, after training the autoencoder, use the following command to extract the latent codes:

python main.py --ae

Then train the latent-gan and get some samples:

python main.py --train --epoch 10000
python main.py

You can change some lines in main.py to adjust the number of samples and the sampling resolution.

To train the network for single-view reconstruction, after training the autoencoder, copy the weights and latent codes to the corresponding folders in IMSVR. Go to IMSVR and use the following commands to train IM-SVR and get some samples:

python main.py --train --epoch 2000
python main.py

License

This project is licensed under the terms of the MIT license (see LICENSE for details).