Goals: In this assignment, you will explore loss and decoder functions for regressing to voxel, point cloud, and mesh representations from single-view RGB input.
This repo just serves as a template; feel free to adjust most of the code if needed.
├── configs # Where Hydra reads your configurations.
│ └── config.yml
├── data
│ ├── chair_img_pc_voxel_mesh
│ └── chair_img_pc_voxel_mesh.zip
├── notebooks # Where your .ipynb files are placed.
│ ├── scene.py
│ └── synth.ipynb # Synthesize data from ShapeNet meshes.
├── src # Where your modules are placed.
│ ├── dataset.py # Define your custom PyTorch dataset.
│ ├── losses.py # Define your custom loss functions.
│ └── model.py # Define your model structure.
├── README.md
├── requirements.yml # Create your conda env with this file.
├── eval.py # Evaluate your model.
└── train.py # Train your model.
Click the green Use this template
button to fork this repo to your GitHub account and change it to a private repo.
Alternatively, git clone git@github.com:nctu-eva-lab/3DV-2022.git
to keep only a local copy.
Create a conda env:
conda env create --file requirements.yml
Download the dataset from here,
or use gdown1 to download the Google Drive files from the command line, in the conda env you just created:
conda activate py3d
gdown 1UsyZT0n4KCCfr7EB-jJFoSZyJgaeVGcc
After downloading, put the zip file under data/
and unzip it.
This section will involve defining loss functions for fitting voxels, point clouds, and meshes.
In this subsection, we will define a binary cross-entropy loss that can help us fit a 3D binary voxel grid. Define the loss function in the src/losses.py
file. For this you can use the pre-defined losses in the PyTorch library.
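A minimal sketch of what this could look like, assuming the model outputs raw occupancy logits of shape (B, D, H, W) and the ground truth is a binary grid of the same shape (the function name and shapes here are illustrative, not dictated by the template):

```python
import torch
import torch.nn.functional as F

def voxel_loss(pred_logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Binary cross-entropy between predicted occupancy logits and a
    binary ground-truth voxel grid, both of shape (B, D, H, W).

    Using the *_with_logits variant is numerically more stable than
    applying a sigmoid followed by F.binary_cross_entropy.
    """
    return F.binary_cross_entropy_with_logits(pred_logits, target.float())
```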
In this subsection, we will define a Chamfer loss that can help us fit a 3D point cloud. Define the loss function in the src/losses.py
file. We encourage you to write your own implementation rather than using pytorch3d2 utilities, but you may still use them if you get stuck.
This section will involve training a single-view-to-3D pipeline for voxels, point clouds, and meshes.
In this subsection, we will define a neural network to decode binary voxel grids. Define the decoder network in the src/model.py
file.
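One possible shape for such a model, sketched here with an untuned, illustrative architecture (the class name, feature dimension, and 32^3 output resolution are all assumptions, not requirements of the template): a small 2D CNN encodes the RGB image into a feature vector, which is reshaped and upsampled by transposed 3D convolutions into a voxel logit grid.

```python
import torch
import torch.nn as nn

class VoxelDecoder(nn.Module):
    """Sketch of a single-view-to-voxel model: CNN image encoder followed
    by transposed 3D convolutions decoding to a 32^3 occupancy logit grid.
    Layer sizes are illustrative placeholders."""

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, feat_dim), nn.ReLU(),
        )
        self.fc = nn.Linear(feat_dim, 128 * 4 * 4 * 4)
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(128, 64, 4, stride=2, padding=1), nn.ReLU(),  # 4^3 -> 8^3
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),   # 8^3 -> 16^3
            nn.ConvTranspose3d(32, 1, 4, stride=2, padding=1),               # 16^3 -> 32^3
        )

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        feat = self.encoder(rgb)                   # (B, feat_dim)
        x = self.fc(feat).view(-1, 128, 4, 4, 4)   # seed 3D volume
        return self.decoder(x).squeeze(1)          # (B, 32, 32, 32) logits
```

The output is kept as raw logits so it pairs directly with a binary-cross-entropy-with-logits loss; a pretrained 2D backbone could replace the hand-rolled encoder.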
Run python train.py dtype='voxel'
to train the single-view-to-voxel-grid pipeline; feel free to tune the hyperparameters as per your need.
After training, visualize the input RGB image, the ground-truth voxel grid, and the predicted voxel grid in the eval.py
file using: python eval.py dtype='voxel'
.
You need to add the respective visualization code in eval.py
to show the predicted voxels and the mesh side by side.
In this subsection, we will define a neural network to decode point clouds. Similar to above, define the decoder network in the src/model.py
file.
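A simple way to sketch this decoder, under the assumption (not mandated by the template) that the network regresses a fixed number of 3D coordinates directly from an image feature vector:

```python
import torch
import torch.nn as nn

class PointDecoder(nn.Module):
    """Sketch of a single-view-to-point-cloud model: CNN image encoder
    followed by an MLP regressing n_points 3D coordinates directly.
    Layer sizes are illustrative placeholders."""

    def __init__(self, n_points: int = 1024, feat_dim: int = 256):
        super().__init__()
        self.n_points = n_points
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, n_points * 3), nn.Tanh(),  # coordinates in [-1, 1]
        )

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        feat = self.encoder(rgb)                       # (B, feat_dim)
        return self.head(feat).view(-1, self.n_points, 3)
```

Because a Chamfer loss is permutation-invariant, the output points need no particular ordering; the Tanh keeps predictions in a normalized cube, which assumes the ground-truth clouds are normalized similarly.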
Run python train.py dtype='point'
to train the single-view-to-point-cloud pipeline; feel free to tune the hyperparameters as per your need.
After training, visualize the input RGB image, the ground-truth point cloud, and the predicted point cloud in the eval.py
file using: python eval.py dtype='point'
.
Analyse the results by varying a hyperparameter of your choice, for example n_points
, the vision model
, or lr
. Try to be unique and conclusive in your analysis.
Feel free to file an issue if you think this template has a major flaw.
※ Note: The API docs might still be incomplete, so it is essential to trace the source code in each project's own GitHub repo.