The official implementation of the paper "TetraSphere: A Neural Descriptor for O(3)-Invariant Point Cloud Analysis", accepted at CVPR 2024.
To be able to run the code, install the dependencies by running the following:
conda create -n tetrasphere
conda deactivate
conda activate tetrasphere
conda install python=3.9
conda install pytorch torchvision cudatoolkit=11.3 -c pytorch
conda install h5py scikit-learn future tqdm wget
pip install tensorboardx pytorch_lightning torchmetrics datetime
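Once the environment is set up, a quick import check can confirm that the main packages are visible to the interpreter. The sketch below is not part of the repository: the helper name `missing_packages` is made up for illustration, and the import names (for example `sklearn` for scikit-learn and `tensorboardX` for tensorboardx) are the usual ones for these packages, so adjust them if your installation differs.

```python
import importlib.util

# Import names for the main packages installed above (assumed; an import
# name can differ from the conda/pip package name).
REQUIRED = [
    "torch", "torchvision", "h5py", "sklearn", "tqdm",
    "tensorboardX", "pytorch_lightning", "torchmetrics",
]

def missing_packages(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing

if __name__ == "__main__":
    absent = missing_packages(REQUIRED)
    print("All dependencies found." if not absent else f"Missing: {absent}")
```

Run it inside the activated `tetrasphere` environment; anything reported as missing can be installed again with the commands above.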
Inspect tetrasphere/config.py to find the preset paths to the datasets, and edit them if you like.
The datasets can be downloaded with a convenience script:
cd tetrasphere
python download_datasets.py
If you want to download them manually, simply use the links below.
- ModelNet-40 can be downloaded here.
- To acquire the ScanObjectNN dataset, download the file h5_files.zip from here. (For reference, the download link was provided by the authors here.)
- To get ShapeNet-Part, you should register on the shapenet.org webpage. However, as this dataset seems to be inaccessible through browsing the website, we found this link in the GitHub repos of multiple authors working on point cloud segmentation.
To run the experiments, navigate to tetrasphere/experiments/.
- Point cloud classification:
  - ModelNet40: python train_mn40.py
  - ScanObjectNN, objbg variant: python train_objbg.py
  - ScanObjectNN, pb-t50-rs variant: python train_pbt50rs.py
- Part segmentation:
  - python train_partseg.py
- Pretrained weights for the four experiments are included in the weights/ directory. To test these, run: python evaluate_all.py
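If you prefer to launch the training scripts from the repository root rather than changing directories by hand, something along these lines could work. This is only a sketch: the short keys in `SCRIPTS` and the helpers `build_command` and `run` are hypothetical, and only the script file names and the tetrasphere/experiments/ path come from the text above.

```python
import subprocess
import sys

# Script names come from the list above; the short keys are hypothetical.
SCRIPTS = {
    "mn40": "train_mn40.py",
    "objbg": "train_objbg.py",
    "pbt50rs": "train_pbt50rs.py",
    "partseg": "train_partseg.py",
}

def build_command(name):
    """Return the argument list that launches the chosen script."""
    return [sys.executable, SCRIPTS[name]]

def run(name, experiments_dir="tetrasphere/experiments"):
    """Run one script from inside the experiments directory."""
    # cwd mirrors the "navigate to tetrasphere/experiments/" step.
    return subprocess.run(build_command(name), cwd=experiments_dir, check=True)

if __name__ == "__main__":
    # Only print the command here; call run("mn40") to start training.
    print(build_command("mn40"))
```

Setting `cwd` keeps any relative paths inside the scripts behaving as they would after a manual `cd`.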
Due to the default usage of float32 with torch, the pairwise distances in a point cloud might differ slightly between an input and its rotated copy. Thus, the knn function that is part of the baseline (VN-)DGCNN sometimes returns different nearest neighbors for a rotated input, which technically breaks rotation-equivariance within the VN layers in the network. Changing the precision to float64 rectifies this in most cases. However, since the original training was conducted with float32, in rare cases the accuracy of TetraSphere and the baseline may vary insignificantly depending on the input orientation (noted for pb-t50-rs).
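The precision effect described above can be reproduced on synthetic data. The following is a minimal sketch, not the repository's code: it uses NumPy instead of torch to stay lightweight, assumes the inner-product expansion of squared distances that DGCNN-style knn implementations commonly use, and all function names and parameters (`rotation_report`, `n`, `k`, `seed`) are made up for illustration.

```python
import numpy as np

def pairwise_sq_dists(x):
    """Squared Euclidean distances via the inner-product expansion."""
    sq = np.sum(x * x, axis=1, keepdims=True)   # (N, 1)
    return sq - 2.0 * (x @ x.T) + sq.T          # (N, N)

def random_orthogonal(rng):
    """Sample a 3x3 orthogonal matrix (an element of O(3)) via QR."""
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return q

def rotation_report(dtype, n=512, k=20, seed=0):
    """Compare distances and k-NN indices before and after a rotation."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(-1.0, 1.0, size=(n, 3))   # float64 reference cloud
    rot = random_orthogonal(rng)
    x = pts.astype(dtype)
    x_rot = (pts @ rot.T).astype(dtype)         # rotate, then cast
    d, d_rot = pairwise_sq_dists(x), pairwise_sq_dists(x_rot)
    # Largest change in any pairwise distance caused by the rotation.
    gap = float(np.max(np.abs(d - d_rot)))
    # Fraction of points whose ordered k-NN index list changed.
    nn = np.argsort(d, axis=1)[:, :k]
    nn_rot = np.argsort(d_rot, axis=1)[:, :k]
    changed = float(np.mean(np.any(nn != nn_rot, axis=1)))
    return gap, changed

if __name__ == "__main__":
    for dtype in (np.float32, np.float64):
        gap, changed = rotation_report(dtype)
        print(dtype.__name__, "max distance gap:", gap, "kNN rows changed:", changed)
```

In exact arithmetic both quantities would be zero, since orthogonal transformations preserve distances; any nonzero value comes from rounding, and the distance gap shrinks by many orders of magnitude when moving from float32 to float64.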
Please cite if you find the code/paper useful:
@InProceedings{Melnyk_2024_CVPR,
author = {Melnyk, Pavlo and Robinson, Andreas and Felsberg, Michael and Wadenb\"ack, M\r{a}rten},
title = {TetraSphere: A Neural Descriptor for O(3)-Invariant Point Cloud Analysis},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2024},
pages = {5620-5630}
}