MATE: Masked Autoencoders are Online 3D Test-Time Learners (ICCV 2023)

MATE is the first 3D Test-Time Training (TTT) method that makes 3D object recognition architectures robust to the distribution shifts that commonly occur in 3D point clouds. It follows the classical TTT paradigm of using an auxiliary objective to adapt the network to distribution shifts at test time. To this end, MATE uses a self-supervised test-time objective: reconstructing aggressively masked patches of the input point cloud.
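
To make the masked-reconstruction objective concrete, here is a minimal pure-Python sketch. It is not the repository's implementation (which relies on PyTorch and the compiled extensions installed below). The helper names, the `predict` callback, and the default `mask_ratio=0.9` are illustrative assumptions rather than values taken from this README; Chamfer distance is used as the reconstruction loss because the repository ships a `chamfer_dist` extension.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty lists of points."""
    def one_way(src, dst):
        # Average squared distance from each source point to its nearest target.
        return sum(min(squared_distance(p, q) for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)


def split_patches(num_patches, mask_ratio, rng):
    """Randomly split patch indices into (visible, masked); needs >= 2 patches."""
    num_masked = min(num_patches - 1, max(1, round(mask_ratio * num_patches)))
    order = list(range(num_patches))
    rng.shuffle(order)
    return sorted(order[num_masked:]), sorted(order[:num_masked])


def reconstruction_loss(patches, predict, mask_ratio=0.9, seed=0):
    """Mean Chamfer distance between predicted and true masked patches.

    `predict(visible_patches, masked_index)` is a hypothetical stand-in for
    the network: it sees only the visible patches and returns a point list.
    """
    rng = random.Random(seed)
    visible, masked = split_patches(len(patches), mask_ratio, rng)
    visible_patches = [patches[i] for i in visible]
    losses = [chamfer_distance(predict(visible_patches, i), patches[i]) for i in masked]
    return sum(losses) / len(losses)
```

At test time, a loss of this kind needs no labels, which is what allows it to drive adaptation on shifted data.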

In this repository we provide our pre-trained models and codebase to reproduce the results reported in our paper.


Requirements

PyTorch >= 1.7.0, < 1.11.0  
python >= 3.7  
CUDA >= 9.0  
GCC >= 4.9  
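
The PyTorch bound above is a half-open range, which is easy to misread. The following sketch checks a version string against it; the function names are hypothetical and the parsing is deliberately simple (it ignores local build tags such as `+cu113`).

```python
import re


def parse_version(text):
    """Turn a string like '1.10.2+cu113' into the tuple (1, 10, 2)."""
    core = text.split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def torch_version_supported(text):
    """True if the version satisfies >= 1.7.0 and < 1.11.0."""
    return (1, 7, 0) <= parse_version(text) < (1, 11, 0)
```

With PyTorch installed, this could be called as `torch_version_supported(torch.__version__)`.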

To install all additional requirements, run the following from the repository root:

pip install -r requirements.txt

cd ./extensions/chamfer_dist
python setup.py install --user

cd ../..

cd ./extensions/emd
python setup.py install --user
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
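
After installation, a quick check that the packages can be located may save a failed run later. The sketch below uses only the standard library. The module names in the usage note (`torch`, `pointnet2_ops`, `knn_cuda`) are assumptions inferred from the install commands above, not names confirmed by this README.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For example, `missing_modules(["torch", "pointnet2_ops", "knn_cuda"])` should return an empty list if everything installed correctly, under the naming assumption above.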

Data Preparation

Our code currently supports three different datasets: ModelNet40, ShapeNetCore and ScanObjectNN.


To use these datasets with our code, first download them from the following sources:

Then, extract all of these folders into the same directory for easier use.

Adding corruptions to the data

To add distribution shifts to the data, corruptions from ModelNet40-C are used.
For experiments on corrupted ModelNet data, the ModelNet40-C dataset can be downloaded here.
To compute the same corruptions for ShapeNetCore and ScanObjectNN, if needed, run:

python ./datasets/create_corrupted_dataset.py --main_path <path/to/dataset/parent/directory> --dataset <dataset_name>

Replace <dataset_name> with either scanobjectnn or shapenet as required.
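
If both datasets need corrupting, the command can be assembled programmatically. This is a small sketch; `corruption_command` is a hypothetical helper, and only the script path and the two flags shown above are taken from this README.

```python
import shlex

SUPPORTED = ("scanobjectnn", "shapenet")


def corruption_command(main_path, dataset):
    """Build the argument list for create_corrupted_dataset.py."""
    if dataset not in SUPPORTED:
        raise ValueError("dataset must be one of %s" % (SUPPORTED,))
    return [
        "python", "./datasets/create_corrupted_dataset.py",
        "--main_path", main_path,
        "--dataset", dataset,
    ]


if __name__ == "__main__":
    # Print one shell-ready command per dataset; "./data" is a placeholder path.
    for name in SUPPORTED:
        print(" ".join(shlex.quote(arg) for arg in corruption_command("./data", name)))
```

The returned list can also be passed directly to `subprocess.run` from the repository root.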

Note that computing the "occlusion" and "lidar" corruptions requires model meshes, which are computed with the open3d library.

Obtaining Pre-Trained Models

All our pretrained models are available at this Google-Drive.

The jt models are jointly trained for reconstruction and classification; the src_only models are trained for the classification task only.

Test-Time-Training (TTT)

Setting data paths

For TTT, go to cfgs/tta/tta_<dataset_name>.yaml and set the tta_dataset_path variable to the relative path of the dataset parent directory.
For example, if your ModelNet-C data is in ./data/tta_datasets/modelnet-c, set the variable to ./data/tta_datasets.
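
The value to set is the parent of the dataset folder, not the folder itself. A minimal sketch of that rule follows; the helper name is hypothetical.

```python
import posixpath


def tta_dataset_parent(dataset_dir):
    """Return the parent directory of a dataset folder, keeping any './' prefix."""
    return posixpath.dirname(dataset_dir.rstrip("/"))
```

Calling `tta_dataset_parent("./data/tta_datasets/modelnet-c")` gives `./data/tta_datasets`, matching the example above.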

To run test-time training with a jointly trained model:

CUDA_VISIBLE_DEVICES=0 python ttt.py --dataset_name <dataset_name> --online --grad_steps 1 --config cfgs/tta/tta_<dataset_name>.yaml --ckpts <path/to/pretrained/model>

This runs TTT-Online with one gradient step.

To run TTT-Standard, use the following command:

CUDA_VISIBLE_DEVICES=0 python ttt.py --dataset_name <dataset_name> --grad_steps 20 --config cfgs/tta/tta_<dataset_name>.yaml --ckpts <path/to/pretrained/model>
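
The difference between the two modes can be illustrated with a toy scalar model. This sketch rests on an assumption not stated in this README: that the online mode carries adapted weights from one test sample to the next, while the standard mode restarts from the source weights for every sample. The quadratic loss, learning rate, and function names are all illustrative, and `ttt.py` may differ in detail.

```python
def adapt(weight, sample, grad_steps, lr=0.1):
    """Take gradient steps on a toy self-supervised loss (weight - sample)**2."""
    for _ in range(grad_steps):
        grad = 2.0 * (weight - sample)
        weight -= lr * grad
    return weight


def run_ttt(source_weight, stream, grad_steps, online):
    """Return the adapted weight used for each sample in the stream."""
    weight = source_weight
    used = []
    for sample in stream:
        if not online:
            # Assumed standard behaviour: discard earlier adaptation.
            weight = source_weight
        weight = adapt(weight, sample, grad_steps)
        used.append(weight)
    return used


online_weights = run_ttt(0.0, [1.0, 1.0], grad_steps=1, online=True)
standard_weights = run_ttt(0.0, [1.0, 1.0], grad_steps=20, online=False)
```

In this toy setting, the online weights keep moving toward the data across samples, whereas the standard weights are identical for identical samples because each adaptation starts from scratch.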

Training Models

Setting data paths

To train a new model on one of the three datasets, go to cfgs/dataset_configs/<dataset_name>.yaml and set the DATA_PATH variable in the file to the relative path of the dataset folder.

Running training scripts

After setting the paths, a model can be jointly trained with:

CUDA_VISIBLE_DEVICES=0 python train.py --jt --config cfgs/pre_train/pretrain_<dataset_name>.yaml --dataset <dataset_name>

A supervised-only baseline model can be trained with:

CUDA_VISIBLE_DEVICES=0 python train.py --only_cls --config cfgs/pre_train/pretrain_<dataset_name>.yaml --dataset <dataset_name>

The trained models can then be found in the corresponding experiments subfolder.


For a basic inference baseline without adaptation, use:

CUDA_VISIBLE_DEVICES=0 python test.py --dataset_name <dataset_name> --config cfgs/pre_train/pretrain_<dataset_name>.yaml --ckpts <path/to/pretrained/model> --test_source

Scripts for pretraining, testing and test-time training can also be found in commands.sh.

To cite us:

    author    = {Mirza, M. Jehanzeb and Shin, Inkyu and Lin, Wei and Schriebl, Andreas and Sun, Kunyang and
                 Choe, Jaesung and Kozinski, Mateusz and Possegger, Horst and Kweon, In So and Yoon, Kuk-Jin and Bischof, Horst},
    title     = {MATE: Masked Autoencoders are Online 3D Test-Time Learners},
    journal   = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year      = {2023}

We also acknowledge PointMAE for their open source implementation, which we use extensively in this project.