MohamedAfham/CrossPoint

Can train_crosspoint.py train the partseg model based on ShapeNetPart?

@MohamedAfham Thank you for releasing the code. The paper is well written and the code is robust.

I have successfully trained the classification and part segmentation models based on train_crosspoint.py and train_partseg.py, respectively. Everything goes smoothly.

One point I'm confused about is the comments in scripts/script.sh: they say train_crosspoint.py can be used to train the part segmentation model and train_partseg.py is used to fine-tune it. The code in train_crosspoint.py, however, only loads ShapeNetRender for pre-training and ModelNet40 for linear accuracy evaluation; it does not load ShapeNetPart for part segmentation.

Instead, I think both training and fine-tuning take place in train_partseg.py, as the train_loader in this file is designed for ShapeNetPart. Further, I think the self-supervised cross-modal contrastive learning is intended for point cloud classification. Is my understanding correct?

Thanks for raising the issue @auniquesun.

In the CrossPoint approach, pre-training is done on the ShapeNetRender dataset regardless of the downstream task. We first pre-train using train_crosspoint.py so that the point cloud feature extractor learns rich point cloud feature representations.

For the two downstream tasks we provide, point cloud classification and part segmentation, we take the following two paths:

  1. For point cloud classification, we directly take the point cloud feature extractor trained using train_crosspoint.py and do the linear evaluation as in eval_ssl.ipynb.

  2. However, the same approach is not applicable to part segmentation, because it needs a slight architectural modification: the model must return part-level classification scores for each point in the point cloud. So, we append a simple decoder on top of the point cloud feature extractor trained using train_crosspoint.py. Since the decoder is initialised randomly, we need to tune its weights. For that, we use train_partseg.py to fine-tune the whole part segmentation architecture end-to-end. To the best of our knowledge, this is the conventional procedure in the existing unsupervised point cloud representation learning approaches we have compared with.
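
As a rough illustration of the two paths above, here is a minimal PyTorch sketch. It is not the repository's code: `PointEncoder`, `PartSegModel`, the layer sizes, and the random tensors standing in for point clouds and labels are all hypothetical, and the freshly constructed `encoder` stands in for weights that would really come from a train_crosspoint.py checkpoint. The `nn.Linear` head is only a placeholder for whichever linear classifier eval_ssl.ipynb fits.

```python
import torch
import torch.nn as nn


class PointEncoder(nn.Module):
    """Hypothetical stand-in for the pre-trained point cloud feature extractor."""

    def __init__(self, feat_dim=64):
        super().__init__()
        # Shared per-point MLP implemented with 1x1 convolutions.
        self.mlp = nn.Sequential(
            nn.Conv1d(3, feat_dim, 1), nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, 1), nn.ReLU(),
        )

    def forward(self, xyz):
        # xyz: (batch, 3, num_points)
        per_point = self.mlp(xyz)                  # (batch, feat_dim, num_points)
        global_feat = per_point.max(dim=2).values  # (batch, feat_dim)
        return per_point, global_feat


class PartSegModel(nn.Module):
    """Pre-trained encoder plus a randomly initialised per-point decoder."""

    def __init__(self, encoder, feat_dim=64, num_parts=50):
        super().__init__()
        self.encoder = encoder
        self.decoder = nn.Sequential(
            nn.Conv1d(2 * feat_dim, feat_dim, 1), nn.ReLU(),
            nn.Conv1d(feat_dim, num_parts, 1),
        )

    def forward(self, xyz):
        per_point, global_feat = self.encoder(xyz)
        num_points = per_point.shape[2]
        # Broadcast the global feature to every point, then decode per point.
        tiled = global_feat.unsqueeze(2).expand(-1, -1, num_points)
        return self.decoder(torch.cat([per_point, tiled], dim=1))


torch.manual_seed(0)
xyz = torch.randn(4, 3, 128)  # dummy batch: 4 clouds of 128 points

# Assumption: in practice these weights would be loaded from the
# pre-training checkpoint rather than left at their initial values.
encoder = PointEncoder()

# Path 1: classification. Freeze the encoder, fit only a linear head.
encoder.eval()
for p in encoder.parameters():
    p.requires_grad_(False)
with torch.no_grad():
    _, global_feat = encoder(xyz)
linear_head = nn.Linear(64, 40)  # 40 classes, as in ModelNet40
cls_logits = linear_head(global_feat)

# Path 2: part segmentation. Copy the pre-trained weights into a trainable
# encoder, attach the new decoder, and update everything end-to-end.
seg_encoder = PointEncoder()
seg_encoder.load_state_dict(encoder.state_dict())
seg_model = PartSegModel(seg_encoder)
optimizer = torch.optim.SGD(seg_model.parameters(), lr=0.01)

part_labels = torch.randint(0, 50, (4, 128))  # dummy per-point labels
seg_logits = seg_model(xyz)                   # (batch, num_parts, num_points)
loss = nn.functional.cross_entropy(seg_logits, part_labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The point of the sketch is the difference in what gets trained: in path 1 the encoder receives no gradients at all, while in path 2 both the pre-trained encoder and the new decoder are updated.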

I hope that clears up your confusion.

I got it, thank you.