OpenGVLab/LAMM

EPCL pretrained model details

ZCMax opened this issue · 4 comments

ZCMax commented

Thanks for your code. How was the EPCL pretrained model obtained? For example, which training datasets and training approach were used? Since the checkpoint name includes "scannet", was it trained on the ScanNet dataset?

Hi, thanks for your issue.

The EPCL checkpoint we use follows the methodology from FrozenCLIP. Since 3D datasets are limited, we follow the setting in the paper and choose the checkpoint pretrained on ScanNet, which was trained for the 3D detection task.

ZCMax commented

Thanks for your reply. My next question: since the pretrained checkpoint was trained for the 3D detection task on ScanNet, can the 3D benchmark on ScanNet still be regarded as zero-shot?

Thanks. This method is limited by the pretrained encoders currently available in 3D vision. Compared with the 2D setting, the EPCL encoder was indeed pretrained on ScanNet data. However, in the LAMM framework, ScanNet data is not exposed to the LLM decoder, which is the major part of the framework.

Later, we will try testing with other point cloud encoders. Contributions are also welcome.

This issue will be closed since there is no further discussion. Please reopen it if necessary.