Colin97/OpenShape_code

Issue with the number of channels of xyz in example.py

Holmes-GU opened this issue · 14 comments

Hi, thanks for your code. I tried to run 'python3 src/example.py'. However, it returns 'RuntimeError: Given groups=1, weight of size [64, 9, 1, 1], expected input[1, 10, 64, 384] to have 9 channels, but got 10 channels instead', raised from self.mlp in 'PointNetSetAbstraction' in models/pointnet_util.py. The issue may lie in the number of channels of xyz (4 in example.py). Is there any solution?

Thank you very much.
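For reference, a minimal sketch of the mismatch (hypothetical shapes, not the repo's exact code): the first MLP layer of PointNetSetAbstraction expects 9 input channels (3 for xyz plus 6 for the point features), but the grouped input carries 10 when xyz arrives with 4 channels.

```python
import torch
import torch.nn as nn

# First MLP layer of a PointNet++ set abstraction expecting 3 (xyz) + 6 (features) = 9 channels.
conv = nn.Conv2d(in_channels=9, out_channels=64, kernel_size=1)  # weight shape [64, 9, 1, 1]

# If xyz has 4 channels (batch index + x, y, z), grouping yields 4 + 6 = 10 channels.
grouped = torch.randn(1, 10, 64, 384)

try:
    conv(grouped)
except RuntimeError as e:
    print(e)  # "... expected input[1, 10, 64, 384] to have 9 channels, but got 10 channels instead"
```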

Hi. I am a little confused why you are seeing 4 channels for xyz.
[screenshot]
Mine has 3.

Hi, I followed 'python3 src/example.py' from the quick start. In example.py, xyz is processed by ME.utils.batched_coordinates(), so its number of channels becomes 4, as shown below. Besides, 'demo/pc.ply' does not exist; it should be 'demo/owl.ply' instead. Also, this function does not seem to be used in the training file.
[screenshots]
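For context, ME.utils.batched_coordinates collates a list of per-cloud coordinate arrays and prepends the batch index as an extra first column, which is where the fourth channel comes from (toy values below, not the demo point cloud):

```python
import numpy as np
import MinkowskiEngine as ME

# A single point cloud with plain 3-channel xyz, already quantized to integer voxel indices.
xyz = np.random.randint(0, 40, size=(2048, 3))

# batched_coordinates prepends the batch index, so (N, 3) becomes (N, 4): (batch_idx, x, y, z).
coords = ME.utils.batched_coordinates([xyz])
print(coords.shape)  # torch.Size([2048, 4])
```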

Hi, which checkpoint are you using? The example.py is for spconv. If you are using a pointbert checkpoint, some modifications are needed. Sorry for the confusion.

I have changed the backbone to PointBert with model.scaling=4, model.name=PointBert, and model.use_dense=True. Could you provide me with the exact modifications? Thanks.
[screenshot]

Hi, please refer to the code:

https://github.com/Colin97/OpenShape_code/blob/26cf8d16551368f8f1e8e3801cbfb629b6157a03/src/train.py#L101C18-L101C18

https://github.com/Colin97/OpenShape_code/blob/26cf8d16551368f8f1e8e3801cbfb629b6157a03/src/data.py#L238C10-L238C19

Basically, to use PointBert, you don't need to process the PC with MinkowskiEngine.
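Roughly speaking (a minimal sketch with assumed shapes and a hypothetical forward call, not the repo's exact interface), the PointBert path takes dense tensors directly instead of a sparse tensor built from ME.utils.batched_coordinates:

```python
import torch

# Dense point-cloud input for the PointBert backbone (assumed shapes).
xyz = torch.rand(1, 10000, 3)               # (batch, num_points, 3) coordinates
rgb = torch.full((1, 10000, 3), 0.4)        # colors in [0, 1], or the 0.4 constant if none
features = torch.cat([xyz, rgb], dim=-1)    # (batch, num_points, 6) per-point features

# Hypothetical call: with model.name=PointBERT and model.use_dense=True, pass the dense
# tensors straight to the model instead of an ME.SparseTensor.
# shape_feat = model(xyz, features)
```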

Ok. Thanks for your instructions. I will try it tomorrow~

Hi. Following your instructions, I successfully ran the code.

Sorry for bothering you again. I noticed some differences between the data processing in training and testing, as follows:

  1. In training, rgb is multiplied by a factor of 0.4 under 'if use_color' in 'class Four'. Why is this not applied in testing?
  2. In 'class ModelNet40Test', 'rgb = rgb / 255.0' is applied before 'if use_color', but it is not applied in 'class ObjaverseLVIS' or 'class ScanObjectNNTest'.

  1. Do you mean this line? This is an augmentation, which randomly changes the colors of some shapes to a constant (0.4).

  2. This is due to some inconsistency when preparing the data files. RGB in ObjaverseLVIS and ScanObjectNN is in [0, 1]. ModelNet40 doesn't have colors, and we put 100 in all data files; 'rgb = rgb / 255.0' just normalizes them to 0.4.
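Putting the two answers together, the color handling described above looks roughly like this (a simplified sketch, not the repo's exact code):

```python
import numpy as np

# Training augmentation: with some probability, replace the shape's colors with the
# constant 0.4 (the "no color" value).
def augment_color(rgb: np.ndarray, p: float = 0.5) -> np.ndarray:
    if np.random.rand() < p:
        return np.full_like(rgb, 0.4)
    return rgb

# ModelNet40 has no real colors; its data files store 100 for every channel,
# so dividing by 255 maps them to roughly 0.4 as well.
modelnet_rgb = np.full((2048, 3), 100.0)
modelnet_rgb = modelnet_rgb / 255.0   # ~0.392, i.e. the same "no color" constant

# ObjaverseLVIS / ScanObjectNN colors are already stored in [0, 1], so no division is needed.
```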

OK. What about this part? Does ScanObjectNNTest not have colors, so they are directly set to the constant (0.4)?
[screenshot]

Yes.

OK, thank you very much.

Hello, one more question: why does this part only take x[:, 0] instead of the full x?
[screenshot]

Is this x[:, 0] the class token, and is the remaining 384 the number of points after aggregation?
[screenshot]

Because we pick the first token, which is the CLS token, as the pooler output, as in most transformer encoder architectures (including but not limited to BERT, CLIP ViT, and PointBERT).
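As a generic illustration (toy sizes, not the repo's actual dimensions), CLS-token pooling just indexes the first token of the encoder output:

```python
import torch

# Transformer encoder output: 1 CLS token followed by the aggregated point-group tokens.
batch, num_tokens, dim = 2, 1 + 384, 256   # assumed sizes for illustration
x = torch.randn(batch, num_tokens, dim)

pooled = x[:, 0]            # the CLS token serves as the global shape embedding
print(pooled.shape)         # torch.Size([2, 256])
```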