mIoU for s3dis dataset using Area 5 as test set
cuevhv opened this issue · 21 comments
Hi, thank you for your contribution, especially since I think there is no official code for Point Transformer. I see that you report the overall accuracy and compare it with the original paper on the S3DIS dataset. Did you by any chance get the mIoU using Area 5 as the test set?
Thank you
Hi, I got the following metrics:
Model | mAcc | OA | mIoU
---|---|---|---
Paper | 76.5 | 90.8 | 70.4
POSTECH Implementation | 63.8 | 88.3 | 56.2
I'm curious what other people are getting, as there is a big gap between the mAcc and mIoU reported in the paper and the metrics from the POSTECH implementation.
Here's the gist with the code I created for this validation experiment.
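For reference (not the gist itself; the helper name and the 13-class S3DIS assumption are mine), the three metrics in the table above can be computed from per-point predictions and labels roughly like this:

```python
import numpy as np

def segmentation_metrics(pred, label, num_classes=13):
    """pred, label: 1-D integer arrays with one entry per point."""
    # Confusion matrix: rows = ground-truth class, columns = predicted class.
    cm = np.bincount(label * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    tp = np.diag(cm).astype(np.float64)
    oa = tp.sum() / cm.sum()                      # overall accuracy
    macc = np.nanmean(tp / cm.sum(axis=1))        # mean per-class accuracy
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    miou = np.nanmean(tp / union)                 # mean intersection-over-union
    return macc, oa, miou
```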
Hi, I also found a big gap between the reported mIoU and the implementation's, and I'm wondering whether the test protocol is standard: the test loader reads from 'indoor3d_sem_seg_hdf5_data', which is sampled from the original S3DIS data ('Stanford3dDataset_1.2_Aligned_Version') using block_size=1.0 and stride=0.5. Landmark works such as KPConv and RandLA-Net use a different data-processing pipeline, and I suspect the results reported in the paper use that setting instead.
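To make the difference concrete, the block sampling used to generate indoor3d_sem_seg_hdf5_data works roughly like the following sketch (a simplified illustration with an assumed function name, not the actual preprocessing script):

```python
import numpy as np

def sample_blocks(points, block_size=1.0, stride=0.5):
    """Collect points in overlapping XY blocks of block_size metres,
    sliding the window by stride metres."""
    coord_min = points[:, :3].min(axis=0)
    coord_max = points[:, :3].max(axis=0)
    blocks = []
    x = coord_min[0]
    while x < coord_max[0]:
        y = coord_min[1]
        while y < coord_max[1]:
            mask = ((points[:, 0] >= x) & (points[:, 0] < x + block_size) &
                    (points[:, 1] >= y) & (points[:, 1] < y + block_size))
            if mask.any():
                blocks.append(points[mask])
            y += stride
        x += stride
    return blocks
```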
Hi, sorry for the late reply.
We are working on a new codebase based on the PAConv repo.
The PAConv repo provides well-organized scripts and an efficient k-nearest-neighbor search, which Point Transformer also uses.
Currently, semantic segmentation on S3DIS is implemented in the "paconv-codebase" branch.
The newly implemented Point Transformer is currently training [Epoch: 25/100].
This is its performance at epoch 25:
Model | mAcc | OA | mIoU
---|---|---|---
Paper | 76.5 | 90.8 | 70.4
POSTECH Implementation (epoch: 25/100) | 68.9 | 88.2 | 62.4
The performance tables will be updated as soon as the training is finished.
Regards,
Chunghyun
Hi Chunghyun,
Thank you for your participation in the discussion. I'm guessing the training should be done by now, so I am very curious about the final performance of the model. Would you mind sharing the results?
Hi, @bramton
Sorry for the late update.
This is the performance of the Point Transformer that our team re-implemented.
Model | mAcc | OA | mIoU
---|---|---|---
Paper | 76.5 | 90.8 | 70.4
POSTECH Implementation | 71.9 | 89.0 | 65.6
However, Hengshuang, the first author of Point Transformer, has provided the official code to us.
After resolving minor bugs and polishing the codebase, we were able to reproduce the numbers in the paper.
Model | mAcc | OA | mIoU
---|---|---|---
Paper | 76.5 | 90.8 | 70.4
Hengshuang's code | 76.8 | 90.4 | 70.0
As Hengshuang has agreed to release the code, it will be released in this repository soon (it will take 1~2 days).
Regards,
Chunghyun
Hi,
We have just updated the master branch.
You can reproduce the above results with it.
FYI, the codebase from the first author only supports the S3DIS dataset, so please use the paconv-codebase branch for shape classification and part segmentation. After reproducing results with the paconv-codebase, we will merge it into the master branch.
Regards,
Chunghyun
Hi, thanks for providing this nice code. I went through the code and found that the S3DIS test results are computed on points downsampled with voxel_size == 0.04. Can you provide results on the original point cloud? It is not fair to report results on the downsampled data, and in my experiments almost all algorithms achieve a higher mIoU on the downsampled data (by several percent or even more).
@densechen Hi. Have you trained a model with the official code by yourself? I am curious about the results on the original point cloud.
Best.
@densechen, does 'this code' mean the official code from hengshuang? I notice there is a branch named hengshuang-codebase; is that what you were waiting for?
Thanks a lot. I will try to run it once using this code and update the results here.
Have a nice day.
@densechen I am confused, what is the code available on hengshuang-codebase? The original code from the first author, or the adaptation of it by the owner of this repo?
@yuxumin I hope to hear from you soon, as I currently can't test it myself because of a bug on my side.
@densechen @QuanticDisaster Hi, I finished the training and testing. I got a best mIoU of 0.698 on Area 5 during validation in the training script (point-based data?), and 0.7108 and 0.7006 mIoU using the test code of this repo (voxelized data input). The log file can be found here. If you want the pretrained weights, I think I can upload them to Google Drive, too.
I ran the code on 4 RTX 3090s with the default config in this repo.
@yuxumin Awesome! The logs are very detailed, thank you for providing them. I was trying to implement it in PyTorch Geometric, so I will have a look at the architecture and dimensions, but I believe people would be interested in the pretrained weights too.
@QuanticDisaster Yeah, I think so. I will provide the pretrained weights here.
Hi @densechen,
Sorry for the late reply.
I think that you misunderstood the test code in this repository.
The test evaluation (mIoU) is calculated on the original input point cloud, not the voxel-downsampled one.
You can find that the prediction has the same shape as the ground truth in the code below.
point-transformer/tool/test.py, line 136 (commit 283afc4)
The voxel downsampling (or grid subsampling) is only used for the forward pass.
I don't know the exact reason why the authors grid-subsample the input point cloud; I think it is done to make the density of the input point cloud uniform and the kNN sampling consistent.
FYI, you can find the point-level predictions in the 'exp' directory if you successfully run the test script.
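To illustrate the point, the test-time idea could be sketched like this (a simplified sketch, not tool/test.py itself; `predict_full_cloud` and `run_model` are placeholder names): the network only ever sees grid-subsampled subsets, but every prediction is written back at the original point index, so the evaluation covers the full-resolution cloud.

```python
import numpy as np

def predict_full_cloud(run_model, coord, feat, idx_parts):
    """Sketch: idx_parts is the list of index arrays produced by the
    voxelization step; together they cover every original point.
    run_model is a placeholder that returns a per-point class label for a
    subset of the cloud.  Predictions are written back at the original
    indices, so the returned array has one label per input point and the
    mIoU is computed at full resolution."""
    pred = np.zeros(coord.shape[0], dtype=np.int64)
    for idx in idx_parts:
        pred[idx] = run_model(coord[idx], feat[idx])
    return pred
```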
@yuxumin Hi, I wanted to know if you have the weights available?
@QuanticDisaster Sorry for forgetting to update the link here. I uploaded the pretrained weights to Google Drive at this link.
@yuxumin Thanks a lot !
Not to necro an old thread, but I'm not following the explanation above about evaluating on the original points. The previous line, `coord, feat, label, idx_data = data_load(item)`,
leads to this block of code, which definitely does downsample the test point cloud unless you remember to set voxel_size to None (which the default config does not, because the same config file is used for both training and testing). It's not necessarily an issue, but people should be aware:
```python
...
if args.voxel_size:
    # Shift coordinates and grid-subsample: idx_sort holds point indices
    # grouped by voxel, count holds the number of points per voxel.
    coord_min = np.min(coord, 0)
    coord -= coord_min
    idx_sort, count = voxelize(coord, args.voxel_size, mode=1)
    for i in range(count.max()):
        # Pick the (i mod count)-th point of every voxel for this pass.
        idx_select = np.cumsum(np.insert(count, 0, 0)[0:-1]) + i % count
        idx_part = idx_sort[idx_select]
        idx_data.append(idx_part)
else:
    idx_data.append(np.arange(label.shape[0]))
...
```
Not only that, but we can easily show that when voxelisation IS applied, this loop produces duplicate indices (whenever the voxels contain unequal numbers of points), which may skew the test results.
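For example, with hypothetical voxel counts, the index arithmetic in that loop selects the same positions repeatedly whenever the voxels have unequal point counts:

```python
import numpy as np

count = np.array([3, 1, 2])                        # hypothetical points per voxel
starts = np.cumsum(np.insert(count, 0, 0)[0:-1])   # start offset of each voxel: [0 3 4]
picked = np.concatenate([starts + i % count for i in range(count.max())])
print(picked)                                       # [0 3 4 1 3 5 2 3 4]
# Positions 3 and 4 (indices into idx_sort) are selected more than once, so
# voxels with fewer points contribute duplicate points to idx_data; whether
# this skews the final metric depends on how the per-point predictions are
# aggregated downstream.
```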