hzxie/GRNet

Cannot reproduce the results reported in the Paper (CD=2.723)

Opened this issue · 24 comments

You need to train the whole network with Chamfer Distance. It reaches CD ~0.40 on ShapeNet.
Then, you need to fine-tune the network with Gridding Loss + Chamfer Distance on the Coarse Point Cloud.
Finally, you fine-tune the network with Chamfer Distance only. Since Chamfer Distance is also the evaluation metric, you cannot get a lower CD without using it as a loss.

Originally posted by @hzxie in #3 (comment)
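
In pseudo-code, I understand that schedule as follows (a sketch only, not the released training code; chamfer_dist and gridding_loss stand for the Chamfer Distance and Gridding Loss extensions, and sparse_ptcloud / dense_ptcloud are the coarse and dense outputs of the network):

def training_loss(stage, sparse_ptcloud, dense_ptcloud, gt_ptcloud):
    # Stages 1 and 3: Chamfer Distance on both the coarse (sparse) and dense outputs.
    loss = chamfer_dist(sparse_ptcloud, gt_ptcloud) + chamfer_dist(dense_ptcloud, gt_ptcloud)
    # Stage 2: additionally apply the Gridding Loss to the coarse point cloud.
    if stage == 2:
        loss = loss + gridding_loss(sparse_ptcloud, gt_ptcloud)
    return loss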

hzxie commented

So, what's the problem you are facing now?
Please provide more details.

Hi author, thanks for the amazing work.

With your released pre-trained model, I get an F-score of 0.7082 and a CD of 2.722.
However, when I train from scratch, I run into the problems listed below:

"You need to train the whole network with Chamfer Distance." --- This reaches 4.588 CD and 0.6133 F-score, which is close to the corresponding entry in Table 7 of your paper (Gridding Loss: Not Used, CD, Complete = 4.460).

"Then ... fine-tune the network with Gridding Loss + Chamfer Distance on the Coarse Point Cloud." --- This reaches 4.536 CD and 0.6255 F-score. It was supposed to reach about 2.7, right?

sparse_loss = chamfer_dist(sparse_ptcloud, data['gtcloud'])
dense_loss = chamfer_dist(dense_ptcloud, data['gtcloud'])
grid_loss = gridding_loss(sparse_ptcloud, data['gtcloud'])
_loss = sparse_loss + dense_loss + grid_loss

__C.NETWORK.GRIDDING_LOSS_SCALES = [128]
__C.NETWORK.GRIDDING_LOSS_ALPHAS = [0.1]

"Finally, you fine-tune the network with Chamfer Distance." --- the CD didn't decrease below 4.536.

I'm wondering what steps am I making mistakes? (like learning rate/loss weight of gridding loss)

Your processed ShapeNet dataset has 28974 training samples, while the PCN dataset has 231792 training samples.

Is it because your provided dataset is incomplete?

hzxie commented

@AlphaPav
Sorry for the late reply. I don't have time to check this issue these days.
But I'm sure that there is nothing wrong with the released dataset. 231792 / 28974 = 8, which indicates that there are 8 partial input point clouds for each model in ShapeNet.

The PCN dataset is about 48 GB, while the released dataset is about 10 GB. Do you mean that you randomly augment each point cloud 8 times during training?

hzxie commented

No. I think the difference may be caused by different compression ratios.
You can also generate the ShapeNet dataset from PCN with this script.

Hi! I also cannot reproduce the results. The highest CD I got after three training runs was 5.2. May I know how many epochs you trained for each round (i.e., CD only, CD + Gridding Loss, CD only)?

hzxie commented

@SarahChane98
I cannot give the exact number of epochs for each round.
For each round, I train several times until the loss stops decreasing.
Try fine-tuning the network again with the weights from the previous training run.
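For example (a sketch only; the checkpoint path is a placeholder and the 'grnet' key is just an assumption about the checkpoint format, so adjust it to your own checkpoints):

import torch

checkpoint = torch.load('path/to/previous/checkpoint.pth')  # checkpoint saved by the last training run
grnet.load_state_dict(checkpoint['grnet'])                   # 'grnet' is the model instance; the key name may differ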

Hi there, I just tested your pretrained model on the test set, and the result is close to the value reported in the paper. However, when I tested on the validation set, it reported a dense CD of around 7.177. I was wondering why there is such a huge gap between the CDs on the validation set and the test set?


And I get a dense CD of around 5.087 on the training set with the pretrained model (which should match the training dense loss, if I understand correctly).

hzxie commented

@paulwong16
If the reported results are correct, one possible reason why the pretrained model performs worse on the validation and training sets is that we choose the best model based on the test set, not on the validation or training set.

@paulwong16

Because we choose the best model for the test set instead of the validation set.

But why could the CD on the test set be so much lower than on the training set?

hzxie commented

@paulwong16
Because the pretrained model best fits the distribution of the testing set.
The distributions of the training and validation sets may be different from that of the testing set.

Well... I believe the best model should not be chosen according to the test result (it should be chosen according to the validation result). From the best results I could reproduce, the training loss was a little lower than the validation and test losses, and the test loss was close to the validation loss.

Anyway, thanks for your kind reply. I will try to reproduce the result.

hzxie commented

@paulwong16
Yes, choosing models based on the testing set is not a good option.
For the Completion3D benchmark, the best model is chosen based on the validation set (because we don't have the ground truth for the testing set).

@hzxie Hi, I'm wondering how you incorporate gridding_loss in training? I haven't found it in the training script. Thanks!

hzxie commented

@wangyida

You can use the Gridding Loss here:

sparse_loss = chamfer_dist(sparse_ptcloud, data['gtcloud'])

when fine-tuning the network.
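
For example, a rough sketch of that fine-tuning loss (assuming the GriddingLoss extension is constructed from the scales and alphas config keys quoted above; adapt it to your own code):

gridding_loss = GriddingLoss(scales=cfg.NETWORK.GRIDDING_LOSS_SCALES,   # e.g. [128]
                             alphas=cfg.NETWORK.GRIDDING_LOSS_ALPHAS)   # e.g. [0.1]
sparse_loss = chamfer_dist(sparse_ptcloud, data['gtcloud']) + gridding_loss(sparse_ptcloud, data['gtcloud'])
dense_loss = chamfer_dist(dense_ptcloud, data['gtcloud'])
_loss = sparse_loss + dense_loss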

@hzxie Thank you, I tried it out and the results seem to follow the expected trend. Thanks for your inspiring work! ;)

Hi, I'm wondering how to fine-tune the network with the previous weights. I've tried the same configuration as in your paper, but the best model gets CD=4.538 and F-Score=6.206, while your pre-trained model gets CD=2.723 and F-Score=7.082.

I also checked the log and found that the network had already converged within 20 epochs. Why did you set 150 epochs as the default?

hzxie commented

@Lillian9707

In my experiments, the loss continues to decrease after 20 epochs.
Moreover, you need to fine-tune the network with the Gridding Loss.

Hi, I still cannot reproduce the result. Can you provide more details?

I've tried to fine-tune the framework with the Gridding Loss and a lower learning rate, but the CD and F-score got worse.

hzxie commented

@Lillian9707
Keep the learning rate unchanged during fine-tuning.
According to AlphaPav's experimental results above, the CD and F-score improved after applying the Gridding Loss.

Thank you for your reply!
But AlphaPav only got 4.536 CD and 0.6255 F-score after fine-tuning, which looks more like random fluctuation than a real improvement.
So the fine-tuning process is to train with 1 × CD on both the sparse and dense point clouds + 1 × Gridding Loss on the sparse point cloud? And is the learning rate always 5e-5?

Hi, sorry to bother you. I still cannot reproduce the results in the paper.
I have tried fine-tuning the network several times, including using lr = 5e-5, 1e-5, and 1e-6, using MultiStepLR, and training with CD + Gridding Loss on the sparse or dense cloud, and so on. But the results are always around CD = 4.5 and F-Score = 6.2. Can you provide more details about fine-tuning?

hzxie commented

@Lillian9707

Try fine-tuning the network with and without the Gridding Loss several times.
During fine-tuning, load one of the top-10 weights (not always the best one) from the previous training.
The initial learning rate for fine-tuning should be 1e-4.
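
For example, something like this (just a sketch; Adam here is only an example optimizer, and the names reuse those from earlier in the thread, so adapt it to your own training script):

import torch

optimizer = torch.optim.Adam(grnet.parameters(), lr=1e-4)   # initial learning rate for fine-tuning
use_gridding_loss = True                                     # alternate runs with and without the Gridding Loss term
_loss = sparse_loss + dense_loss
if use_gridding_loss:
    _loss = _loss + gridding_loss(sparse_ptcloud, data['gtcloud'])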