ma-xu/pointMLP-pytorch

About Train speed and Test speed in Table 2

Roywangj opened this issue · 7 comments

Hi,

Thanks for the excellent work. I'm confused about how the speeds in Table 2 were obtained. When you measure the training speed, is batch_size set to 1 or 32? If 1, how do you handle the problem caused by BatchNorm?

When I use the training/testing times of PointMLP (https://web.northeastern.edu/smilelab/xuma/pointMLP/checkpoints/fixstd/scanobjectnn/pointMLP-20220204021453/out.txt), which are 210s/22s for 11416/2882 samples, I compute results (51 samples/second for training, 131 samples/second for testing) similar to those in Table 2 (47.1 samples/second for training, 112 samples/second for testing).

However, when I use the training/testing times of PointMLP-Elite (https://web.northeastern.edu/smilelab/xuma/pointMLP/checkpoints/fixstd/scanobjectnn/model313Elite-20220220015842-2956/out.txt), which are 43s/4s for 11416/2882 samples, I get results (265 samples/second for training, 720 samples/second for testing) that are completely inconsistent with Table 2 (116 samples/second for training, 176 samples/second for testing).
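For reference, the per-second figures above follow from a trivial throughput calculation (this is only an illustrative sketch, not the authors' benchmarking script; the sample counts and wall-clock times are taken from the logs quoted above):

```python
def throughput(num_samples: int, seconds: float) -> float:
    """Samples processed per second, from a wall-clock time in a log."""
    return num_samples / seconds

# Testing figures from the two logs quoted above (2882 test samples):
print(round(throughput(2882, 22)))  # PointMLP:       ~131 samples/second
print(round(throughput(2882, 4)))   # PointMLP-Elite: ~720 samples/second
```

The training-side figures work out the same way (11416 samples over 210s and 43s respectively); small discrepancies with the numbers quoted from the logs can come from per-epoch logging overhead.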

ma-xu commented

@Roywangj Thanks for your interest.

  1. We use a batch size of 16 (args.batch_size // 2, with args.batch_size = 32). By the way, batch size 1 also works, since the model is in evaluation mode (via net.eval()).
  2. For the speed, please see #10 (comment). The mismatch (with the provided log) is because we trained our models on different servers with different GPUs (e.g., V100-16GB and V100-32GB), but all testing speeds are benchmarked as described in Table 2. You can find the speed logs here: #10 (comment).
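On the BatchNorm point in item 1: in training mode, the statistics of a single-sample batch are degenerate (the batch variance is zero, so the normalized output collapses), whereas in eval mode BatchNorm uses its stored running statistics, so batch size 1 is fine. A minimal pure-Python sketch of this behavior (the function and the running statistics here are hypothetical, for illustration only, without the affine parameters of the real layer):

```python
def batchnorm_1d(batch, running_mean, running_var, training, eps=1e-5):
    """Simplified 1-D BatchNorm over a list of scalars (no affine params)."""
    if training:
        # Training mode: normalize with statistics of the current batch.
        mean = sum(batch) / len(batch)
        var = sum((x - mean) ** 2 for x in batch) / len(batch)
    else:
        # Eval mode (what net.eval() selects): use stored running statistics.
        mean, var = running_mean, running_var
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]

# With a batch of one sample, training mode collapses the output to zero...
print(batchnorm_1d([5.0], running_mean=0.0, running_var=1.0, training=True))
# ...while eval mode preserves the signal via the running statistics.
print(batchnorm_1d([5.0], running_mean=0.0, running_var=1.0, training=False))
```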

Please let me know if you have any further questions.

Do you mean the testing times of PointMLP/PointMLP-Elite (22s/14s) were obtained with batch size 32 on a single V100? However, when I train the models, the testing speed of PointMLP-Elite seems much faster than that of PointMLP (about 3 to 5 times faster).

Moreover, the testing times of PointMLP/PointMLP-Elite (22s vs. 14s) do not seem to correspond to the results I reproduced.

ma-xu commented

Do you mean the testing times of PointMLP/PointMLP-Elite (22s/14s) were obtained with batch size 32 on a single V100? However, when I train the models, the testing speed of PointMLP-Elite seems much faster than that of PointMLP (about 3 to 5 times faster).

Yes (the testing batch size is 16). I discussed the speed with @inspirelt previously; it seems different environments (even with the same GPUs) may give different speeds. He got much faster results for both PointMLP and PointMLP-Elite than those reported in the paper.
You are welcome to post your detailed results and your environment in this issue.

Thanks for your reply. My doubts have been answered. Since I ran my experiments on different servers with different GPUs, I can't give the exact environment. My confusion about the speed above (the Elite version being 3 to 5 times faster) was just an intuition, LOL. I will post the detailed results (for PointMLP/PointMLP-Elite) later.

ma-xu commented

Thanks for your reply. My doubts have been answered. Since I ran my experiments on different servers with different GPUs, I can't give the exact environment. My confusion about the speed above (the Elite version being 3 to 5 times faster) was just an intuition, LOL. I will post the detailed results (for PointMLP/PointMLP-Elite) later.

Thanks. I hope the discussion above helps you.