jzbontar/mc-cnn

Accurate network about nn.Linear and nn.SpatialConvolution1_fw

shuluoshu opened this issue · 3 comments

Hi, @jzbontar
I notice that you use nn.Linear in the training network (slow) and replace it with nn.SpatialConvolution1_fw (which I guess is the same as a 1x1 conv) at test time. Why not just use cudnn.SpatialConvolution with a 1x1 kernel in both training and testing? Would that affect the performance, or would using a 1x1 conv (with cudnn) everywhere just speed up training?
Thanks a lot!

Back when I was writing the mc-cnn code, nn.SpatialConvolution1_fw, which is just one matrix-matrix multiply, was much faster than cudnn.SpatialConvolution with a 1x1 kernel. Today this probably isn't the case anymore, and I would use cudnn instead. But you are right, they are the same thing.

@jzbontar Thanks for your timely reply, but what I really want to know is why you don't use a 1×1 conv in both training and testing. I mean, why is nn.Linear used in training? And can I replace nn.Linear with a 1×1 conv in training? Thanks so much!

Yes, you can replace nn.Linear with 1x1 conv. I would benchmark to make sure it's not slower, though.
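
For anyone who wants to convince themselves that the two layers compute the same thing, here is a minimal sketch in plain Python (not Torch, and not the mc-cnn code). It assumes the linear weights are stored as an out x in matrix and reshapes them into out x in x 1 x 1 kernels; the helper names `linear` and `conv2d` and the sizes are made up for illustration. It only checks numerical equivalence and says nothing about speed, which you would still need to benchmark in your own setup.

```python
import random

def linear(x, weight, bias):
    """Fully connected layer on one feature vector: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def conv2d(image, kernels, bias):
    """Plain 'valid' cross-correlation (hypothetical reference helper).

    image:   C x H x W nested lists
    kernels: O x C x kH x kW nested lists
    returns: O x (H - kH + 1) x (W - kW + 1) nested lists
    """
    C, H, W = len(image), len(image[0]), len(image[0][0])
    kH, kW = len(kernels[0][0]), len(kernels[0][0][0])
    out = []
    for o, kernel in enumerate(kernels):
        plane = []
        for y in range(H - kH + 1):
            row = []
            for x in range(W - kW + 1):
                acc = bias[o]
                for c in range(C):
                    for dy in range(kH):
                        for dx in range(kW):
                            acc += kernel[c][dy][dx] * image[c][y + dy][x + dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

random.seed(0)
C_IN, C_OUT, H, W = 4, 3, 2, 5  # arbitrary illustrative sizes
weight = [[random.uniform(-1, 1) for _ in range(C_IN)] for _ in range(C_OUT)]
bias = [random.uniform(-1, 1) for _ in range(C_OUT)]
image = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
         for _ in range(C_IN)]

# Reshape the O x C linear weight matrix into O x C x 1 x 1 kernels.
kernels = [[[[w]] for w in row] for row in weight]
conv_out = conv2d(image, kernels, bias)

# Apply the linear layer to the channel vector at every pixel and compare.
max_diff = 0.0
for y in range(H):
    for x in range(W):
        pixel = [image[c][y][x] for c in range(C_IN)]
        lin_out = linear(pixel, weight, bias)
        for o in range(C_OUT):
            max_diff = max(max_diff, abs(lin_out[o] - conv_out[o][y][x]))

print("max abs difference:", max_diff)
```

The difference should only be floating-point rounding noise, since the two paths add the same terms in a slightly different order.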