Accurate network about nn.Linear and nn.SpatialConvolution1_fw

Question

shuluoshu opened this issue 8 years ago · 3 comments

Hi, @jzbontar
I notice that you use nn.Linear in the training network (slow) and replace it with nn.SpatialConvolution1_fw (which I guess is the same as a 1x1 conv) at test time. Why not just use cudnn.SpatialConvolution with a 1x1 kernel in both training and testing? Would using a 1x1 conv (with cudnn) throughout affect the performance, or would it just speed up training?
Thanks a lot!

Answer 1 · 2017-02-24T13:20:59.000Z

Back when I was writing the mc-cnn code, nn.SpatialConvolution1_fw, which is just one matrix-matrix multiply, was much faster than cudnn.SpatialConvolution with a 1x1 kernel. Today this probably isn't the case anymore and I would have used cudnn instead. But you are right, they are the same thing.

Answer 2 · 2017-02-25T07:52:33.000Z

@jzbontar Thanks for your timely reply, but what I really want to know is why you don't use a 1×1 conv in both training and testing. I mean, why is nn.Linear used in training? And can I replace nn.Linear with a 1×1 conv in training? Thanks so much!

Answer 3 · 2017-02-27T13:37:09.000Z

Yes, you can replace nn.Linear with 1x1 conv. I would benchmark to make sure it's not slower, though.
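
For readers who want to see why the two layers are "the same thing", here is a minimal sketch in plain Python (not Torch, and not the mc-cnn implementation). It assumes a fully connected layer computes y = W x + b for each feature vector, and that a 1x1 convolution applies the same W and b at every pixel, which amounts to one matrix-matrix multiply over the flattened image. The function names `linear` and `conv1x1` and the sample values are hypothetical, chosen only for illustration.

```python
def linear(x, weight, bias):
    """Fully connected layer on one feature vector: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def conv1x1(image, weight, bias):
    """1x1 convolution on an image laid out as [C_in][H][W].

    The image is flattened to a C_in x (H*W) matrix (one column per pixel),
    multiplied once by the C_out x C_in weight matrix, then reshaped back.
    """
    c_in, h, w = len(image), len(image[0]), len(image[0][0])
    flat = [[image[c][i][j] for i in range(h) for j in range(w)]
            for c in range(c_in)]
    out = [[sum(weight[o][c] * flat[c][p] for c in range(c_in)) + bias[o]
            for p in range(h * w)]
           for o in range(len(weight))]
    # Reshape each output row of length H*W back to H x W.
    return [[row[i * w:(i + 1) * w] for i in range(h)] for row in out]


# Hypothetical weights shared by both layers: 3 input features, 2 outputs.
weight = [[1.0, 2.0, 3.0], [0.5, -1.0, 0.0]]
bias = [0.1, -0.2]

# A 3-channel, 2x2 "image".
image = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 1.0], [1.0, 0.0]],
    [[2.0, 2.0], [0.0, 1.0]],
]

conv_out = conv1x1(image, weight, bias)

# Compare against applying the linear layer to each pixel's feature vector.
max_diff = 0.0
for i in range(2):
    for j in range(2):
        pixel = [image[c][i][j] for c in range(3)]
        expected = linear(pixel, weight, bias)
        for o in range(2):
            max_diff = max(max_diff, abs(conv_out[o][i][j] - expected[o]))

print("largest difference:", max_diff)
```

Since the arithmetic is identical, swapping one layer for the other should not change the results, only the speed, which is why benchmarking is the deciding step.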