leeyeehoo/CSRNet-pytorch

High inference time compared to the Keras implementation

Opened this issue · 1 comments

As stated in the issue title, the CSRNet-keras implementation is a lot faster than this one.
In my case, a single inference on a Full HD image took ~6100 ms. Using the Keras implementation, the same inference took ~460 ms.

Is there any way to make this implementation faster?

My machine:

  • PyTorch: 1.0.1
  • GPU: GTX 1060 (driver 410.104)
  • CUDA: 10
  • CPU: Intel(R) Core(TM) i5-7500T @ 2.70GHz
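
Not from the thread, but worth ruling out: inflated PyTorch timings often come from measuring with autograd enabled, forgetting `model.eval()`, or not synchronizing the GPU before reading the clock (CUDA kernels launch asynchronously). A minimal benchmarking sketch (the `benchmark` helper and the stand-in model are hypothetical, not from this repository):

```python
import time
import torch
import torch.nn as nn

def benchmark(model, x, warmup=3, runs=10):
    """Return average forward-pass time in milliseconds."""
    model.eval()              # disable dropout / BatchNorm training behaviour
    with torch.no_grad():     # skip autograd bookkeeping
        for _ in range(warmup):
            model(x)          # warm-up runs (cuDNN autotuning, caches)
        if x.is_cuda:
            torch.cuda.synchronize()  # wait for queued kernels before timing
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()  # make sure all runs actually finished
    return (time.perf_counter() - start) / runs * 1000.0

# Usage with a small stand-in model and input:
net = nn.Conv2d(3, 8, 3, padding=1)
ms = benchmark(net, torch.randn(1, 3, 64, 64))
```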

As I noted, the dilated-convolution back-end (after the truncated VGG16 front-end) in the Keras implementation does not use BatchNormalization:

            #Conv2D
            model.add(Conv2D(512, (3, 3), activation='relu', dilation_rate = 2, kernel_initializer = init, padding = 'same'))
            model.add(Conv2D(512, (3, 3), activation='relu', dilation_rate = 2, kernel_initializer = init, padding = 'same'))
            model.add(Conv2D(512, (3, 3), activation='relu', dilation_rate = 2, kernel_initializer = init, padding = 'same'))
            model.add(Conv2D(256, (3, 3), activation='relu', dilation_rate = 2, kernel_initializer = init, padding = 'same'))
            model.add(Conv2D(128, (3, 3), activation='relu', dilation_rate = 2, kernel_initializer = init, padding = 'same'))
            model.add(Conv2D(64, (3, 3), activation='relu', dilation_rate = 2, kernel_initializer = init, padding = 'same'))
            model.add(Conv2D(1, (1, 1), activation='relu', dilation_rate = 1, kernel_initializer = init, padding = 'same'))
 

So I think that is why the Keras implementation computes faster than this PyTorch one.
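
For comparison, the same BatchNorm-free back-end can be sketched in PyTorch. This is a minimal sketch following the layer sizes in the Keras snippet above (every 3x3 conv with dilation 2 and ReLU, ending in a 1x1 conv), not the repository's exact code; `make_backend` is a hypothetical helper:

```python
import torch
import torch.nn as nn

def make_backend(in_channels=512):
    """Dilated back-end mirroring the Keras snippet: no BatchNorm layers."""
    layers = []
    prev = in_channels
    for out_channels in [512, 512, 512, 256, 128, 64]:
        # padding=2 with dilation=2 keeps 'same' spatial size for a 3x3 conv
        layers += [nn.Conv2d(prev, out_channels, 3, padding=2, dilation=2),
                   nn.ReLU(inplace=True)]
        prev = out_channels
    # final 1x1 conv with ReLU produces the single-channel density map
    layers += [nn.Conv2d(prev, 1, 1), nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

backend = make_backend()
out = backend(torch.randn(1, 512, 32, 32))  # same spatial size, 1 channel
```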