Dobiasd/frugally-deep

Consider having different convolution implementations available and choosing the fastest one at runtime

Dobiasd opened this issue · 1 comment

Different convolution implementations might perform differently depending on the convolution settings (input size/depth, kernel size/count) and depending on the hardware (mostly CPU/memory) used.

Right now, for example, we have a special implementation used for 2D convolutions in case strides = (1, 1) (which is used not only by the Conv2D layer, but also by DepthwiseConv2D and SeparableConv2D).

I wonder if it would make sense to provide a function that, when called on a model, tries out the different implementations and remembers which one performed best for future calls of model.predict. (Maybe in some settings, even a naive non-im2col convolution is the fastest one.)
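The idea could be sketched roughly like this (hypothetical names, not frugally-deep's actual API): wrap each candidate convolution implementation as a callable, time each one once on a representative input, and cache the index of the fastest for later predictions.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical sketch: a candidate convolution implementation, modeled as a
// function from the input tensor's flat data to the output's flat data.
using ConvImpl = std::function<std::vector<float>(const std::vector<float>&)>;

// Run each implementation once on a representative input and return the index
// of the fastest one. The caller would store this index and dispatch to the
// winning implementation in subsequent calls of model.predict.
std::size_t pick_fastest(const std::vector<ConvImpl>& impls,
                         const std::vector<float>& sample_input)
{
    std::size_t best = 0;
    auto best_time = std::chrono::steady_clock::duration::max();
    for (std::size_t i = 0; i < impls.size(); ++i) {
        const auto start = std::chrono::steady_clock::now();
        impls[i](sample_input); // run once; the result itself is discarded
        const auto elapsed = std::chrono::steady_clock::now() - start;
        if (elapsed < best_time) {
            best_time = elapsed;
            best = i;
        }
    }
    return best;
}
```

A single timed run per implementation is the simplest possible version; averaging over several runs would make the choice more robust against background load.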

Pros:

  • potentially faster forward passes

Cons:

  • increased code complexity
  • potentially suboptimal choices in case the background load on the user's machine varies too much during the evaluation

Closing this one because of the cons listed above.