google/qkeras

Slow training

schmiph2 opened this issue · 2 comments

Hi everyone

Is it expected behavior that the quantization-aware training in QKeras is much slower than normal training in Keras? And if so, out of interest, where does the overhead come from? From the quantization-dequantization operation?

Thank you for your help!

The quantization operations add a little more computation to every forward pass, but the overhead should not be significant. Most of the slowness in training is expected to come from the increased number of epochs you may need to train quantized models for, especially at low precisions, where training can become unstable.
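For reference, here is a minimal sketch of where those extra quantization ops appear; the layer sizes and bit widths are placeholders, not a recommendation. Each QKeras layer wraps its weights (and, via `QActivation`, its activations) in a fake-quantization step during the forward pass:

```python
import tensorflow as tf
from qkeras import QDense, QActivation, quantized_bits, quantized_relu

# Sketch: kernel/bias quantizers and the quantized activation each add a
# quantize-dequantize op to the forward pass; this is usually cheap
# compared to the matrix multiplications themselves.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16,)),
    QDense(32,
           kernel_quantizer=quantized_bits(4, 0, alpha=1),
           bias_quantizer=quantized_bits(4, 0, alpha=1)),
    QActivation(quantized_relu(4)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```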

What sorts of slow-downs are you experiencing? Do you have any examples / data?

Hi Daniele

Thank you for your response. I have prepared a Colab notebook with a setup similar to the one I intend to work on (mapping a time sequence X to Y). With the Keras implementation, training takes 0.22 s per step; with QKeras it takes 6 s per step. If I remove the GRU there is still a difference, but it is not as big (50 ms vs. 90 ms per step). I assume the main reason the DNN with the GRU trains slower is a non-cuDNN-optimized implementation of the GRU (e.g., because of the quantized activations). The TensorFlow documentation for the GRU layer also mentions that the fast cuDNN kernel is only used for the standard configuration.
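To check how much of the gap comes from the cuDNN fallback alone, a rough timing sketch like the following can help; the shapes, sizes, and the `recurrent_activation` choice are just placeholders (not taken from the notebook), and the non-default option simply stands in for any change, such as quantized activations, that disables the fused cuDNN kernel on GPU:

```python
import time
import numpy as np
import tensorflow as tf

x = np.random.rand(256, 100, 32).astype("float32")
y = np.random.rand(256, 1).astype("float32")

def timed_fit(gru_layer):
    model = tf.keras.Sequential([gru_layer, tf.keras.layers.Dense(1)])
    model.compile(optimizer="adam", loss="mse")
    start = time.time()
    model.fit(x, y, batch_size=64, epochs=1, verbose=0)
    return time.time() - start

# Default configuration -> eligible for the fused cuDNN kernel on GPU.
fast = timed_fit(tf.keras.layers.GRU(64))

# Any non-default option (here a different recurrent activation) forces
# the generic, much slower implementation, similar to what a quantized
# GRU would have to use.
slow = timed_fit(tf.keras.layers.GRU(64, recurrent_activation="relu"))

print(f"default GRU: {fast:.2f} s, non-default GRU: {slow:.2f} s")
```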

It seems that the slow training is not a problem specific to QKeras but of non-standard GRU configurations in general.