farizrahman4u/qlearning4k

GPU performance question

cjmielke opened this issue · 2 comments

I noticed that CPU-only training is much faster than training on the GPU. With the GPU, utilization is only about 18% according to nvidia-smi.

Is this because it trains for each epoch, and has to send the data to the card? I see the fast mode in the code, and it seems to be getting activated with the Theano backend. I cannot figure out whether I've got something misconfigured.

Well, while a GPU can compute things faster, CPUs get the advantage when there is a lot of data loading/unloading. qlearning4k is still the fastest implementation of the Q-learning algorithm in Keras.
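
As a rough illustration of that trade-off, here is a toy cost model in which every training call pays a fixed host-to-device copy before it can compute. All numbers and names (`step_time`, `compute_s`, `transfer_s`) are made up for the example; they are not measurements of qlearning4k, Theano, or any particular card.

```python
def step_time(compute_s, transfer_s):
    """Wall time of one training call: fixed transfer overhead plus compute."""
    return transfer_s + compute_s

# Hypothetical numbers, chosen only to illustrate the trade-off.
gpu_compute_s = 0.0001   # assume the GPU does the math 10x faster than the CPU
gpu_transfer_s = 0.0020  # assume each call copies a small batch to the card

cpu = step_time(compute_s=0.0010, transfer_s=0.0)  # no host-to-device copy
gpu = step_time(compute_s=gpu_compute_s, transfer_s=gpu_transfer_s)

print(f"CPU per step: {cpu * 1e3:.2f} ms")
print(f"GPU per step: {gpu * 1e3:.2f} ms")
print(f"Share of GPU step spent computing: {gpu_compute_s / gpu:.0%}")
```

With many small calls, the fixed copy dominates and the card sits idle most of the time, which would be consistent with a low utilization reading.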

I did eventually realize that whenever memory_size < batch_size, training is dramatically faster but the loss goes to zero. I'm going to study the code and papers more to understand what is going on here.
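
One possible explanation, not checked against the qlearning4k source: if the sampled batch is capped at the number of stored transitions, a small memory means every update is tiny (hence fast) and sees only the last few transitions, which a network can fit almost exactly. The sketch below uses a hypothetical `ReplayMemory` class to show that capping behavior; it is not the library's implementation.

```python
import random
from collections import deque

class ReplayMemory:
    """Hypothetical fixed-size replay buffer; not the qlearning4k class."""

    def __init__(self, memory_size):
        # A bounded deque drops the oldest transition once it is full.
        self.buffer = deque(maxlen=memory_size)

    def remember(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Assumption: the batch is capped at the number of stored transitions.
        n = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), n)

memory = ReplayMemory(memory_size=4)
for t in range(100):          # integers stand in for (state, action, reward, next_state)
    memory.remember(t)

batch = memory.sample(batch_size=32)
print(len(batch), sorted(batch))
```

Under this assumption, asking for 32 samples from a memory of 4 returns only the 4 most recent transitions, so the model trains repeatedly on nearly the same data.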