nicolas-ivanov/tf_seq2seq_chatbot

How long should I train in CPU?

guotong1988 opened this issue · 1 comment

global step 2200 learning rate 0.5000 step-time 2.00 perplexity 37.43
  eval: bucket 0 perplexity 20.10
  eval: bucket 1 perplexity 33.75
  eval: bucket 2 perplexity 34.75
  eval: bucket 3 perplexity 43.44
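For context on the numbers in the log above: in the standard TensorFlow seq2seq tutorial code, the reported perplexity is simply the exponential of the average per-token cross-entropy loss. A minimal sketch (the overflow guard mirrors the tutorial's behavior; the exact loss value shown is illustrative):

```python
import math

def perplexity(avg_cross_entropy_loss):
    """Perplexity = exp(loss); lower means the model is less 'surprised'.

    Guard against overflow for very large losses, as the tutorial code does.
    """
    if avg_cross_entropy_loss < 300:
        return math.exp(avg_cross_entropy_loss)
    return float("inf")

# A per-token loss of roughly 3.62 nats corresponds to a perplexity of
# roughly 37, in the ballpark of the training log above.
print(perplexity(3.62))
```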

result:

how old are you ?
i ' m not .

@guotong1988, I would not even try training seq2seq models on a CPU, since a simple analysis shows that a typical GPU yields a 40x-80x speedup compared to a CPU. AWS has some affordable options; you can check them out here: https://aws.amazon.com/ec2/pricing/

In case your only option is still CPU, I would recommend using very modest parameters for your model, i.e. equal to or lower than the following:

  • 1 LSTM layer x 512 neurons
  • w2v embeddings dimensionality = 128
  • max length of input and output sequences = 16 (i.e. bucket sizes should not exceed this value)
  • max vocabulary size = 20000
  • batch size - as big as possible while still fitting in the RAM of your machine
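To make the limits above concrete, here is a rough sketch of what such a CPU-friendly configuration could look like. Note that the names below are illustrative placeholders, not necessarily the exact flags used in this repository:

```python
# Hypothetical hyperparameter settings for CPU-only training.
# Names are illustrative, not the repo's exact flag names.
CPU_CONFIG = {
    "num_layers": 1,       # a single LSTM layer
    "layer_size": 512,     # 512 neurons per layer
    "embedding_dim": 128,  # w2v embedding dimensionality
    "max_seq_len": 16,     # cap on input/output sequence length
    "vocab_size": 20000,   # maximum vocabulary size
}

# Example bucket definitions as (input_len, output_len) pairs; per the
# advice above, no bucket dimension should exceed max_seq_len.
BUCKETS = [(5, 10), (10, 15), (16, 16)]

assert all(
    i <= CPU_CONFIG["max_seq_len"] and o <= CPU_CONFIG["max_seq_len"]
    for i, o in BUCKETS
)
```

Batch size is left out on purpose: pick the largest value that fits in your machine's RAM.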

This should significantly increase your chances of generating some meaningful answers on a CPU in a realistic amount of time. However, my main message remains the same - deep learning tastes much better when served with good GPUs.