tambetm/simple_dqn

slow training speed in latest code?


mw66 commented

I just tried the latest code and found that training slowed down significantly: it used to run at more than 200 steps_per_second, but now it's at ~100 steps_per_second.

2017-09-24 15:08:08,844 Epoch #169
2017-09-24 15:08:08,844 Training for 250000 steps
2017-09-24 15:43:51,299 num_games: 1101, average_reward: 24.793824, min_game_reward: 0, max_game_reward: 400
2017-09-24 15:43:51,299 last_exploration_rate: 0.100000, epoch_time: 2143s, steps_per_second: 116
2017-09-24 15:43:51,300 Saving weights to snapshots/breakout_169.prm
2017-09-24 15:43:51,325 Testing for 125000 steps
2017-09-24 15:54:51,175 num_games: 70, average_reward: 240.414286, min_game_reward: 31, max_game_reward: 426
2017-09-24 15:54:51,175 last_exploration_rate: 0.050000, epoch_time: 660s, steps_per_second: 189

I wonder what recent change may have caused the slowdown.

Unfortunately I'm not actively working on this codebase any more, but I would be happy to accept a PR. Leaving this open till then.

Hello! What is the usual training speed on CPU?
I get only 7 steps_per_second (14 steps_per_second with MKL), which is very slow. Is it possible to improve the CPU performance of this algorithm?

Unfortunately not, you definitely need a GPU for training.

@tambetm, is it the issue of the Q-learning algorithm itself or of the current implementation? I mean, do you have an idea whether I could reach higher performance with Q-learning on CPU using other tools and frameworks?

All reinforcement learning algorithms are compute intensive. Asynchronous Advantage Actor-Critic (A3C) is the one that can be more easily parallelized to use multiple CPUs. Search "a3c github" for some example implementations.
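To illustrate the parallelization idea: the sketch below (not part of simple_dqn, and with the rollout loop reduced to a hypothetical step counter) shows the A3C-style pattern of running several actor-learner workers on separate CPU processes, each with its own environment copy, and aggregating their results.

```python
# Minimal sketch of A3C-style CPU parallelism (assumed structure, not the
# simple_dqn code): each worker process would own an environment copy and
# push asynchronous updates; here each one just counts its simulated steps.
import multiprocessing as mp


def worker(steps, out_q):
    # Hypothetical rollout loop: a real worker would step its environment
    # and compute gradients against shared parameters.
    local_steps = 0
    for _ in range(steps):
        local_steps += 1
    out_q.put(local_steps)


def run(num_workers=4, steps_per_worker=1000):
    q = mp.Queue()
    procs = [mp.Process(target=worker, args=(steps_per_worker, q))
             for _ in range(num_workers)]
    for p in procs:
        p.start()
    # Drain the queue before joining so workers aren't blocked on a full pipe.
    total = sum(q.get() for _ in procs)
    for p in procs:
        p.join()
    return total


if __name__ == "__main__":
    print(run())
```

Because the workers run in separate processes, throughput scales with CPU cores rather than being limited by Python's GIL, which is why A3C is a better fit for CPU-only machines than single-threaded DQN.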

How do you find the steps per second? I am also running on a CPU and just get this output, which then hangs:

./train.sh Breakout-v0 --environment gym
No handlers could be found for logger "gym.envs.registration"

But top shows it's processing away.
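As an aside, the "No handlers could be found for logger" message is a Python 2 logging warning rather than an error: gym logs through the stdlib logging module and no handler has been configured. A common fix (sketched here, not something simple_dqn does itself) is to configure a basic handler before training starts:

```python
# Configure a root handler so log records from gym (and the training loop)
# are printed instead of triggering the "No handlers could be found" warning.
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(message)s",
)

# Messages from this logger will now reach stdout/stderr.
logging.getLogger("gym.envs.registration").info("logging is configured")
```

With a handler in place, the per-epoch statistics (including steps_per_second) should appear as they are logged.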

mw66 commented

it's on GPU.

So it's not possible to run it on a CPU at all?

Your error refers to logging and shouldn't stop the program from proceeding. Training is just slow, so it might take a while to print training statistics. You can run the pre-trained Pong and Breakout models against the game without a GPU.