tambetm/simple_dqn

slow training speed in latest code?


mw66 commented

I just tried the latest code and found that training slowed down significantly: it used to run at more than 200 steps_per_second, but now it's at ~100 steps_per_second.

2017-09-24 15:08:08,844 Epoch #169
2017-09-24 15:08:08,844 Training for 250000 steps
2017-09-24 15:43:51,299 num_games: 1101, average_reward: 24.793824, min_game_reward: 0, max_game_reward: 400
2017-09-24 15:43:51,299 last_exploration_rate: 0.100000, epoch_time: 2143s, steps_per_second: 116
2017-09-24 15:43:51,300 Saving weights to snapshots/breakout_169.prm
2017-09-24 15:43:51,325 Testing for 125000 steps
2017-09-24 15:54:51,175 num_games: 70, average_reward: 240.414286, min_game_reward: 31, max_game_reward: 426
2017-09-24 15:54:51,175 last_exploration_rate: 0.050000, epoch_time: 660s, steps_per_second: 189

I wonder what recent change may have caused the slowdown.

Unfortunately I'm not actively working on this codebase any more, but I would be happy to accept a PR. Leaving this open till then.

Hello! What is the usual training speed on CPU?
I get only 7 steps_per_second (14 steps_per_second with MKL), which is very slow. Is it possible to improve the CPU performance of this algorithm?

Unfortunately not, you definitely need a GPU for training.

@tambetm, is it the issue of the Q-learning algorithm itself or of the current implementation? I mean, do you have an idea whether I could reach higher performance with Q-learning on CPU using other tools and frameworks?

All reinforcement learning algorithms are compute intensive. Asynchronous Advantage Actor-Critic (A3C) is the one that can be more easily parallelized to use multiple CPUs. Search "a3c github" for some example implementations.
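To illustrate the parallelization idea: the sketch below (not part of simple_dqn, and with the rollout loop reduced to a hypothetical step counter) shows the A3C-style pattern of running several actor-learner workers on separate CPU processes, each with its own environment copy, and aggregating their results.

```python
# Minimal sketch of A3C-style CPU parallelism (assumed structure, not the
# simple_dqn code): each worker process would own an environment copy and
# push asynchronous updates; here each one just counts its simulated steps.
import multiprocessing as mp


def worker(steps, out_q):
    # Hypothetical rollout loop: a real worker would step its environment
    # and compute gradients against shared parameters.
    local_steps = 0
    for _ in range(steps):
        local_steps += 1
    out_q.put(local_steps)


def run(num_workers=4, steps_per_worker=1000):
    q = mp.Queue()
    procs = [mp.Process(target=worker, args=(steps_per_worker, q))
             for _ in range(num_workers)]
    for p in procs:
        p.start()
    # Drain the queue before joining so workers aren't blocked on a full pipe.
    total = sum(q.get() for _ in procs)
    for p in procs:
        p.join()
    return total


if __name__ == "__main__":
    print(run())
```

Because the workers run in separate processes, throughput scales with CPU cores rather than being limited by Python's GIL, which is why A3C is a better fit for CPU-only machines than single-threaded DQN.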

How do you find the steps per second? I am also running on a CPU and just get this output, which then hangs:

./train.sh Breakout-v0 --environment gym
No handlers could be found for logger "gym.envs.registration"

But top shows it's processing away.
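As an aside, the "No handlers could be found for logger" message is a Python 2 logging warning rather than an error: gym logs through the stdlib logging module and no handler has been configured. A common fix (sketched here, not something simple_dqn does itself) is to configure a basic handler before training starts:

```python
# Configure a root handler so log records from gym (and the training loop)
# are printed instead of triggering the "No handlers could be found" warning.
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(message)s",
)

# Messages from this logger will now reach stdout/stderr.
logging.getLogger("gym.envs.registration").info("logging is configured")
```

With a handler in place, the per-epoch statistics (including steps_per_second) should appear as they are logged.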

mw66 commented

it's on GPU.

So it's not possible to run it on a CPU at all?

Your error refers to logging and shouldn't stop the program from proceeding. Training is just slow, so it might take a while to print training statistics. You can run the pre-trained Pong and Breakout models against the game without a GPU.