Biased_Double_DQL_for_flappybird

A Double DQN is implemented to play Flappy Bird. I modified the exploration strategy so that the bird is more likely to stay idle than to flap its wings. As a result, after about 100K iterations the agent can play the game for about 4 minutes, while the original implementation needs 1 million iterations to play for about 10 seconds.
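
The idea can be illustrated with a minimal sketch, which is not the code from this repository. It assumes two actions (do nothing and flap) and an epsilon-greedy policy whose exploratory moves are skewed toward doing nothing. The names `NO_FLAP`, `FLAP`, `biased_epsilon_greedy`, `double_dqn_target` and the `flap_prob` value are hypothetical and chosen only for illustration; the actual bias used in training may differ. The second function shows the usual Double DQN target, where the online network selects the next action and the target network evaluates it.

```python
import random

# Hypothetical action indices, assumed for this sketch.
NO_FLAP, FLAP = 0, 1


def biased_epsilon_greedy(q_values, epsilon, flap_prob=0.1, rng=random):
    """Choose an action, with exploration biased toward NO_FLAP.

    q_values:  Q-value estimates indexed by action.
    epsilon:   probability of taking an exploratory action.
    flap_prob: assumed probability of flapping when exploring
               (0.5 would give the unbiased, uniform case).
    """
    if rng.random() < epsilon:
        # Exploratory move: flap only rarely, otherwise stay idle.
        return FLAP if rng.random() < flap_prob else NO_FLAP
    # Greedy move: the action with the highest estimated Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def double_dqn_target(reward, done, gamma, next_q_online, next_q_target):
    """Double DQN target for a single transition.

    The online network's Q-values pick the next action, and the
    target network's Q-values supply the value of that action.
    """
    if done:
        return reward
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]
```

A plausible reason for the bias is that flapping too often sends the bird into the upper pipe or the ceiling, so uniform exploration tends to end episodes early; exploring mostly with idle moves would keep the bird alive longer and produce more useful transitions.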