Why is deep learning used in this game?
gaurav2695 opened this issue · 4 comments
As I found on the internet, this game can be built without the use of deep learning [https://github.com/chncyhn/flappybird-qlearning-bot].
So can you help me understand what is more beneficial about using deep learning in this game rather than simply using Q-learning?
@gaurav2695 Because in raw Q-learning you have to define the state representation manually, and it's difficult to come up with a good one. In the example you referred to, the author redefined the state to obtain a better value function. In approximate Q-learning, these hand-crafted rules that identify whether a state is good or bad are called features, and you have to design the features yourself as well.
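To make that concrete, here is a minimal tabular Q-learning sketch with a hand-discretized state, similar in spirit to the linked bot. The state variables, bin size, and constants are illustrative assumptions, not values taken from that repo:

```python
# Sketch: tabular Q-learning over a hand-crafted state.
# All names and bin sizes here are illustrative, not chncyhn's actual ones.
Q = {}  # Q-table: (state, action) -> value

def discretize_state(bird_y, pipe_x, gap_y, grid=10):
    """Bin raw game coordinates so the Q-table stays small."""
    dx = pipe_x // grid             # horizontal distance to the next pipe, binned
    dy = (gap_y - bird_y) // grid   # vertical offset to the pipe gap, binned
    return (dx, dy)

def q(s, a):
    return Q.get((s, a), 0.0)

def update(s, a, r, s_next, alpha=0.7, gamma=0.95):
    # Standard tabular update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(q(s_next, b) for b in (0, 1))  # actions: 0 = glide, 1 = flap
    Q[(s, a)] = q(s, a) + alpha * (r + gamma * best_next - q(s, a))
```

The quality of `discretize_state` is exactly the hand-design burden mentioned above: pick the wrong variables or bins and the agent never learns.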
However, the story is different with a deep Q-learning network. All you have to do is feed the raw pixels of a frame into the neural network, and the features that determine whether a state is good or bad will be learned automatically.
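For comparison, a rough pixel-input Q-network sketch in tf.keras; the 80x80x4 stacked-frame input and layer sizes follow the common DQN recipe and shouldn't be read as this repo's exact architecture:

```python
# Sketch: a DQN-style Q-network over raw pixels (layer sizes are the usual
# DQN-paper recipe, used here for illustration).
import tensorflow as tf

def build_q_network(n_actions=2):
    # Input: 4 stacked 80x80 grayscale frames, per standard DQN preprocessing.
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 8, strides=4, activation="relu",
                               input_shape=(80, 80, 4)),
        tf.keras.layers.Conv2D(64, 4, strides=2, activation="relu"),
        tf.keras.layers.Conv2D(64, 3, strides=1, activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(n_actions),  # one Q-value per action: glide, flap
    ])
```

No hand-designed state here: the convolutional layers learn their own features from the frames.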
@ColdCodeCool Well, actually, with a NN you still very much have to define what is good and what is bad, too.
I'd instead just say that both approaches are valid, and there's also nothing about NNs that prevents you from using THEM to learn to play the game not from pixels, but from quantified features like in https://github.com/chncyhn/flappybird-qlearning-bot. And, of course, quantified approaches are going to work better; it's just that you also have to parse the scene. A lot of the time that is going to be fine for your application and not too hard to build, I'm guessing, and it may work much better depending on your data/environment.

Sometimes, as with robot visual navigation and the like, the best way to solve the problem really does involve convolutional thinking. But in other cases, like if you're actually building the best possible Flappy bot, I think you're much better off first getting a good understanding of the game, choosing features, parsing scenes, and defining proper update rules. That way you can end up with a simple, tiny, superfast, and superaccurate bot. And of course you don't have to do Q-learning for that; you can do it with the same backpropagating neural networks, e.g. a small network over the parsed features, as sketched below.
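A sketch of what I mean, assuming you've already parsed out a few features (the feature names here are hypothetical): the Q-network can be a tiny MLP instead of a CNN:

```python
# Sketch: same Q-learning targets, but a tiny MLP over hand-parsed features
# instead of raw pixels. Feature names are hypothetical.
import tensorflow as tf

def build_feature_q_network(n_features=3, n_actions=2):
    # e.g. features = (dx_to_pipe, dy_to_gap, bird_velocity)
    return tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu", input_shape=(n_features,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(n_actions),
    ])
```

A network like this is orders of magnitude smaller and faster than a pixel-input CNN, which is the point about tiny superfast bots above.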
One approach where you probably don't need much hand-crafted guidance to help the network is Evolution Strategies, but well..
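For reference, a minimal Evolution Strategies parameter update in the spirit of Salimans et al. 2017; `evaluate` is a hypothetical function that runs an episode with the given parameter vector and returns the score:

```python
# Sketch: one Evolution Strategies update step (Salimans et al. 2017 style).
# `evaluate` is a hypothetical episode-scoring function; constants are illustrative.
import numpy as np

def es_step(theta, evaluate, n_pop=50, sigma=0.1, lr=0.01):
    eps = np.random.randn(n_pop, theta.size)                 # population of perturbations
    rewards = np.array([evaluate(theta + sigma * e) for e in eps])
    advantage = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # normalize scores
    grad = eps.T @ advantage / (n_pop * sigma)               # score-function gradient estimate
    return theta + lr * grad
```

Only episode scores are needed, no per-step reward design, which is why it needs less hand-holding.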
Having a good understanding of the game is always critical if you want to make a performant bot, regardless of your choice of approximation algorithm. For example, @yenchenlin seems to have missed the point that if you're doing this kind of CNN solving, you definitely can't use sticks and carrots, only carrots: his approach can easily collapse into the bird simply going up all of the time, so he obviously needed to tune the rewards for that not to happen. Something like the shaping sketched below is what I mean.

I've made a similar convolutional-NN implementation for Flappy to yenchenlin's, but I thought a bit more about what the data and the game actually are, and I've got a top score of >1000 so far, trainable in a couple of hours, whereas Kevin Chen seems to have gotten a bit over 200 at best (https://pdfs.semanticscholar.org/b56c/7703337cb9db008422b9b3410c97fff8bb54.pdf). I'm also guessing that this repo's network is many, many times slower and larger than mine, which is <1.5 MB in size and doesn't use such huge kernels. And the Q-learning bot you linked, https://github.com/chncyhn/flappybird-qlearning-bot, had a way better score than I have so far. Though we may be playing slightly different versions of the game: I forked https://github.com/shalabhsingh/A3C_Keras_FlappyBird, but I'm guessing mine is just a cut-down version with less graphics; the pipe sizes and difficulty seem the same.
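One hypothetical reading of the "carrots only" point, as a sketch rather than anyone's actual reward function: reward survival and pipe passes, and let death just terminate the episode instead of handing out a large negative penalty:

```python
# Hypothetical "carrots only" reward shaping: no negative reward on death,
# the episode simply ends. Values are illustrative, not from any repo.
def reward(passed_pipe, crashed):
    if crashed:
        return 0.0                       # no "stick": termination is the punishment
    return 1.0 if passed_pipe else 0.1   # pipe bonus plus a small living bonus
```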
Posted my immortal bot https://github.com/ibmua/flappy/
The whole point is that you don't want to design a specific algorithm or hack for a single game.