Deep-Reinforcement-Learning

Goal

In this competition, you will train an RL agent to play Flappy Bird using a policy gradient method.
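As a rough illustration of what a policy gradient update involves, here is a minimal REINFORCE-style sketch in plain Python. It is not the competition's reference implementation: the two-action logistic policy, the hand-picked state features, and the names `discounted_returns`, `flap_probability`, and `reinforce_update` are all assumptions made for this example. A real agent would more likely use a neural network over game frames or richer features.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def flap_probability(weights, state):
    """Hypothetical logistic policy: probability of action 1 (flap)."""
    z = sum(w * s for w, s in zip(weights, state))
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_update(weights, episode, lr=0.01, gamma=0.99):
    """One REINFORCE step over a finished episode.

    `episode` is a list of (state, action, reward) tuples, where `state` is a
    tuple of floats (assumed features, e.g. a bias term and the vertical
    distance to the next gap) and `action` is 0 (do nothing) or 1 (flap).
    """
    states, actions, rewards = zip(*episode)
    returns = discounted_returns(rewards, gamma)
    new_weights = list(weights)
    for state, action, g in zip(states, actions, returns):
        p = flap_probability(weights, state)
        # For a Bernoulli-logistic policy, grad log pi(a|s) = (a - p) * s.
        for i, feature in enumerate(state):
            new_weights[i] += lr * g * (action - p) * feature
    return new_weights


# Usage: a one-step episode where flapping earned a reward of 1.0.
weights = reinforce_update([0.0, 0.0], [((1.0, 0.0), 1, 1.0)], lr=0.1)
```

The update raises the probability of actions that were followed by high returns and lowers it for actions followed by low returns. In practice a baseline is usually subtracted from the returns to reduce the variance of the gradient estimate.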