Double-DQN-MxNet-Gluon

DDQN.ipynb is the implementation of Double DQN (DDQN).

DDQN-efficent-ReplayBuffer.ipynb is the same code with a replay buffer that uses roughly 1/8 of the CPU memory: instead of storing both the current state and the successor state at each time step, it stores only the current frame.
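
The frame-only idea can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the class name `FrameReplayBuffer`, its parameters, and the 4-frame history are assumptions. With a 4-frame stack, storing the state plus the successor state costs 8 frames per time step, while storing only the newest frame costs 1, which is presumably where the 1/8 figure comes from. States are rebuilt from consecutive frames at sampling time.

```python
import numpy as np


class FrameReplayBuffer:
    """Hypothetical frame-based replay buffer (illustrative, not the notebook's code)."""

    def __init__(self, capacity, frame_shape, history=4):
        self.capacity = capacity
        self.history = history  # assumed number of stacked frames per state
        # One uint8 frame per time step instead of two stacked states.
        self.frames = np.zeros((capacity,) + tuple(frame_shape), dtype=np.uint8)
        self.actions = np.zeros(capacity, dtype=np.int64)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.dones = np.zeros(capacity, dtype=bool)
        self.pos = 0   # next write position in the ring buffer
        self.size = 0  # number of stored time steps

    def add(self, frame, action, reward, done):
        """Store the frame observed at step t and the action/reward/done taken from it."""
        self.frames[self.pos] = frame
        self.actions[self.pos] = action
        self.rewards[self.pos] = reward
        self.dones[self.pos] = done
        self.pos = (self.pos + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size, rng):
        """Rebuild (state, action, reward, next_state, done) batches from raw frames."""
        if self.size < self.history + 1:
            raise ValueError("not enough frames stored to build a transition")
        start = (self.pos - self.size) % self.capacity  # physical index of oldest frame
        states, actions, rewards, next_states, dones = [], [], [], [], []
        # Rejection sampling; assumes at least one valid window exists.
        while len(states) < batch_size:
            # Logical index j of the last frame of the state (oldest frame is 0).
            j = int(rng.integers(self.history - 1, self.size - 1))
            # Physical indices of logical frames j-history+1 .. j+1.
            window = (start + np.arange(j - self.history + 1, j + 2)) % self.capacity
            # Skip windows whose state would span an episode boundary.
            if self.dones[window[:-2]].any():
                continue
            t = window[-2]
            states.append(self.frames[window[:-1]])
            next_states.append(self.frames[window[1:]])
            actions.append(self.actions[t])
            rewards.append(self.rewards[t])
            dones.append(self.dones[t])
        return (np.stack(states), np.array(actions), np.array(rewards),
                np.stack(next_states), np.array(dones))
```

A state and its successor share all but one frame, so nothing is lost by storing each frame once. The cost is a little extra indexing work each time a batch is sampled.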