Opened this issue 3 years ago · 0 comments
deep_reinforcement_learning_gallery/R2D2/atari/r2d2_main.py
Line 163 in 8a9e329
deep_reinforcement_learning_gallery/R2D2/atari/actor.py
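The discount factor applied to `next_maxQ` should be `gamma ** n_step`, not a single-step `gamma`, because the bootstrap value sits n steps ahead of the current state.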
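For reference, a minimal sketch of the n-step target this issue is asking for. The function name `n_step_td_target` and its arguments are hypothetical and not taken from the repository, and the sketch leaves out any value rescaling the actual R2D2 code may apply:

```python
def n_step_td_target(rewards, next_max_q, gamma, done=False):
    """Hypothetical helper: n-step TD target for one transition.

    rewards    -- the n rewards r_t, ..., r_{t+n-1} collected over n steps
    next_max_q -- max_a Q(s_{t+n}, a), the bootstrap value n steps ahead
    gamma      -- per-step discount factor
    done       -- True if the episode ended within the n steps
    """
    n_step = len(rewards)
    # Discounted sum of the n intermediate rewards.
    n_step_return = sum(gamma ** k * r for k, r in enumerate(rewards))
    # The bootstrap value is n steps away, so it is discounted by gamma ** n_step.
    bootstrap = 0.0 if done else gamma ** n_step * next_max_q
    return n_step_return + bootstrap


target = n_step_td_target([1.0, 1.0, 1.0], next_max_q=10.0, gamma=0.5)
print(target)  # 3.0 = (1 + 0.5 + 0.25) + 0.5 ** 3 * 10
```

With a single-step `gamma` on `next_maxQ`, the same inputs would give 1.75 + 0.5 * 10 = 6.75, so the bootstrap term is overweighted whenever `n_step > 1`.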