simoninithomas/Deep_reinforcement_learning_Course

Deep Q learning: Algorithm

wailker3 opened this issue · 0 comments

The author uses only one CNN to compute both the estimated Q-value and the target Q-value. Shouldn't there be a second CNN (a separate target network) to compute the target Q-value?
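
To make the question concrete, here is a minimal sketch of the two-network setup I have in mind. It uses a linear function as a stand-in for the CNN, and every name and value in it (`q_values`, `td_targets`, `sync_target`, `GAMMA`, the array sizes) is my own assumption for illustration, not code from the course notebook.

```python
import numpy as np

GAMMA = 0.9  # discount factor (assumed value, for illustration only)

def q_values(weights, states):
    """Linear stand-in for the CNN: returns one row of Q-values per state."""
    return states @ weights

def td_targets(target_weights, rewards, next_states, dones):
    """Compute r + gamma * max_a' Q_target(s', a'); no bootstrap on terminal steps."""
    next_q = q_values(target_weights, next_states).max(axis=1)
    return rewards + GAMMA * next_q * (1.0 - dones)

def sync_target(online_weights):
    """Hard update: copy the online weights into the target network."""
    return online_weights.copy()

rng = np.random.default_rng(0)
online_w = rng.normal(size=(4, 2))  # 4 state features, 2 actions (hypothetical sizes)
target_w = sync_target(online_w)    # target starts as a copy of the online network

# Simulate a training step that changes the online network only.
online_w += 0.1

rewards = np.array([1.0, 0.0])
next_states = rng.normal(size=(2, 4))
dones = np.array([0.0, 1.0])        # second transition ends the episode

# Targets come from the frozen copy, so they do not shift with each online update.
targets = td_targets(target_w, rewards, next_states, dones)
```

In this sketch the target weights stay fixed while the online weights are trained, and `sync_target` would be called again every fixed number of steps. With a single network, the targets move every time the weights are updated.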