slightly expand description of DDPG in sec 35.3.5
murphyk opened this issue · 0 comments
Currently it just says: "The DDPG algorithm of [Lil+16], which stands for “deep deterministic policy gradient”, uses the DQN method (Section 35.2.6) to update Q that is represented by deep neural networks." We should expand on this a little bit.
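As a rough starting point for the expansion, here is a minimal NumPy sketch of the DQN-style critic update that DDPG relies on, as I understand [Lil+16]: the TD target bootstraps from a target critic evaluated at the target actor's action, and the target networks track the online ones by slow averaging. The function names (`ddpg_critic_target`, `soft_update`), the toy linear actor and quadratic critic, and the values of `gamma` and `tau` are illustrative assumptions, not taken from the book.

```python
import numpy as np

def ddpg_critic_target(reward, next_state, done, target_actor, target_critic, gamma=0.99):
    """TD target y = r + gamma * (1 - done) * Q'(s', mu'(s')).

    target_actor and target_critic are hypothetical callables standing in
    for the target networks mu' and Q'.
    """
    next_action = target_actor(next_state)           # deterministic action mu'(s')
    next_q = target_critic(next_state, next_action)  # bootstrap value Q'(s', mu'(s'))
    return reward + gamma * (1.0 - done) * next_q

def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * wt for w, wt in zip(online_params, target_params)]

# Toy stand-ins for the target networks (assumptions for illustration only).
toy_actor = lambda s: -0.5 * s
toy_critic = lambda s, a: -(s ** 2 + a ** 2).sum(axis=-1)

reward = np.array([1.0, 0.0])
next_state = np.array([[2.0], [4.0]])
done = np.array([0.0, 1.0])  # the second transition is terminal, so no bootstrap

y = ddpg_critic_target(reward, next_state, done, toy_actor, toy_critic, gamma=0.5)
new_target = soft_update([np.array([0.0])], [np.array([1.0])], tau=0.1)
```

The critic would then be regressed toward `y` with a squared-error loss, and the actor updated by ascending the gradient of Q(s, mu(s)) with respect to the actor parameters. That second step is what separates DDPG from plain DQN and is probably worth a sentence in the text.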