awjuliani/DeepRL-Agents

Double-Dueling-DQN: question about the target network update rate

oneQuery opened this issue · 0 comments

I ran into something I don't understand while working through Double-Dueling-DQN.ipynb.

The notebook defines the following function:

def updateTargetGraph(tfVars,tau):
    total_vars = len(tfVars)
    op_holder = []
    for idx, var in enumerate(tfVars[0:total_vars//2]):
        op_holder.append(tfVars[idx+total_vars//2].assign((var.value()*tau) + ((1-tau)*tfVars[idx+total_vars//2].value())))
    return op_holder

What is op_holder, and what role does it play?

I skimmed the Double DQN and Dueling DQN papers again, but I could not find anything about a 'rate to update target network', which this code calls 'tau'.
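
For reference, the arithmetic inside each assign in the function above can be sketched without TensorFlow. This is only a reading of the quoted code, not a statement of the notebook author's intent. It assumes the first half of tfVars holds the primary network's variables and the second half holds the target network's. The names soft_update, primary, and target are hypothetical and appear nowhere in the notebook.

```python
def soft_update(primary, target, tau):
    """Blend each target value toward the matching primary value.

    Mirrors the expression in updateTargetGraph:
        new_target = tau * primary + (1 - tau) * target
    All names here are illustrative, not taken from the notebook.
    """
    return [tau * p + (1.0 - tau) * t for p, t in zip(primary, target)]


primary = [1.0, 2.0]  # stand-in for the first half of tfVars
target = [0.0, 0.0]   # stand-in for the second half of tfVars

# tau = 1.0 copies the primary values outright.
hard = soft_update(primary, target, 1.0)

# A small tau moves the target only slightly on each call.
soft = soft_update(primary, target, 0.001)
```

Under this reading, op_holder would be a list containing one such assign operation per variable pair, built once and presumably run later in a session, with tau setting how far the target values move on each run.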