Kaixhin/ACER

Doubt about gradient transfer to shared model

fedetask opened this issue · 1 comment

First of all, thanks for publishing this work! It's been really helpful for learning many RL concepts in practice!

In train.py the gradient transfer from the local model to the shared model is performed by:

# Transfers gradients from thread-specific model to shared model
def _transfer_grads_to_shared_model(model, shared_model):
  for param, shared_param in zip(model.parameters(), shared_model.parameters()):
    if shared_param.grad is not None:
      return
    shared_param._grad = param.grad

I have a few questions, as this function looks suboptimal to me:

  • Does the function return if shared_param.grad is not None in order to prevent multiple workers from copying gradients at the same time?
  • If that is the case, it looks like those gradients are thrown away; is that correct? (See the small illustration below.)
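
To make the second point concrete, here is a tiny single-process illustration; worker_a, worker_b and shared are made-up nn.Linear modules I use for the example, not code from the repo:

import torch
from torch import nn

# Single-process illustration of the early return: the second call hits
# "shared_param.grad is not None" on the first parameter and returns, so
# worker_b's freshly computed gradients are never assigned to the shared model.
shared = nn.Linear(2, 2)
worker_a, worker_b = nn.Linear(2, 2), nn.Linear(2, 2)
for w in (worker_a, worker_b):
  w(torch.randn(1, 2)).sum().backward()

_transfer_grads_to_shared_model(worker_a, shared)  # shared grads now alias worker_a's grads
_transfer_grads_to_shared_model(worker_b, shared)  # returns immediately; worker_b's grads are not transferred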

If my understanding is not completely wrong, I was wondering whether there is a reason for not using a lock when transferring the gradients. Could the function instead be written as follows?

# Transfers gradients from thread-specific model to shared model
def _transfer_grads_to_shared_model(model, shared_model, lock):
  with lock:
    for param, shared_param in zip(model.parameters(), shared_model.parameters()):
      shared_param._grad = param.grad

The same may also apply to the optimizer.step() call.
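
For reference, here is a minimal sketch of how I imagine the lock being used, assuming it is created once in the main process with torch.multiprocessing and passed to every worker; the helper name _update_shared_model and the surrounding wiring are mine, not names from train.py:

import torch.multiprocessing as mp

# Hypothetical helper: one critical section covers both the gradient transfer
# and the optimizer step, so no other worker can interleave between the two.
def _update_shared_model(model, shared_model, optimizer, lock):
  with lock:
    for param, shared_param in zip(model.parameters(), shared_model.parameters()):
      shared_param._grad = param.grad
    optimizer.step()

# In the main process (illustrative wiring):
#   lock = mp.Lock()
#   mp.Process(target=train, args=(shared_model, optimizer, lock, ...)).start()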

It's been a long time since I coded this so I can't say for sure, but I think you're right on all counts. I never looked into locks, but indeed it may be a better way to do this.