awslabs/mxboard

Cannot get gradient array for Parameter 'hybridsequential2_batchnorm0_running_mean' because grad_req='null'

onlyNata opened this issue · 6 comments

net = nn.HybridSequential()
with net.name_scope():
    net.add(nn.Conv2D(64, kernel_size=3, strides=1, padding=1),
            nn.BatchNorm(),
            nn.Activation('relu'))
......
grads = [i.grad() for i in net.collect_params().values()]
assert len(grads) == len(param_names)
for i, name in enumerate(param_names):
    sw.add_histogram(tag=name, values=grads[i], global_step=epoch, bins=1000)

File "F:\Anaconda3\envs\gluon\lib\site-packages\mxnet\gluon\parameter.py", line 522, in grad
"because grad_req='null'"%(self.name))

RuntimeError: Cannot get gradient array for Parameter 'hybridsequential2_batchnorm0_running_mean' because grad_req='null'

szha commented

try replacing
grads = [i.grad() for i in net.collect_params().values()]
with
grads = [i.grad() for i in net.collect_params().values() if i.grad_req != 'null']
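One caveat: filtering only the grads list breaks the pairing with param_names (and the length assertion) in the original snippet. A minimal sketch that keeps names and gradients aligned, assuming net, sw, and epoch are as in the original code:

# Filter once on the Parameter objects so names and grads stay paired.
params = [p for p in net.collect_params().values() if p.grad_req != 'null']
param_names = [p.name for p in params]
grads = [p.grad() for p in params]
assert len(grads) == len(param_names)
for name, grad in zip(param_names, grads):
    sw.add_histogram(tag=name, values=grad, global_step=epoch, bins=1000)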

@onlyNata Same problem here. Have you solved it?

I can confirm that @szha 's solution resolved this error. However, I am now encountering a different error that may or may not be related. I don't want to hijack this thread, so I will open another issue and link it from here if I cannot troubleshoot it quickly.

Thanks, @tcfkaj. But @szha 's solution only retrieves gradients for a subset of the parameters rather than all of them. I am confused why it still does not work even after explicitly setting model.collect_params().setattr('grad_req', 'write').

@BebDong In the case of BatchNorm, it makes sense that running_mean and running_var are not writeable and thus not trainable: their job is just to track batch-level or global-level statistics, not to be updated by gradients. You can see in the source that they are always created with grad_req='null' and differentiable=False. That is also how the restriction is enforced: the grad_req setter silently resets any non-differentiable parameter back to 'null', so setattr('grad_req', 'write') is a no-op for these two parameters (the gamma and beta parameters of BatchNorm remain trainable as usual). This makes sense for parameters of this kind.
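A quick way to see this behavior, as a sketch assuming an MXNet 1.x Gluon environment: even after forcing grad_req='write' on every parameter, the running statistics still report 'null'.

from mxnet.gluon import nn

net = nn.HybridSequential()
with net.name_scope():
    net.add(nn.Conv2D(64, kernel_size=3, strides=1, padding=1),
            nn.BatchNorm(),
            nn.Activation('relu'))

# Try to force gradients everywhere, then inspect what actually stuck.
net.collect_params().setattr('grad_req', 'write')
for name, param in net.collect_params().items():
    # running_mean / running_var still print 'null': they were created with
    # differentiable=False, so the grad_req setter coerces them back to 'null'.
    print(name, param.grad_req)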

@tcfkaj Thanks! It helps a lot.