szq0214/DSOD

The result of the relu activation function isn't used in grp-dsod.

Opened this issue · 1 comment

Thanks for sharing the code of grp-dsod. I read the code and found that the result of the relu function isn't used in this part:

def global_level(net, from_layer, relu_name):
    # Single-output FC layer producing one gating scalar per sample
    fc = L.InnerProduct(net[relu_name], num_output=1)
    # Squash the gate into (0, 1)
    sigmoid = L.Sigmoid(fc, in_place=True)
    att_name = "{}_att".format(from_layer)
    # Flatten to a 1-D vector so it can scale along the batch axis
    sigmoid = L.Reshape(sigmoid, reshape_param=dict(shape=dict(dim=[-1])))
    # Scale the attention map by the per-sample gate
    scale = L.Scale(net[att_name], sigmoid, axis=0, bias_term=False, bias_filler=dict(value=0))
    relu = L.ReLU(scale, in_place=True)  # the result of this ReLU is never used below
    # The residual connection uses `scale` directly, not `relu`
    residual = L.Eltwise(net[from_layer], scale)
    gatt_name = "{}_gate".format(from_layer)
    net[gatt_name] = residual
    return net

relu = L.ReLU(scale, in_place=True)
Is it a mistake? Or was it deliberately discarded?

Hi @zjuzuhe, thanks for pointing this out. Actually, we did not use a relu following the scale operation in the global-level attention. I will remove or comment out this line soon. I'm also not sure whether using a relu here is helpful or not.
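
For reference, a quick sketch of how the function would look with that line commented out; this is only to illustrate the intended change and hasn't been committed yet (the import line is the usual Caffe NetSpec import assumed by the model generation script):

from caffe import layers as L  # standard pycaffe NetSpec import, assumed here

def global_level(net, from_layer, relu_name):
    fc = L.InnerProduct(net[relu_name], num_output=1)
    sigmoid = L.Sigmoid(fc, in_place=True)
    att_name = "{}_att".format(from_layer)
    sigmoid = L.Reshape(sigmoid, reshape_param=dict(shape=dict(dim=[-1])))
    scale = L.Scale(net[att_name], sigmoid, axis=0, bias_term=False, bias_filler=dict(value=0))
    # relu = L.ReLU(scale, in_place=True)  # unused; to be removed or left commented out
    residual = L.Eltwise(net[from_layer], scale)
    gatt_name = "{}_gate".format(from_layer)
    net[gatt_name] = residual
    return net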