theFoxofSky/ddfnet

Get feature map filled with zeros in FilterNorm

Closed this issue · 8 comments

Hi, in the build_channel_branch function, FilterNorm with filter_type='channel' comes right after AdaptiveAvgPool2d, so it is expected to receive an N\*C\*1\*1 feature map as input. But in FilterNorm you take the mean over dim 2, which returns the feature map's own values because that dim is a singleton.

ddfnet/ddf/ddf.py

Lines 158 to 170 in b3be7fa

```python
elif self.filter_type == 'channel':
    b = x.size(0)
    c = self.in_channels
    x = x.reshape(b, c, -1)
    x = x - x.mean(dim=2).reshape(b, c, 1)
    x = x / (x.std(dim=2).reshape(b, c, 1) + 1e-10)
    x = x.reshape(b, -1)
    if self.runing_std:
        x = x * self.std[None, :]
    else:
        x = x * self.std
    if self.runing_mean:
        x = x + self.mean[None, :]
```

So line 162 will always return an all-zero feature map, and line 163 a NaN map.

Is it a potential bug?
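The claim is easy to check in isolation. A minimal sketch of the suspected failure mode, using standalone tensors rather than the actual FilterNorm module (sizes N=2, C=4 are illustrative):

```python
import torch

# With a (N, C, 1, 1) input, the reduction dim after reshape is a singleton:
b, c = 2, 4
x = torch.randn(b, c, 1, 1)
x = x.reshape(b, c, -1)                        # (b, c, 1)
centered = x - x.mean(dim=2).reshape(b, c, 1)  # mean of one element == the element
print(torch.all(centered == 0))                # tensor(True): always exactly zero
# std over a single element is NaN (unbiased estimator divides by n-1 = 0):
normed = centered / (centered.std(dim=2).reshape(b, c, 1) + 1e-10)
print(torch.isnan(normed).any())               # tensor(True)
```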

Not N \* C \* 1 \* 1 but N \* C \* k \* k.

FilterNorm is not used after AdaptiveAvgPool2d, but applied after the SE structure.

> Not N \* C \* 1 \* 1 but N \* C \* k \* k.
>
> FilterNorm is not used after AdaptiveAvgPool2d, but applied after the SE structure.

It is used after AdaptiveAvgPool2d:

ddfnet/ddf/ddf.py

Lines 191 to 200 in b3be7fa

```python
def build_channel_branch(in_channels, kernel_size,
                         nonlinearity='relu', se_ratio=0.2):
    assert se_ratio > 0
    mid_channels = int(in_channels * se_ratio)
    return nn.Sequential(
        nn.AdaptiveAvgPool2d((1, 1)),
        nn.Conv2d(in_channels, mid_channels, 1),
        nn.ReLU(True),
        nn.Conv2d(mid_channels, in_channels * kernel_size ** 2, 1),
        FilterNorm(in_channels, kernel_size, 'channel', nonlinearity,
                   running_std=True))
```

Please reopen the issue if you don't mind

As in the code you mentioned, FilterNorm is applied after `nn.Conv2d(mid_channels, in_channels * kernel_size ** 2, 1)`, not after `nn.AdaptiveAvgPool2d((1, 1))`.

So, FilterNorm is not used after AdaptiveAvgPool2d, but applied after the SE structure. Because of the SE structure, the input size will be N \* C \* k \* k.

No matter what feature map goes into channel_branch, AdaptiveAvgPool2d (line 196) spatially squeezes it to 1\*1, and the Conv2d with kernel=1 (line 197), ReLU (line 198), and Conv2d with kernel=1 (line 199) do not change the spatial size. So the feature map fed to FilterNorm in line 200 will always be 1\*1 spatially.
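This shape argument can be verified directly. A sketch tracing the channel branch up to (but not including) FilterNorm, with illustrative sizes (in_channels=8, kernel_size=3, se_ratio=0.25; not the repo defaults):

```python
import torch
import torch.nn as nn

in_channels, k, se_ratio = 8, 3, 0.25
mid = int(in_channels * se_ratio)
branch = nn.Sequential(
    nn.AdaptiveAvgPool2d((1, 1)),             # (N, C, H, W) -> (N, C, 1, 1)
    nn.Conv2d(in_channels, mid, 1),           # (N, mid, 1, 1)
    nn.ReLU(True),
    nn.Conv2d(mid, in_channels * k ** 2, 1),  # (N, C*k*k, 1, 1)
)
x = torch.randn(2, in_channels, 16, 16)
print(branch(x).shape)  # torch.Size([2, 72, 1, 1]): spatially 1x1, C*k*k channels
```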

Oh, I get your point. Yes, strictly speaking, the feature map is 1 \* 1 spatially. However, its full shape is (N, C \* k \* k, 1, 1). So, if you reshape it in FilterNorm, it becomes (N, C, k \* k). Either way, the number of elements is N \* C \* k \* k.
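The reshape in question can be sketched on its own (values N=2, C=8, k=3 are illustrative, mirroring `x.reshape(b, c, -1)` from the snippet above):

```python
import torch

N, C, k = 2, 8, 3
x = torch.randn(N, C * k * k, 1, 1)  # output shape of the channel branch
x = x.reshape(N, C, -1)              # (N, C, k*k): dim 2 holds the k*k filter weights
print(x.shape)                       # torch.Size([2, 8, 9])

# With kernel_size=1 the same reshape leaves dim 2 a singleton,
# which is exactly the degenerate case discussed above:
y = torch.randn(N, C * 1 * 1, 1, 1).reshape(N, C, -1)
print(y.shape)                       # torch.Size([2, 8, 1])
```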

OK, now I see why I encountered this error.

When instantiating a DDFPack module, I passed kernel_size=1, which makes (N, C, k, k) become (N, C, 1, 1) and triggers that bug.

ddfnet/ddf/ddf.py

Lines 203 to 217 in b3be7fa

```python
class DDFPack(nn.Module):
    def __init__(self, in_channels, kernel_size=3, stride=1, dilation=1, head=1,
                 se_ratio=0.2, nonlinearity='relu', gen_kernel_size=1,
                 kernel_combine='mul'):
        super(DDFPack, self).__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        self.dilation = dilation
        self.head = head
        self.kernel_combine = kernel_combine
        self.spatial_branch = build_spatial_branch(
            in_channels, kernel_size, head, nonlinearity, stride, gen_kernel_size)
        self.channel_branch = build_channel_branch(
            in_channels, kernel_size, nonlinearity, se_ratio)
```

I suggest adding an assertion before calling build_channel_branch to ensure that kernel_size ≠ 1 in this branch.
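A minimal sketch of such a guard (hypothetical helper; the actual placement and message in the repo may differ):

```python
def check_ddf_kernel_size(kernel_size):
    """Reject kernel_size=1: FilterNorm's per-channel std would then be
    computed over a single element, yielding zeros and NaNs (hypothetical
    helper illustrating the suggested assertion)."""
    assert kernel_size > 1, (
        'DDFPack channel branch needs kernel_size > 1; '
        'FilterNorm would normalize over a single element otherwise')

check_ddf_kernel_size(3)  # passes silently
try:
    check_ddf_kernel_size(1)
except AssertionError as e:
    print('rejected:', e)
```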

Thanks for the advice, I have updated it