JierunChen/FasterNet

FasterNet as backbone for Object Detection


Hello author, when the model reaches the PatchMerging module, the parameter count becomes unusually large and the program raises a RuntimeError. The layer summary and error are shown below:

        from  n      params  module                      arguments
  0       -1  1        3200  models.common.PatchEmbed    [3, 64, 4, 4]
  1       -1  1       18944  models.common.MLPBlock      [64, 64]
  2       -1  1   134217984  models.common.PatchMerging  [64, 128]
  3       -1  2      150528  models.common.MLPBlock      [128, 128]
  4       -1  1  2147484160  models.common.PatchMerging  [128, 256]
  5       -1  2      600064  models.common.MLPBlock      [256, 256]
  
 RuntimeError: [enforce fail at ..\c10\core\CPUAllocator.cpp:75] data. DefaultCPUAllocator: not enough memory: you tried to allocate 137438953472 bytes. Buy new RAM!

Why does this happen? The FLOPs and parameter counts of the trained models posted by the author are relatively low.
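One thing I noticed while debugging (just an observation, I am not sure it is the actual cause): the oversized layers match exactly what a Conv2d with a huge kernel would produce, as if the second value in the arguments column were being consumed as patch_size2 instead of as an output-channel count:

import torch.nn as nn

# Hypothetical reproduction of the counts above: what if layer 2 were built as
# PatchMerging(dim=64, patch_size2=128), i.e. the second argument ended up in patch_size2?
conv = nn.Conv2d(64, 2 * 64, kernel_size=128, stride=128, bias=False)
bn = nn.BatchNorm2d(2 * 64)
print(sum(p.numel() for p in conv.parameters()) +
      sum(p.numel() for p in bn.parameters()))   # 134217984 -> matches layer 2

# Same guess for layer 4: PatchMerging(dim=128, patch_size2=256)
conv = nn.Conv2d(128, 2 * 128, kernel_size=256, stride=256, bias=False)
bn = nn.BatchNorm2d(2 * 128)
print(sum(p.numel() for p in conv.parameters()) +
      sum(p.numel() for p in bn.parameters()))   # 2147484160 -> matches layer 4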

This is my module code

import torch
import torch.nn as nn
from torch import Tensor
from typing import List

# DropPath from timm, as used in the original FasterNet code
from timm.models.layers import DropPath


class Partial_conv3(nn.Module):

    def __init__(self, dim, n_div):
        super().__init__()
        self.dim_conv3 = dim // n_div
        self.dim_untouched = dim - self.dim_conv3
        self.partial_conv3 = nn.Conv2d(self.dim_conv3, self.dim_conv3, 3, 1, 1, bias=False)

    def forward(self, x: Tensor) -> Tensor:
        # for training/inference
        x1, x2 = torch.split(x, [self.dim_conv3, self.dim_untouched], dim=1)
        x1 = self.partial_conv3(x1)
        x = torch.cat((x1, x2), 1)

        return x


class MLPBlock(nn.Module):
    def __init__(self,
                 dim,
                 drop_path=0,
                 n_div=4,
                 mlp_ratio=2.
                 ):
        super().__init__()
        self.dim = dim
        self.mlp_ratio = mlp_ratio
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()
        self.n_div = n_div

        mlp_hidden_dim = int(dim * mlp_ratio)

        mlp_layer: List[nn.Module] = [
            nn.Conv2d(dim, mlp_hidden_dim, 1, bias=False),
            nn.BatchNorm2d(mlp_hidden_dim),
            nn.ReLU(),
            nn.Conv2d(mlp_hidden_dim, dim, 1, bias=False)
        ]

        self.mlp = nn.Sequential(*mlp_layer)

        self.spatial_mixing = Partial_conv3(
            dim,
            n_div,
        )

    def forward(self, x: Tensor) -> Tensor:
        shortcut = x
        x = self.spatial_mixing(x)
        x = shortcut + self.drop_path(self.mlp(x))
        return x



class PatchEmbed(nn.Module):

    def __init__(self, in_chans, embed_dim, patch_size=4, patch_stride=4, bias=False):
        super().__init__()
        self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size, stride=patch_stride, bias=bias)
        self.norm = nn.BatchNorm2d(embed_dim)

    def forward(self, x: Tensor) -> Tensor:
        x = self.norm(self.proj(x))
        return x


class PatchMerging(nn.Module):

    def __init__(self, dim, patch_size2=2, patch_stride2=2):
        super().__init__()
        self.reduction = nn.Conv2d(dim, 2 * dim, kernel_size=patch_size2, stride=patch_stride2, bias=False)
        self.norm = nn.BatchNorm2d(2 * dim)

    def forward(self, x: Tensor) -> Tensor:
        x = self.norm(self.reduction(x))
        return x
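For comparison, constructing the modules above directly with the default patch_size2=2 gives reasonable parameter counts (a quick standalone check, run outside the detection framework, so no width_multiple scaling is involved):

import torch

b = MLPBlock(64)      # Partial_conv3 + 1x1-conv MLP
m = PatchMerging(64)  # Conv2d(64, 128, k=2, s=2) + BatchNorm2d(128)
print(sum(p.numel() for p in b.parameters()))  # 18944, same as layer 1 above
print(sum(p.numel() for p in m.parameters()))  # 33024, nothing unusual

x = torch.randn(1, 64, 160, 160)
print(b(x).shape)  # torch.Size([1, 64, 160, 160])
print(m(x).shape)  # torch.Size([1, 128, 80, 80])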

This is my model configuration file

depth_multiple: 1.0  # model depth multiple
width_multiple: 0.5  # layer channel multiple
backbone:
  # [from, number, module, args]
  [[-1, 1, PatchEmbed, [128]],    # 0-P2/4 [640, 640, 3] -> [160, 160, 128]
   [-1, 1, MLPBlock, [128]],       # 1
   [-1, 1, PatchMerging, [256]],   # 2-P3/8 [160, 160, 128] -> [80, 80, 256]
   [-1, 2, MLPBlock, [256]],       # 3
   [-1, 1, PatchMerging, [512]],   # 4-P4/16 [80, 80, 256] -> [40, 40, 512]
   [-1, 2, MLPBlock, [512]],       # 5
   [-1, 1, PatchMerging, [1024]],  # 6-P5/32 [40, 40, 512] -> [20, 20, 1024]
   [-1, 2, MLPBlock, [1024]],      # 7

   [-1, 1, SPPCSPC, [1024]], # 8 [20, 20, 1024] -> [20, 20, 1024]
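If the cause really is that the detection parser passes [c1, c2] to the module positionally, so that c2 lands in patch_size2, one workaround I am considering (just a sketch of my own, not the authors' code) is to let the merging layer take its output channels explicitly:

class PatchMergingDet(nn.Module):
    # Hypothetical variant for YOLO-style configs where args arrive as [c1, c2]:
    # accept the output channels explicitly instead of hard-coding 2 * dim.
    def __init__(self, c1, c2, patch_size2=2, patch_stride2=2):
        super().__init__()
        self.reduction = nn.Conv2d(c1, c2, kernel_size=patch_size2,
                                   stride=patch_stride2, bias=False)
        self.norm = nn.BatchNorm2d(c2)

    def forward(self, x: Tensor) -> Tensor:
        return self.norm(self.reduction(x))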