ToughStoneX/DGCNN

Question about the order of maxpool and conv

Closed this issue · 2 comments

Hi, thanks for sharing your work. I have recently been using EdgeConv in my model as well, but I have a question about the implementation.

In both your implementation and the official implementation, the convolution is applied to a tensor of size [B, N, K, 2*F], and max pooling is then applied to get an output of size [B, N, an].

        # reshape, x_in: [B, 2*F, N*K]
        x_in = x_in.reshape([B, 2 * F, N * self.K])

        # shared MLP (1x1 conv), out: [B, an, N*K]
        out = self.mlp(x_in)
        _, an, _ = out.shape

        # reshape, out: [B, an, N, K]
        out = out.reshape([B, an, N, self.K])
        # reshape, out: [B, an*N, K]
        out = out.reshape([B, an * N, self.K])
        # max pool over the K neighbors, out: [B, an*N, 1]
        out = nn.MaxPool1d(self.K)(out)
        # reshape, out: [B, an, N]
        out = out.reshape([B, an, N])
        # permute, out: [B, N, an]
        out = out.permute(0, 2, 1)
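The reshape/pool pipeline above can be traced with plain NumPy. This is only a sketch: `self.mlp` is stood in for by a single per-position linear map (equivalent to a 1x1 convolution), and all sizes are made up for illustration.

```python
import numpy as np

# hypothetical sizes: B=batch, N=points, K=neighbors, F=input features, an=output channels
B, N, K, F, an = 2, 5, 4, 3, 7

rng = np.random.default_rng(0)
x_in = rng.standard_normal((B, 2 * F, N * K))  # edge features, flattened over N*K

# stand-in for self.mlp: a per-position linear map (1x1 conv)
W = rng.standard_normal((an, 2 * F))
out = np.einsum('of,bfp->bop', W, x_in)        # [B, an, N*K]

# same reshape / max-pool / permute sequence as the snippet above
out = out.reshape(B, an, N, K)
out = out.max(axis=3)                          # max pool over the K neighbors
out = out.transpose(0, 2, 1)                   # [B, N, an]

print(out.shape)  # (2, 5, 7)
```

The final shape [B, N, an] matches the comment in the snippet: one an-dimensional feature per point, aggregated over its K neighbors.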

But I wonder: why not do the max pooling first and then the convolution? Something like this:

# [B, N, K, 2*F] 
Maxpool
# [B, N, 2*F]
MLP
# [B, N, an]

I think the result would be the same, and many reshape operations could be removed. But the original implementation also applies the convolution first and then the max pooling. Do you have any idea why this order is used?

In the DGCNN paper, they propose building a dynamic graph over the local regions of the point cloud; in fact, the dynamic graph is built by kNN. The K dimension represents the local region (the K nearest neighbors) of each point. The convolution, a shared MLP, first transforms each edge feature individually, and the max pooling then aggregates the transformed features over that local region. If you pooled first, the neighborhood would be collapsed before the MLP could act on the individual edge features, so the two orders are not equivalent.
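To see that the two orders really differ, here is a minimal check for a single point. The shared MLP is reduced to one linear map `W` (an assumption for illustration); a max over linearly transformed edges is in general not the same as the linear transform of the max-pooled edges.

```python
import numpy as np

rng = np.random.default_rng(1)
K, F, an = 4, 3, 2
edges = rng.standard_normal((K, F))  # K edge features for one point
W = rng.standard_normal((an, F))     # shared MLP, reduced to a linear map

conv_then_pool = (edges @ W.T).max(axis=0)  # transform each edge, then aggregate
pool_then_conv = W @ edges.max(axis=0)      # aggregate first, then transform

# the two orders disagree for generic inputs
print(np.allclose(conv_then_pool, pool_then_conv))  # False
```

Intuitively, pooling first keeps only the per-channel maximum of the raw edge features, so the MLP never sees which neighbor contributed what; pooling after lets the network learn an edge-wise response and then keep the strongest one.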

Thanks! It is clear now.