What is the "reference point" for?

Question

Closed this issue 5 years ago · 2 comments

Hello,
I found that "references=[4,8,12,16]" is used to get the relative locations/angles.
Each point corresponds to the face, left hand, right hand, and left foot in the NTU-RGB+D dataset figure.

I'm trying to apply PB-GCN to a COCO-style skeleton (17 keypoints).

Should I use "[0, 7, 4, 13]" for the 17 COCO keypoints to match the NTU-RGB+D-style skeleton?

Also, I would like to know how the weighted sum of the features of each body part is computed using Eq. (16) in your paper.

Is the code below, from "st_gconv_resnet.py", what does that?

        # model
        x = self.head(x)

        for layer in self.layers:
            x = layer(x)

        # V pooling
        x = F.avg_pool2d(x, kernel_size=(1, V))

        # M pooling
        c = x.size(1)
        t = x.size(2)
        x = x.view(N, M, c, t).mean(dim=1).view(N, c, t)

        # T pooling
        x = F.avg_pool1d(x, kernel_size=x.size()[2])

        # C fcn
        x = self.fcn(x)
        x = F.avg_pool1d(x, x.size()[2:])
        x = x.view(N, self.num_class)

Thank you~!

Answer 1 · 2019-06-18T15:42:41.000Z

Hello!

First of all, the points [4, 8, 12, 16] correspond to [left shoulder, right shoulder, left hip, right hip] in the NTURGB+D figure (the joints you've listed are from UTKinect-Action). Note that the joint indices here are 0-indexed, so the actual joint numbers in the NTURGB+D figure are [5, 9, 13, 17].
Please have a look at the citation in my paper from which I take the evidence for these "reference joints".

Hence, for the COCO-style annotations (note that the joint numbers in the figure are already 0-indexed), you can use joints [5, 2, 11, 8] as [left shoulder, right shoulder, left hip, right hip]. This answers your first question.
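
To avoid hard-coding indices for a given skeleton layout, the reference joints can be looked up by name. The sketch below is only an illustration: `reference_indices` and the list names are hypothetical and not part of the repository. It assumes two common joint orderings: the 18-joint OpenPose-style COCO ordering (which includes a neck joint, and which the indices [5, 2, 11, 8] above appear to match) and the standard 17-keypoint COCO annotation ordering. Please verify which ordering your pose estimator actually emits before relying on either list.

```python
# Assumed 18-joint OpenPose-style COCO ordering (0-indexed, includes "neck").
OPENPOSE_COCO_18 = [
    "nose", "neck",
    "right_shoulder", "right_elbow", "right_wrist",
    "left_shoulder", "left_elbow", "left_wrist",
    "right_hip", "right_knee", "right_ankle",
    "left_hip", "left_knee", "left_ankle",
    "right_eye", "left_eye", "right_ear", "left_ear",
]

# Standard 17-keypoint COCO annotation ordering (0-indexed, no "neck").
COCO_17 = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# The four reference joints named in the answer, in the same order.
REFERENCE_JOINTS = ["left_shoulder", "right_shoulder", "left_hip", "right_hip"]


def reference_indices(joint_names):
    """Return the 0-indexed positions of the reference joints in a layout."""
    return [joint_names.index(name) for name in REFERENCE_JOINTS]


def to_zero_indexed(joint_numbers):
    """Convert 1-indexed joint numbers (as drawn in a figure) to 0-indexed."""
    return [n - 1 for n in joint_numbers]


print(reference_indices(OPENPOSE_COCO_18))  # [5, 2, 11, 8]
print(reference_indices(COCO_17))           # [5, 6, 11, 12]
print(to_zero_indexed([5, 9, 13, 17]))      # [4, 8, 12, 16]
```

If your data follows the 17-keypoint ordering assumed here, the same four joints would sit at [5, 6, 11, 12] rather than [5, 2, 11, 8].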

As for your second question, you need to look at the script st_residual_unit.py.
The aggregation function that I've used is simply the addition of responses at overlapping nodes (as I've written in the paper). To understand the entire process:
1. You can see on line 30 that I calculate how many partitions the graph has, which is equal to the number of parts.
2. I create N convolution kernels (N = number of parts calculated in step 1), as seen on line 45.
3. In the forward call, I calculate the responses for the nodes in each part using the convolution kernel for that part, and these responses are "added" together, which means that the responses for nodes shared between parts are summed. You can see this in the loop starting on line 87.

If you want to use a different aggregation function F instead of addition, apply F to the two values, e.g. y = F(y, conv_list[i](xa)). In my case, F(y, conv_list[i](xa)) = y + conv_list[i](xa), as seen on line 93. This answers your second question.
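
The aggregation described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the code from st_residual_unit.py: each node carries a single scalar feature, each part's "kernel" is an ordinary function standing in for a convolution, and `aggregate_parts`, `parts`, `part_fns` and `combine` are hypothetical names introduced only for this example.

```python
import operator


def aggregate_parts(x, parts, part_fns, combine=operator.add):
    """Apply one function per part and merge responses at shared nodes.

    x        : list of per-node features (one scalar per node here).
    parts    : list of node-index lists, one per body part (may overlap).
    part_fns : one callable per part, standing in for that part's kernel.
    combine  : aggregation function F(y, response); addition by default.
    """
    out = {}
    for nodes, fn in zip(parts, part_fns):
        for v in nodes:
            response = fn(x[v])
            # The first response for a node is stored as-is; later ones are
            # merged with F, so nodes shared between parts get aggregated.
            out[v] = combine(out[v], response) if v in out else response
    # Nodes that belong to no part get a zero response.
    return [out.get(v, 0.0) for v in range(len(x))]


x = [1.0, 2.0, 3.0]
parts = [[0, 1], [1, 2]]                       # node 1 is shared by both parts
part_fns = [lambda v: 2 * v, lambda v: 10 * v]

print(aggregate_parts(x, parts, part_fns))               # [2.0, 24.0, 30.0]
print(aggregate_parts(x, parts, part_fns, combine=max))  # [2.0, 20.0, 30.0]
```

With the default `combine`, the shared node receives the sum of both parts' responses (4.0 + 20.0); passing `max` instead shows how a different F would change only the value at the overlapping node.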

Answer 2 · 2019-07-10T14:03:15.000Z

Closing this issue on the assumption that your query has been resolved.
Feel free to reopen it if need be.