GengDavid/pytorch-cpn

Other joints

ttdd11 opened this issue · 4 comments

Awesome repo!

I've trained using ground truth boxes and it works okay, but the AP is just lower.

Do you think that the detection box should affect training, seeing as it's increased so significantly in the image preprocessing?

Do you think this network architecture needs to be improved if I'm using more joints than just COCO?

I'm sorry, I'm not sure exactly what you mean by "Do you think that the detection box should affect training, seeing as it's increased so significantly in the image preprocessing?" Do you mean training with detection bboxes?

As for the other question, I think it depends on the difficulty of your data rather than just the number of joints.

@GengDavid thanks for the reply.

Sorry my first question wasn't clear. When using the bboxes from the COCO set, how important do you think it is to use the ground truth values, seeing as the box size is increased by 25 percent during training? The increased size makes multiple people visible in some images, which makes me think that its accuracy isn't too crucial.

In terms of the joints, I think my question is more of a training question. How much more training do you think is required for additional joints? Do you think the addition of other joints would significantly affect the mAP on the COCO set?

Thanks for your help!

Another training question. You guys used the 2017 training set, whereas the paper used the 2014 set. Do you imagine there will be any differences as a result of the year, or probably not? Perhaps in terms of the amount of training required, as the 2014 set is smaller.

The bbox is enlarged because it cannot always cover all keypoints of a single person, since it is labeled for the detection task, not the keypoint task.
For the second question, I still think it depends on the distribution of your data. It's hard to say how much more training is needed. Adding some additional joints on COCO may have some effect, but it also depends on how important each joint is.
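
To illustrate the point about enlarging the box, here is a minimal sketch, not the code used in this repo. It assumes COCO-style boxes `[x, y, w, h]` and COCO-style keypoints as a flat list `[x1, y1, v1, x2, y2, v2, ...]` where `v > 0` means the keypoint is labeled. The helper names `enlarge_bbox` and `keypoints_outside` are hypothetical, and reading "increased by 25 percent" as scaling width and height by 1.25 about the box center is also an assumption.

```python
def enlarge_bbox(bbox, img_w, img_h, scale=1.25):
    """Scale a COCO-style [x, y, w, h] box about its center, then clip to the image.

    scale=1.25 is an assumed reading of "increased by 25 percent".
    """
    x, y, w, h = bbox
    cx, cy = x + w / 2.0, y + h / 2.0
    new_w, new_h = w * scale, h * scale
    # Clip the enlarged corners so the box stays inside the image.
    x0 = max(0.0, cx - new_w / 2.0)
    y0 = max(0.0, cy - new_h / 2.0)
    x1 = min(float(img_w), cx + new_w / 2.0)
    y1 = min(float(img_h), cy + new_h / 2.0)
    return [x0, y0, x1 - x0, y1 - y0]


def keypoints_outside(bbox, keypoints):
    """Count labeled keypoints (v > 0) that fall outside a [x, y, w, h] box."""
    x, y, w, h = bbox
    count = 0
    # Keypoints come as flat (x, y, v) triples.
    for i in range(0, len(keypoints), 3):
        kx, ky, v = keypoints[i:i + 3]
        if v > 0 and not (x <= kx <= x + w and y <= ky <= y + h):
            count += 1
    return count


if __name__ == "__main__":
    box = [100, 100, 80, 160]
    kps = [95, 120, 2, 150, 150, 2, 0, 0, 0]  # the last triple is unlabeled
    print(keypoints_outside(box, kps))
    print(keypoints_outside(enlarge_bbox(box, 640, 480), kps))
```

In this toy case a keypoint just outside the detection box ends up inside the enlarged one, which is the effect described above. The cost is that a larger crop can also include neighboring people.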

The training data set is almost the same as in the original implementation (maybe only one or two images are removed in the original implementation).