bearpaw/pytorch-pose

different number of train samples in prepared json

nitba opened this issue · 3 comments

nitba commented

Hi @bearpaw ,

I produced the JSON file for MPII using the CPM code, and I noticed that we get a different number of training samples but the same number of validation samples. Why does this happen? I also noticed that you have an extra parameter, loc2, to check img_index. And why do we have this part: find(sum(~bsxfun(@minus, tompson_i_p, [i;p]))==2, 1)? Should the difference be equal to two? I would appreciate your comments.

bearpaw commented

  1. find(sum(~bsxfun(@minus, tompson_i_p, [i;p]))==2, 1) finds the validation image and the index of the validation person in that image (one image may contain multiple persons). The ==2 does not test for a difference of two: it requires both rows, the image index and the person index, to match [i;p].

  2. I use an additional check (loc2) to make sure that no image is shared between the training set and the validation set.
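
To make the first point concrete, here is a minimal Python sketch of what the MATLAB expression computes. It assumes tompson_i_p is a 2×N matrix whose first row holds image indices and whose second row holds person indices; the function name find_first_match and the sample values are hypothetical and are not taken from the repository.

```python
def find_first_match(tompson_i_p, i, p):
    """Return the 1-based column whose (image, person) pair equals (i, p).

    Mirrors find(sum(~bsxfun(@minus, tompson_i_p, [i;p]))==2, 1):
    subtract [i;p] from every column, mark zero differences, and keep
    the first column in which both rows are marked.
    Returns None where MATLAB would return an empty result.
    """
    image_row, person_row = tompson_i_p
    for col, (img, person) in enumerate(zip(image_row, person_row), start=1):
        # ~(difference) is 1 when the difference is zero, i.e. the row matches.
        matches = int(img - i == 0) + int(person - p == 0)
        if matches == 2:  # both the image index and the person index match
            return col
    return None


# Hypothetical validation list: image 100 (persons 1 and 2), image 205 (person 1).
tompson_i_p = [[100, 100, 205],
               [1, 2, 1]]
print(find_first_match(tompson_i_p, 100, 2))
print(find_first_match(tompson_i_p, 205, 2))
```

The first call matches the second column. The second call matches only the image row, so the count is 1 rather than 2 and nothing is returned.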

The statistics should be as follows:

|                    | CPM   | Ours  |
|--------------------|-------|-------|
| Training samples   | 25925 | 22246 |
| Validation samples | 2958  | 2958  |

As you can see, CPM may use the same image for both training and validation (e.g., image 100, person 1 for training and image 100, person 2 for validation). I am not saying this is wrong, but I want to run experiments on a cleaner train/validation split.
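
The effect of the two splitting rules can be illustrated with a small Python sketch. This is not the repository's MATLAB code: the function split_samples, the image_level flag, and the sample data are all hypothetical, and the sketch only assumes that each sample is an (image index, person index) pair.

```python
def split_samples(samples, val_pairs, image_level=True):
    """Split (image, person) samples into training and validation lists.

    image_level=False: only the exact validation pairs are held out, so
        other persons from a validation image may end up in training.
    image_level=True: every sample from an image that appears in the
        validation list is also kept out of training.
    """
    val_pairs = set(val_pairs)
    val_images = {img for img, _ in val_pairs}
    train, val = [], []
    for img, person in samples:
        if (img, person) in val_pairs:
            val.append((img, person))
        elif image_level and img in val_images:
            continue  # same image as a validation sample: drop from training
        else:
            train.append((img, person))
    return train, val


# Hypothetical data: image 100 has two persons, and person 2 is used for validation.
samples = [(100, 1), (100, 2), (101, 1), (102, 1), (102, 2)]
val_pairs = [(100, 2)]

person_train, person_val = split_samples(samples, val_pairs, image_level=False)
image_train, image_val = split_samples(samples, val_pairs, image_level=True)
print(len(person_train), len(person_val))
print(len(image_train), len(image_val))
```

Both rules yield the same validation list, but the image-level rule yields fewer training samples because (100, 1) is dropped. This is the same pattern as in the statistics above.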

nitba commented

Thanks @bearpaw
Would you please provide the JSON file for the MPII test set so I can check its number of samples?

bearpaw commented

I will try to generate it when I have spare time. In the meantime, you can modify the MATLAB code and generate it yourself.