different number of train samples in prepared json

Question

nitba opened this issue 6 years ago · 3 comments

I produced a JSON file for MPII using the CPM code, and I noticed that we have a different number of training samples but the same number of validation samples. Why does this happen? I also noticed that you have an extra parameter, loc2, to check img_index. And why do we have this part: find(sum(~bsxfun(@minus, tompson_i_p, [i;p]))==2, 1)? Should the difference be equal to two? I would appreciate your comments.

Answer 1 · 2019-02-11T17:45:28.000Z

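find(sum(~bsxfun(@minus, tompson_i_p, [i;p]))==2, 1) finds the validation image and the validation person index in that image (one image may contain multiple persons). The ==2 is not a difference of two: it counts matching entries, so a column of tompson_i_p is selected only when both the image index and the person index equal [i;p].
I use an additional check (loc2) to make sure that no image appears in both the training set and the validation set.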
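
To make the matching logic concrete, here is a minimal Python sketch of what the MATLAB expression computes. It assumes tompson_i_p is a 2-by-N array whose first row holds image indices and whose second row holds person indices. The function names and the toy data are hypothetical, and the second function is only a guess at what the loc2 check does (testing whether the image index appears anywhere in the validation list), not the actual code.

```python
def first_matching_column(i_p, i, p):
    """Return the 0-based index of the first column equal to (i, p), or None.

    Mirrors find(sum(~bsxfun(@minus, i_p, [i;p]))==2, 1), except that
    MATLAB would return a 1-based index (or an empty result).
    """
    image_row, person_row = i_p
    for col, (img, person) in enumerate(zip(image_row, person_row)):
        # The subtraction yields 0 where an entry matches, and ~ turns 0 into true.
        # Summing the two rows therefore counts the matches in this column.
        matches = int(img - i == 0) + int(person - p == 0)
        if matches == 2:  # both image index and person index match
            return col
    return None


def image_in_validation(i_p, i):
    """Assumed meaning of the loc2 check: is image i used by any validation sample?"""
    return i in i_p[0]


# Hypothetical validation list: image 100 (persons 1 and 2) and image 205 (person 1).
toy_i_p = [[100, 100, 205],
           [1, 2, 1]]

print(first_matching_column(toy_i_p, 100, 2))  # column for image 100, person 2
print(image_in_validation(toy_i_p, 101))       # image 101 is not a validation image
```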

The statistics should be as follows:

	CPM	Ours
Training samples	25925	22246
Validation samples	2958	2958

As you can see, CPM may use the same image for both training and validation (e.g., image 100, person 1 for training and image 100, person 2 for validation). I cannot say this is bad, but I want to run experiments on cleaner train/validation splits.
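
The table above implies that 25925 - 22246 = 3679 training samples are dropped by the extra check, while the validation set is unchanged. The following Python sketch illustrates the idea on made-up data. The function name, the drop_overlap flag, and the sample lists are assumptions for illustration, not the actual MATLAB code.

```python
def split_samples(samples, val_pairs, drop_overlap):
    """Split (image, person) samples into training and validation lists.

    With drop_overlap=False, only exact (image, person) validation pairs are
    held out, so other persons in a validation image stay in training.
    With drop_overlap=True, every sample from a validation image is kept out
    of the training set.
    """
    val_set = set(val_pairs)
    val_images = {img for img, _ in val_pairs}
    train, valid = [], []
    for img, person in samples:
        if (img, person) in val_set:
            valid.append((img, person))
        elif drop_overlap and img in val_images:
            continue  # same image as a validation sample: skip it
        else:
            train.append((img, person))
    return train, valid


# Hypothetical data: image 100 has two persons, one of them used for validation.
toy_samples = [(100, 1), (100, 2), (101, 1), (102, 1), (102, 2)]
toy_val_pairs = [(100, 2)]

overlap_train, overlap_valid = split_samples(toy_samples, toy_val_pairs, drop_overlap=False)
clean_train, clean_valid = split_samples(toy_samples, toy_val_pairs, drop_overlap=True)

print(len(overlap_train), len(overlap_valid))
print(len(clean_train), len(clean_valid))
```

In this toy example, only the training count changes between the two splits, which matches the pattern in the table.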

Answer 2 · 2019-02-12T03:18:44.000Z

Thanks @bearpaw
Would you please provide a JSON file for the MPII test set, so I can check its number of samples?

Answer 3 · 2019-02-14T01:15:02.000Z

I will try to generate it when I have some spare time. In the meantime, you can modify the MATLAB code and generate it yourself.