princeton-vl/pose-hg-demo

What is the difference between valid-ours.h5 and valid-example.h5?

Opened this issue · 5 comments

Hi, I noticed that the results in valid-ours.h5 are better than those in valid-example.h5. Would you please tell me the difference between these two files? Thank you.

Moreover, I found that the released results on the validation set are slightly worse than the results on the test set:

| Method | Head | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Mean |
|---|---|---|---|---|---|---|---|---|
| valid-example | 95.80 | 94.21 | 87.40 | 82.75 | 86.03 | 81.83 | 78.32 | 86.76 |
| valid-ours | 95.94 | 94.68 | 88.53 | 83.38 | 87.48 | 83.09 | 79.05 | 87.56 |

Is the released model different from the model in your arXiv paper? Thanks in advance!

valid-ours.h5 is currently a bit outdated, as is the released model. That said, I generally found validation performance to be around 2% worse than test-set performance.

valid-example.h5 is the file that the validation demo code writes to, so if someone runs it with a different model, the name distinguishes their output from our baseline performance. I just put an arbitrary set of predictions there as filler so that, if the evaluation code were run, it would show different curves. I didn't think much about it at the time, and in hindsight I realize that might be a bit confusing; sorry about that!
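For anyone inspecting these prediction files directly, a minimal Python sketch for loading them is below. Note the dataset key `'preds'` and the `(N, joints, 2)` layout are assumptions based on typical conventions for this repo, not something confirmed in this thread; check `f.keys()` against your own file.

```python
import h5py
import numpy as np

def load_preds(path, key='preds'):
    """Load a joint-prediction array from an .h5 file.

    Assumes the predictions live under the dataset key 'preds'
    (an assumption -- inspect f.keys() if your file differs).
    """
    with h5py.File(path, 'r') as f:
        return np.array(f[key])

# Hypothetical usage, comparing the two released prediction sets:
# example = load_preds('preds/valid-example.h5')
# ours    = load_preds('preds/valid-ours.h5')
# print(np.linalg.norm(example - ours, axis=2).mean())
```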

@anewell Can you please clarify which model architecture was used to generate valid-ours.h5? Was it the 8-stack hourglass used to produce Table 2 in your paper? I wish to use these results as a reference point during development.

Hi, valid-ours.h5 is from an old version of the model, and does not correspond to the output of the 8-stack network. Sorry for any confusion that might cause. You can download the 8-stack model though and run the evaluation to see how it does.
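For reference when re-running the evaluation, the numbers in the table above are PCKh scores: a predicted joint counts as correct when its distance to the ground truth is within a threshold (0.5 by default) times the head segment length. A minimal sketch with synthetic data follows; the array shapes and variable names are illustrative assumptions, not the repo's actual evaluation code.

```python
import numpy as np

def pckh(preds, gt, head_sizes, thr=0.5):
    """Per-joint PCKh in percent.

    preds, gt: (N, K, 2) arrays of (x, y) joint locations.
    head_sizes: (N,) head segment lengths used for normalization.
    """
    dists = np.linalg.norm(preds - gt, axis=2)           # (N, K)
    norm = head_sizes[:, None]                           # (N, 1)
    return 100.0 * np.mean(dists / norm <= thr, axis=0)  # (K,) percentages

# Tiny synthetic check: one sample, two joints, head size 10.
gt = np.array([[[0.0, 0.0], [10.0, 10.0]]])
preds = np.array([[[3.0, 4.0], [10.0, 18.0]]])  # distances: 5.0 and 8.0
head = np.array([10.0])
print(pckh(preds, gt, head))  # joint 0 within 0.5 * 10, joint 1 not
```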

Thanks for your reply. I've found a small bug in the evaluation code (see #14), but other than that I am able to generate validation set predictions just fine.

Would you consider accepting a PR to add the validation set predictions to the preds/ folder (e.g., as valid-hg8.h5, or perhaps replacing valid-ours.h5)? It would be very helpful to have an "official" validation prediction set for your highly influential model, and it would clear up confusion for people who wrongly believe that valid-ours.h5 represents your peak performance.