una-dinosauria/3d-pose-baseline

Fine-tuning SH procedure question

maf2418 opened this issue · 6 comments

Hi, I am trying to reproduce an element of your research. Looking at the fine-tuned SH data you provide, I notice that the mean pixel error of the SH detections for subject 11 is lower than the error on the training set, while subject 9 is about 4 pixels worse. I am slightly surprised that S11 does so well, so I wanted to confirm whether the fine-tuning was done on just the training subjects or on the full dataset. (Normally I would expect the former, but the latter might also make sense since the paper focuses on lifting accuracy.) Clarifying that would help my understanding.
cheers

Thanks for your question. We only measured these errors during validation, and we used the same train/test partitions for fine-tuning SH as for the 2d->3d part.
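
For reference, a minimal sketch of the standard Human3.6M split (the constant names here are just illustrative, but the subject IDs match the protocol used for both the SH fine-tuning and the 2d->3d network):

```python
# Standard Human3.6M protocol: subjects used for training vs. evaluation.
# The same partition is used for fine-tuning SH and for the 2d->3d lifting network.
TRAIN_SUBJECTS = [1, 5, 6, 7, 8]
TEST_SUBJECTS = [9, 11]
```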

I'm a bit confused by your question though. As far as I remember, the test set consists of subjects 9 and 11, so what exactly is surprising about one being better and one worse? In my head that sounds pretty intuitive.

Sorry, bad typo on my part: I meant S11 has smaller errors than the training set (I mistakenly wrote test set; sorry, it was late! I will edit my question so it does not confuse later readers).
Thanks for clarifying what you did. When I tried fine-tuning, my training detections ended up much more accurate than my validation detections, which creates problems when doing the lifting. I suspect I overfit the training data during fine-tuning, whereas your detector seems to do nearly as well on S9/S11 as on the training subjects.
I did not follow your exact methodology, though, so I will try fine-tuning again. Thanks again for the prompt clarification. cheers

You may find #20 (comment) useful to reproduce our SH fine-tuning.

Cheers,

Just coming back a week later to report that I resolved the problem. Boneheaded mistake on my part! I thought your provided "StackedHourglass" dataset was the fine-tuned set, so I was looking at the wrong data... I did not realize you also provide a separate "StackedHourglassFineTuned240" set. Obviously the plain stacked hourglass detections should show comparable errors on the training and validation subjects, since that network was never tuned on Human3.6M data, and, as I expected, the fine-tuned set does better on training than on validation.
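
For anyone double-checking the same thing, here is a minimal sketch (with placeholder arrays standing in for real data) of the per-subject pixel-error comparison; swap in the actual detections loaded from the "StackedHourglass" and "StackedHourglassFineTuned240" folders and the ground-truth 2D projections:

```python
import numpy as np

def mean_pixel_error(pred_2d, gt_2d):
    """Mean Euclidean distance (in pixels) between detected and ground-truth 2D joints.
    Both arrays have shape (n_frames, n_joints, 2)."""
    return float(np.linalg.norm(pred_2d - gt_2d, axis=-1).mean())

# Placeholder data; replace with detections read from the provided .h5 files
# and the ground-truth 2D joint positions from the Human3.6M annotations.
subjects = [1, 5, 6, 7, 8, 9, 11]
rng = np.random.default_rng(0)
gt = {s: rng.uniform(0, 1000, (100, 16, 2)) for s in subjects}         # ground truth
sh = {s: gt[s] + rng.normal(0, 15, (100, 16, 2)) for s in subjects}    # vanilla SH
sh_ft = {s: gt[s] + rng.normal(0, 8, (100, 16, 2)) for s in subjects}  # fine-tuned SH

for name, det in [("SH", sh), ("SH fine-tuned", sh_ft)]:
    for s in subjects:
        print(f"{name:14s} S{s:<2d}: {mean_pixel_error(det[s], gt[s]):6.2f} px")
```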

Just reporting back out of politeness, so as not to confuse anybody else in the future.
cheers,
Martin

Oh thanks so much for clearing that up! :)

Hi maf2418, I have also applied several times for the fine-tuned SH data at "https://drive.google.com/open?id=0BxWzojlLp259S2FuUXJ6aUNxZkE", but never got a reply. Could you share the data with me? Or could you please tell me what to write when applying? Thank you in advance.