backbone pretraining details
I had some questions about the pretraining done prior to the end-to-end training for the algebraic triangulation. From the paper I saw that the backbone was first trained on the COCO dataset and then finetuned on MPII+Human3.6M (I assume MPII+CMU for the CMU dataset). Was the pretraining done with the softmax + soft-argmax combo and a soft MSE on the coordinates, or, as in other pose papers, with MSE directly on the heatmaps? (Rough sketch of both options below.)
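For concreteness, here is a minimal PyTorch sketch of the two loss formulations I'm contrasting. This is my own illustration, not code from this repo; `soft_argmax_2d`, `coord_mse_loss`, and `heatmap_mse_loss` are hypothetical names:

```python
import torch
import torch.nn.functional as F

def soft_argmax_2d(heatmaps):
    # heatmaps: (B, J, H, W) raw logits -> (B, J, 2) expected (x, y) coords.
    b, j, h, w = heatmaps.shape
    probs = F.softmax(heatmaps.view(b, j, -1), dim=-1).view(b, j, h, w)
    xs = torch.arange(w, dtype=probs.dtype, device=probs.device)
    ys = torch.arange(h, dtype=probs.dtype, device=probs.device)
    x = (probs.sum(dim=2) * xs).sum(dim=-1)  # marginalize over rows, take E[x]
    y = (probs.sum(dim=3) * ys).sum(dim=-1)  # marginalize over cols, take E[y]
    return torch.stack([x, y], dim=-1)

# Option A: softmax + soft-argmax, then "soft" MSE on the 2D coordinates.
def coord_mse_loss(heatmaps, gt_xy):
    return F.mse_loss(soft_argmax_2d(heatmaps), gt_xy)

# Option B ("simple baselines" style): plain MSE between predicted and
# ground-truth Gaussian heatmaps.
def heatmap_mse_loss(pred_heatmaps, gt_heatmaps):
    return F.mse_loss(pred_heatmaps, gt_heatmaps)
```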
Thanks!
Hi @pablovela5620,
For CMU, a pretrained model from the "simple baselines" paper was used without any fine-tuning.
As far as I remember, for Human3.6M we used the standard MSE loss (also from "simple baselines") on the 16+17 predicted heatmaps (MPII's 16 joints plus Human3.6M's 17).
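To make that concrete, here is a minimal sketch of the Gaussian targets that such a heatmap-MSE loss typically compares against. The grid size and sigma are my assumptions (roughly the usual "simple baselines" defaults), not values confirmed in this thread:

```python
import torch

def render_gaussian_target(xy, height=64, width=64, sigma=2.0):
    # Ground-truth heatmap for one joint: a 2D Gaussian centered at the
    # joint's (x, y) location on the heatmap grid. sigma=2 on a ~64px grid
    # is the common "simple baselines" setting (an assumption here).
    ys, xs = torch.meshgrid(torch.arange(height, dtype=torch.float32),
                            torch.arange(width, dtype=torch.float32),
                            indexing="ij")
    d2 = (xs - xy[0]) ** 2 + (ys - xy[1]) ** 2
    return torch.exp(-d2 / (2.0 * sigma ** 2))
```

The standard loss is then just `F.mse_loss(pred_heatmap, render_gaussian_target(joint_xy))`, summed or averaged over joints.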
Perfect, thank you for the response. Did you train the simple baselines models yourselves, or did you use the publicly available pretrained models from https://github.com/microsoft/human-pose-estimation.pytorch?
If you did train them yourselves, roughly how long did the training take, and on what hardware (multiple 1080 Tis? V100s?)? You previously stated that the volumetric model took about 5 days to train; I'm assuming that does not include pretraining the backbone.
Lastly, why the choice of simple baselines over HRNet?
We used the public pretrained models.
We finetuned the 2D models on whatever GPUs were available (2080 Tis or 1080 Tis). As far as I remember, a few epochs of finetuning were enough, so training took hours rather than days.
HRNet was released in the midst of our experiments (not long before the ICCV 2019 deadline), so after some internal discussion, we decided to stick with simple baselines.
Gotcha, appreciate the information. Thanks