rkosti/emotic

Emotic CNN training setup question


Hello Ronak,

Thank you for providing the code for this project.

I have a small question about the training setup.

In the opts file, the 'nEpochs' parameter defaults to 14 epochs (https://github.com/rkosti/emotic/blob/master/opts.lua#L22), whereas the main file has a comment saying 'uncomment to use different nEpochs other than 21' (https://github.com/rkosti/emotic/blob/master/main.lua#L14). For how many epochs did you train the models?

Also, I got slightly confused while reading through the materials and need a small clarification. In your thesis, you mention that a pretrained AlexNet is used for extracting person/body features. In contrast, in the papers (PAMI, CVPR), you indicate that both feature extraction modules are based on DecomposeMe.
Could you clarify which feature extraction modules were used?

Regards
Abhishek

Dear @Tandon-A, thanks for pointing this out. I trained for 21 epochs (I let the models overfit, then used a combination of stopping criteria to choose the best epoch).
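In outline, that strategy looks roughly like the following (a minimal PyTorch sketch, not the repo's actual Torch/Lua code; the SGD settings and cross-entropy loss are placeholders, since EMOTIC actually combines a categorical and a continuous loss):

```python
import copy
import torch
import torch.nn as nn

def fit(model, train_loader, val_loader, n_epochs=21, lr=1e-3):
    """Train for a fixed budget and restore the checkpoint with the
    lowest validation loss: overfit, then pick the best epoch."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()  # placeholder loss
    best_loss, best_state, best_epoch = float("inf"), None, -1
    for epoch in range(n_epochs):
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader)
        if val_loss < best_loss:  # simple stopping criterion on validation loss
            best_loss, best_epoch = val_loss, epoch
            best_state = copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)  # roll back to the best epoch
    return model, best_epoch
```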

DecomposeMe has a design similar to AlexNet's (albeit with 1D convolution kernels) and is trained on ImageNet and Places.
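The core idea is to replace each k x k convolution with a k x 1 convolution followed by a 1 x k convolution, with a nonlinearity in between. A rough PyTorch illustration (the channel counts mirror AlexNet's first layer and are not taken from the actual DecomposeMe architecture):

```python
import torch
import torch.nn as nn

class Decomposed2d(nn.Module):
    """One k x k convolution replaced by a vertical (k x 1) and a
    horizontal (1 x k) convolution with a ReLU in between, in the
    spirit of DecomposeMe's 1D kernels."""
    def __init__(self, in_ch, mid_ch, out_ch, k, stride=1):
        super().__init__()
        pad = k // 2
        self.vertical = nn.Conv2d(in_ch, mid_ch, (k, 1),
                                  stride=(stride, 1), padding=(pad, 0))
        self.horizontal = nn.Conv2d(mid_ch, out_ch, (1, k),
                                    stride=(1, stride), padding=(0, pad))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.horizontal(self.act(self.vertical(x))))

# Example: a decomposed stand-in for an AlexNet-style 11x11, stride-4 first layer.
layer = Decomposed2d(3, 64, 64, k=11, stride=4)
out = layer(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 64, 56, 56])
```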

We used DecomposeMe-based models for the CVPR and PAMI submissions (for both person and image features). After that, I switched to AlexNet for the person features, because I could not stabilize the DecomposeMe models' training within the fusion-model design. Since I wrote my thesis after the PAMI submission, the thesis reflects this change.

I would recommend using a standard ResNet pretrained on ImageNet for the person features and another ResNet pretrained on the Places dataset for the image features. We already get better performance with these models after tweaking some training parameters.
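As a sketch of what I mean (illustrative PyTorch, not our actual code; the ResNet-18 depth and head sizes are my own choices here, and torchvision ships ImageNet weights but not Places weights, so the context branch would need weights loaded separately, e.g. from the CSAILVision places365 release):

```python
import torch
import torch.nn as nn
from torchvision import models

class TwoBranchFusion(nn.Module):
    """Two-branch model: one ResNet-18 pretrained on ImageNet for the
    person crop, a second ResNet-18 for the whole image (load Places
    weights into it separately), and a small fusion head producing
    EMOTIC's 26 discrete categories and 3 continuous dimensions."""
    def __init__(self, n_categories=26, n_continuous=3):
        super().__init__()
        self.body = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        self.context = models.resnet18(weights=None)  # load Places weights here
        self.body.fc = nn.Identity()     # keep the 512-dim features
        self.context.fc = nn.Identity()
        self.fc_cat = nn.Linear(512 * 2, n_categories)   # discrete emotions
        self.fc_cont = nn.Linear(512 * 2, n_continuous)  # valence/arousal/dominance

    def forward(self, person, image):
        feats = torch.cat([self.body(person), self.context(image)], dim=1)
        return self.fc_cat(feats), self.fc_cont(feats)

# Usage: person crop and full image go through separate branches.
model = TwoBranchFusion()
cat, cont = model(torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224))
```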

I hope this helps.