showlab/EgoVLP

EgoVLP_FT_EPIC* checkpoint of a model trained on another task or with another architecture

iranroman opened this issue · 5 comments

Hello EgoVLP,

Thank you very much for sharing your work with the broad community.

I'm interested in using the model you submitted to the EPIC-Kitchens Multi-Instance Retrieval challenge. I'll use it as a starting point for further research we are doing in our lab!

I was able to use the code and load the model provided here as EgoVLP_FT_EPIC* (the file name is epic_mir_plus.pth). However, I got the following message:

- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
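For what it's worth, this warning is emitted when Hugging Face Transformers compares the parameter names in a checkpoint against the parameter names the model class expects, and finds entries on either side that do not line up. A minimal stdlib sketch of that comparison, using made-up parameter names (the real EgoVLP/DistilBERT key names differ):

```python
def diff_state_dicts(model_keys, ckpt_keys):
    """Return (missing, unexpected) key lists, mirroring what a
    state-dict load reports when names do not match exactly."""
    missing = sorted(set(model_keys) - set(ckpt_keys))      # model has, checkpoint lacks
    unexpected = sorted(set(ckpt_keys) - set(model_keys))   # checkpoint has, model lacks
    return missing, unexpected

# Hypothetical example: the text backbone expects plain DistilBertModel
# weights, while a fine-tuned checkpoint may also carry task-specific heads.
model_keys = [
    "embeddings.word_embeddings.weight",
    "transformer.layer.0.attention.q_lin.weight",
]
ckpt_keys = [
    "embeddings.word_embeddings.weight",
    "vocab_projector.weight",  # extra head not present in the bare model
]

missing, unexpected = diff_state_dicts(model_keys, ckpt_keys)
print(missing)     # names only the model declares
print(unexpected)  # names only the checkpoint declares
```

Any mismatch triggers the message, even when it is harmless, which is why the maintainers' answer below is that the warning can be ignored for this checkpoint.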

Does this message mean that I have to fine-tune the model to obtain the numbers in your README's table with Epic-Kitchens results?

Thank you!

Hi, @iranroman
I think you do not need to worry about this message. You do not need to fine-tune after loading this checkpoint; direct inference should be able to reproduce the numbers.
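As a side note for anyone checking their reproduced numbers: the EPIC-Kitchens MIR benchmark reports rank-based metrics such as nDCG. A minimal stdlib sketch of nDCG for a single query, with made-up relevance scores (the official challenge evaluation code is the authoritative implementation):

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: each relevance score is discounted
    # by log2 of its (1-based) rank, so early hits count more.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    # Normalize by the DCG of the ideal (descending-relevance) ordering,
    # so a perfect ranking scores 1.0.
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

# Made-up relevance scores of retrieved videos for one text query,
# listed in the order the model ranked them.
score = ndcg([0.5, 1.0, 0.0, 0.2])
print(round(score, 4))
```

Averaging this per-query score over all queries (in both text-to-video and video-to-text directions) gives the kind of aggregate number reported in the README table.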

Hi @QinghongLin, can you also provide the config file for this checkpoint training? It would be nice to reproduce this performance and that would help others working on this challenge. Thanks in advance.

Hi @thechargedneutron ,
Currently, this checkpoint should reproduce the reported performance directly with inference; no additional fine-tuning is needed.
We plan to re-organize our codebase, including the configs, and I will include your request.

@QinghongLin I was indeed able to reproduce the results using the checkpoint. I got the exact same numbers, so that is very reassuring. Thanks for clarifying.

@thechargedneutron I would be curious to hear whether you have tried to reproduce the fine-tuning routine with the current codebase. If you tried but were not able to reproduce the fine-tuning results, which decisions did you have to make that could have led to different numbers? Any insight is appreciated, as I'm also working on reproducing the fine-tuning routine on EPIC-Kitchens.

I am able to fine-tune and obtain very similar numbers using configs/ft/epic.json. What problem are you facing when trying to reproduce the results? The only caveat on my end is that I am not using an SSD, so training is slower.