Question about the recent push enabling partial fine-tuning.
Hello! In the latest push, I noticed a change to partial fine-tuning: only parameters whose names contain "temporal_transformer_block" are trained. Is there any source or reference explaining why we should do that? Thank you so much for your attention and participation.
Hi, in the initial version I was training the entire UNet, which wasn't a very suitable choice, since most people would run into out-of-memory (OOM) issues. In many fine-tuning scenarios, the goal is for the model to learn a specific motion, so it makes intuitive sense to unlock only the time-related blocks.
Of course, this isn't set in stone and needs to be modified according to the specific task at hand.
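For anyone landing here later, a minimal sketch of how this kind of partial fine-tuning is typically wired up in PyTorch. The `unet` variable, the helper name, and the optimizer settings are placeholders for illustration, not the repo's actual code; the only detail taken from this thread is the "temporal_transformer_block" name filter:

```python
# Minimal sketch, assuming a PyTorch UNet whose temporal layers carry
# "temporal_transformer_block" in their parameter names (per the thread above).
import torch

def collect_temporal_params(unet: torch.nn.Module):
    """Freeze everything except temporal blocks; return the trainable params."""
    trainable = []
    for name, param in unet.named_parameters():
        if "temporal_transformer_block" in name:
            param.requires_grad = True
            trainable.append(param)
        else:
            # Frozen: no gradients are stored and no optimizer state is allocated.
            param.requires_grad = False
    return trainable

# Hypothetical usage: only temporal params are handed to the optimizer, so
# Adam-style moment buffers cover a small fraction of the UNet, which is
# where most of the memory saving comes from.
# optimizer = torch.optim.AdamW(collect_temporal_params(unet), lr=1e-5)
```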
I understand now. I think this is a very smart step, because it drastically reduces the memory required. Thanks!