Long time consumed during build_score_decoder
MrNeoBlue opened this issue · 4 comments
Hello, first of all, thanks for sharing this wonderful work~
My problem is that when I tried to test the mixformer2_vit_online model on GOT-10k and LaSOT, initialization took tens of minutes on an A100. I monitored the process and found that it was stuck in the function `build_score_decoder` inside `build_mixformer_vit_online`. It seems that only about 10 MB was being loaded to the GPU every 5 or 10 seconds. Is there any solution to this issue?
Also, I'm a little confused about the model versions. Correct me if I'm wrong:

- `mixformer2_vit_online` stands for MixFormerV2-B-256?
- `mixformer2_stu` stands for MixFormerV2-S?
Can you check your test script? It should call the function `build_mixformer2_vit_online` instead of `build_mixformer_vit_online`.
`mixformer2_vit_online` and `mixformer2_stu` do not correspond to MixFormerV2-B-256 and MixFormerV2-S. `mixformer2_stu` is for distillation training, and `mixformer2_vit_online` is for score-head training. The differences between MixFormerV2-B and -S lie in hyperparameters such as image size and model depth, which are set in the configuration files in the `experiments` directory.
`build_mixformer2_vit_online` is the imported function name. I'm sure the called function `build_mixformer_vit_online` is from `lib/models/mixformer2_vit_online.py`. The startup still spends a lot of time in the sub-function `build_score_decoder`.
So you mean that during distillation the score head is frozen, and distillation and pruning are applied only to the MixCvT backbone?
UPDATE:
To be more accurate, the code got stuck at line 27 of `lib/models/mixformer2_vit/head.py`:

```python
with torch.no_grad():
    self.indice = torch.arange(0, feat_sz).unsqueeze(0).cuda() * stride  # (1, feat_sz)
```

Upgrading torch to 1.7.1 with CUDA 11.0 solved my problem.
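For anyone hitting the same hang, a minimal sketch of the same index-precomputation pattern, rewritten to select the device explicitly via `torch.arange`'s `device` argument instead of a separate `.cuda()` call. The `feat_sz` and `stride` values here are illustrative, not taken from the repository's config files:

```python
import torch

# Illustrative values; the real ones come from the experiment config.
feat_sz, stride = 16, 16

# Pick the target device up front; falls back to CPU when CUDA is unavailable,
# so the same code runs everywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

with torch.no_grad():
    # (1, feat_sz) tensor mapping each feature cell to its pixel coordinate.
    indice = torch.arange(0, feat_sz, device=device).unsqueeze(0) * stride

print(indice.shape)  # torch.Size([1, 16])
```

Note this only restructures the tensor creation; the hang reported above was caused by a torch/CUDA version mismatch, and the actual fix was the upgrade to torch 1.7.1 with CUDA 11.0.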