Changed the code to use multiple GPUs, but training does not speed up.
Opened this issue · 2 comments
I used DataParallel to start multi-GPU training and increased the batch size in config.yaml, but training did not speed up.
I've also tried multiple GPUs and had the same issue as you.
I found that the biggest overhead is in the GE2ELoss part, especially computing the cosine similarity matrix and calculating the loss.
Just copy and paste this code into your utils.py.
I don't know why the author hasn't merged this code yet, but it is much faster than the original code.
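The linked code isn't shown here, but the general fix is to replace per-utterance Python loops with batched tensor operations when building the GE2E similarity matrix. The sketch below is a hypothetical vectorized version (not the author's exact code): it computes each utterance's cosine similarity to every speaker centroid in one `einsum`, using leave-one-out centroids for the utterance's own speaker, as the GE2E paper specifies.

```python
import torch
import torch.nn.functional as F

def get_cossim(embeddings):
    """Vectorized GE2E similarity matrix (illustrative sketch).

    embeddings: tensor of shape (N, M, D) -- N speakers, M utterances each,
    D-dimensional embeddings. Returns a (N, M, N) cosine similarity matrix.
    """
    N, M, D = embeddings.shape
    centroids = embeddings.mean(dim=1)                     # (N, D)
    # Leave-one-out centroid for each utterance of its own speaker:
    # (sum of the speaker's utterances minus this one) / (M - 1)
    loo = (embeddings.sum(dim=1, keepdim=True) - embeddings) / (M - 1)  # (N, M, D)
    # Cosine similarity of every utterance to every full centroid
    e = F.normalize(embeddings, dim=2)                     # (N, M, D)
    c = F.normalize(centroids, dim=1)                      # (N, D)
    sim = torch.einsum('nmd,kd->nmk', e, c)                # (N, M, N)
    # For an utterance's own speaker, use the leave-one-out centroid instead
    loo_sim = F.cosine_similarity(embeddings, loo, dim=2)  # (N, M)
    idx = torch.arange(N)
    sim[idx, :, idx] = loo_sim
    return sim
```

The key point is that there are no Python loops over speakers or utterances, so the whole matrix is built from a handful of batched kernels, which is what makes it faster on GPU than the loop-based original.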
Thank you very much, I will try the code from the link.