
About training details

wjc0602 opened this issue · 1 comment

Hello, I'd like to ask about the specific training details for this work. You said you used 8 GPUs and that each epoch used 40 million sample pairs. Does that mean all the epochs add up to 4 billion pairs? And how long did training take in total?

Sorry for the mistake in the paper. 40 million is the total number of sample pairs, not the number per epoch.
Each epoch has 400 thousand sample pairs, and there are 100 epochs in total.
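
As a quick sanity check on those numbers, here is a minimal sketch of the arithmetic. The variable names are only illustrative and do not come from the training code.

```python
# Figures quoted in the reply: 400 thousand pairs per epoch, 100 epochs.
pairs_per_epoch = 400_000
num_epochs = 100

# Corrected total: 400,000 * 100 = 40,000,000 (40 million).
total_pairs = pairs_per_epoch * num_epochs
print(f"corrected total: {total_pairs:,}")

# The reading in the question (40 million pairs per epoch) would
# instead give 40,000,000 * 100 = 4,000,000,000 (4 billion).
misread_total = 40_000_000 * num_epochs
print(f"misread total:   {misread_total:,}")
```

So the corrected total is 100 times smaller than the 4 billion figure in the question.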