WangFeng18/Swin-Transformer

Swin Transformer does not converge with a large training set


I am training the tiny model on a dataset with one million classes and 100 million images, using softmax loss and AdamW. The batch size is 600 and I have trained for 400,000 iterations, but the model does not converge.
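For context, here is a rough back-of-the-envelope sketch of what these numbers imply. It assumes the standard Swin-T final feature width of 768 and a plain linear classification head with bias; neither is stated above, so treat the head size as an estimate rather than a measurement of the actual run.

```python
# Rough arithmetic for the setup described in this issue.
num_classes = 1_000_000
num_images = 100_000_000
batch_size = 600
iterations = 400_000

# Assumption (not stated in the issue): standard Swin-T final feature width.
feature_dim = 768

# Total training samples processed so far.
samples_seen = batch_size * iterations

# Number of passes over the dataset.
epochs = samples_seen / num_images

# Average number of samples seen per class, assuming balanced classes.
samples_per_class = samples_seen / num_classes

# Assumption: a single linear head (weights + biases) over all classes.
head_params = feature_dim * num_classes + num_classes

print(f"samples seen:        {samples_seen:,}")
print(f"epochs:              {epochs:.1f}")
print(f"samples per class:   {samples_per_class:.0f}")
print(f"head parameters:     {head_params:,}")
```

Under these assumptions the run covers about 2.4 epochs, each class is seen roughly 240 times on average, and the classification head alone holds several hundred million parameters, far more than the tiny backbone itself. Whether any of this explains the lack of convergence is not something the numbers alone can settle.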