noahzn/Lite-Mono

Some question about the result

Zora137 opened this issue · 4 comments

Hello, sorry to bother you again. This is the result I got running your code with this command, without pretraining:

python train.py --data_path path/to/your/data --model_name mytrain --num_epochs 30 --batch_size 12 --lr 0.0001 5e-6 31 0.0001 1e-5 31

[screenshot: my training results]

It cannot reach the result reported in your paper without pretraining:
[screenshot: results from the paper]

Is this normal, or did I do something wrong?
Looking forward to your answer, thank you so much.

Hi, the results are too bad. As stated in our paper:

For models trained from scratch an initial learning rate of 5e−4 with a cosine learning rate schedule [26] is adopted, and the training epoch is set to 35.

Could you please try using a larger initial learning rate?
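The cosine schedule mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual scheduler: the minimum learning rate `lr_min` and the exact decay formula are assumptions; the paper only states an initial learning rate of 5e-4 with a cosine schedule over 35 epochs.

```python
import math

def cosine_lr(epoch, lr_init=5e-4, lr_min=1e-5, total_epochs=35):
    """Cosine learning-rate schedule: decays lr_init toward lr_min
    over total_epochs. lr_min=1e-5 is an assumed floor for illustration."""
    t = min(epoch, total_epochs) / total_epochs
    return lr_min + 0.5 * (lr_init - lr_min) * (1 + math.cos(math.pi * t))

# The rate starts at 5e-4 and decays smoothly to lr_min by the last epoch.
print(cosine_lr(0))   # → 0.0005
print(cosine_lr(35))  # → 1e-05
```

In practice you would get the same shape from PyTorch's built-in `torch.optim.lr_scheduler.CosineAnnealingLR` with `T_max=35` and `eta_min` set to the floor.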


Hi, nice work! Can you show me your args file? I have tried many times and the result is always like this:

[screenshot: QQ图片20240418140654, trained by Lite-Mono]



Hi, you can try setting the learning rate to --lr 0.0001 5e-6 16 0.0001 1e-5 16. drop_path can be set to 0.3. But this might cause your training not to converge. Please make sure you are using the same dependencies as we used. #58

Also, please check the results of each epoch, not only the last epoch. The best result should be achieved at an earlier epoch.
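Selecting the best epoch rather than the last one can be done with a one-liner over per-epoch metrics. A minimal sketch, assuming you have collected abs_rel (where lower is better) for each saved checkpoint; the metric values below are made up for illustration:

```python
# Hypothetical per-epoch evaluation results; substitute the numbers your
# own evaluation script reports for each checkpoint.
per_epoch = [
    {"epoch": 10, "abs_rel": 0.125},
    {"epoch": 20, "abs_rel": 0.118},
    {"epoch": 29, "abs_rel": 0.131},  # the final epoch is not always best
]

# Pick the checkpoint with the lowest abs_rel error.
best = min(per_epoch, key=lambda m: m["abs_rel"])
print(best["epoch"])  # → 20
```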

OK, thanks! I will try it.