Possible checkpoint loss in TAPE pretraining scripts
Yijia-Xiao opened this issue · 0 comments
Yijia-Xiao commented
In the pretraining scripts for TAPE mode, there are two commands that remove all contents of the checkpoint folder. This may cause previous checkpoints to be lost:
https://github.com/THUDM/ProteinLM/blob/main/pretrain/examples/pretrain_tape_distributed.sh#L17
https://github.com/THUDM/ProteinLM/blob/main/pretrain/examples/pretrain_tape.sh#L8
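
One possible mitigation, sketched here under assumptions rather than taken from the repository: instead of deleting the checkpoint folder, move any existing contents to a timestamped backup before training starts. The function name `backup_checkpoints` and the variable `CHECKPOINT_PATH` in the usage note are hypothetical and may not match the names used in the actual scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ProteinLM): archive an existing,
# non-empty checkpoint directory instead of deleting it.
backup_checkpoints() {
    local ckpt_dir="$1"
    # Only back up when the directory exists and contains something.
    if [ -d "$ckpt_dir" ] && [ -n "$(ls -A "$ckpt_dir")" ]; then
        # Strip a trailing slash, then append a timestamp suffix.
        local backup_dir="${ckpt_dir%/}.bak.$(date +%Y%m%d_%H%M%S)"
        mv "$ckpt_dir" "$backup_dir"
        echo "Moved existing checkpoints to $backup_dir"
    fi
    # Recreate an empty directory for the new run.
    mkdir -p "$ckpt_dir"
}
```

A script could then call `backup_checkpoints "$CHECKPOINT_PATH"` in place of the removal command, assuming the checkpoint location is stored in a variable like that. Old backups would still need to be cleaned up manually.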