Unstable training loss
LiuBodan commented
Hi,
I tried to reproduce the model, but my training loss is very unstable.
For preprocessing:
I downloaded the dataset from the official BBBC021 website, then used the two provided CellProfiler pipelines to produce the training data. I preprocessed the annotated data and the DMSO data together. (Should I preprocess the annotated and DMSO data separately?)
I trained the model with the same parameters on 2x Tesla V100 GPUs using "python -m torch.distributed.launch --nproc_per_node=2 WS-DINO_BBBC021.py".
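One thing I am unsure about is whether the number of GPUs matters even when the parameters are unchanged. The sketch below shows how the effective learning rate would vary with GPU count, assuming the script follows the linear scaling rule from the original DINO code (lr = base_lr * total_batch_size / 256). I have not verified that WS-DINO_BBBC021.py does this, and the function name and all values here are hypothetical, not taken from the script.

```python
def effective_lr(base_lr: float, batch_size_per_gpu: int, world_size: int) -> float:
    """Linear scaling rule (assumed): lr = base_lr * total_batch_size / 256."""
    total_batch_size = batch_size_per_gpu * world_size
    return base_lr * total_batch_size / 256.0


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    for gpus in (1, 2, 8):
        lr = effective_lr(base_lr=0.0005, batch_size_per_gpu=64, world_size=gpus)
        print(f"{gpus} GPU(s): total batch {64 * gpus}, effective lr {lr:.6f}")
```

If the script scales the learning rate this way, running on 2 GPUs instead of the original setup would change both the total batch size and the effective learning rate, which might affect stability.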
Could anyone point out what I did wrong?
Many thanks