SysCV/sam-hq

When I run train.py with one GPU and the dataset that sam_hq used, the process stops and doesn't move

Ryanye2000 opened this issue · 5 comments

Sorry to bother you again.
[Screenshot 2023-11-27 105211]

When I run train.py with one GPU and the same dataset that sam_hq used, the process hangs and makes no further progress:

/user75/sam-hq/train# python -m torch.distributed.launch --nproc_per_node=1 train.py --checkpoint ./pretrained_checkpoint/sam_vit_b_01ec64.pth --model-type vit_b --output work_dirs/hq_sam_b

I also tried this but still got no response. It just stops at this line and no error is printed.

[image]

It's weird: I also tried to run the demo, and it hangs the same way, with no response and no error printed.
lkeab commented

hi can you instead run "python -m pdb demo/demo_hqsam.py" to see which code line is stuck?
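If stepping through with pdb is slow, another way to find the stuck line is Python's standard-library `faulthandler`, which can print every thread's stack after a timeout. The snippet below is only a sketch of that idea and is not part of sam-hq; the helper names `watch_for_hang` and `stop_watching` and the 30-second default are made up for illustration.

```python
import faulthandler
import sys


def watch_for_hang(timeout_s=30.0, repeat=True, file=None):
    """Dump the stack of every thread if the process is still running
    after timeout_s seconds (hypothetical helper, not a sam-hq function)."""
    target = sys.stderr if file is None else file
    # The watchdog runs in a background thread, so it can still report
    # while the main thread is blocked inside an import or a CUDA call.
    faulthandler.dump_traceback_later(timeout_s, repeat=repeat, file=target)


def stop_watching():
    """Cancel the pending dump once the slow section has finished."""
    faulthandler.cancel_dump_traceback_later()
```

Calling `watch_for_hang()` as the very first statement of the script, before any other import, should make the traceback show which line never returns. With no code change at all, `python -X faulthandler script.py` enables the fault handler, but that only dumps on fatal signals, not on a silent hang.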

> hi can you instead run "python -m pdb demo/demo_hqsam.py" to see which code line is stuck?

I fixed it by swapping the order of "import torch" and "import os".

lkeab commented

This may indicate that your CUDA version and PyTorch version are mismatched.
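One rough way to check for such a mismatch is to print what the installed PyTorch build reports about CUDA. This is a minimal sketch using only standard `torch` attributes; the function name `cuda_report` is made up here, and the script simply reports that torch is missing if it is not installed.

```python
import importlib.util


def cuda_report():
    """Collect the version facts relevant to a PyTorch/CUDA mismatch
    (hypothetical helper, not part of sam-hq)."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version this PyTorch build was compiled against;
        # None means a CPU-only build.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

Compare `built_with_cuda` with the CUDA version that `nvidia-smi` shows for the driver. If the build was compiled for a newer CUDA than the driver supports, or `cuda_available` is False on a machine with a GPU, reinstalling a PyTorch build that matches the driver is a reasonable next step.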