lmy98129/VLPD

local_rank

hat783 opened this issue · 1 comments

When trying to run dist_train.sh, I see an error related to --local_rank in trainval_distributed.py:

```
usage: trainval_distributed.py [-h] [--work-dir WORK_DIR] [--local_rank LOCAL_RANK]
trainval_distributed.py: error: unrecognized arguments: --local-rank=1
usage: trainval_distributed.py [-h] [--work-dir WORK_DIR] [--local_rank LOCAL_RANK]
trainval_distributed.py: error: unrecognized arguments: --local-rank=0
[2023-12-08 01:40:46,962] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 2) local_rank: 0 (pid: 53342) of binary: /home/ashitha/miniconda3/envs/Ashenv/bin/python
```

Could you please help me resolve this?

Please check the version of PyTorch you are using. If it is inconsistent with (typically newer than) the 1.10.0+cu113 specified in our provided environment.yaml, this issue will occur, as in SysCV/sam-hq#41.
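For context: newer PyTorch launchers pass the rank as `--local-rank` (hyphen), while this script declares `--local_rank` (underscore), which is why argparse rejects the argument. If downgrading PyTorch is not an option, a possible workaround (not part of this repo, just a sketch) is to register both spellings for the same argument:

```python
import argparse

# Sketch of a parser mirroring trainval_distributed.py's options.
# PyTorch >= 2.0 launchers pass --local-rank (hyphen); older scripts
# declare --local_rank (underscore). Registering both option strings
# on one argument makes the script accept either spelling.
parser = argparse.ArgumentParser()
parser.add_argument("--work-dir", dest="work_dir", default=None)
parser.add_argument("--local_rank", "--local-rank", dest="local_rank",
                    type=int, default=0)

# Simulate the hyphenated form the newer launcher passes.
args = parser.parse_args(["--local-rank=1"])
print(args.local_rank)  # → 1
```

With this change, both `--local_rank=1` and `--local-rank=1` parse into `args.local_rank`, so the script works under either launcher version.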

PS: Please follow the instructions in README.md to set up the environment.