cheerss/CrossFormer

How to train segmentation on Windows 10

jsong0041 opened this issue · 0 comments

Dear author, I have a question: how can I train the segmentation model on Windows 10?
I ran "python train.py configs/fpn_crossformer_s_ade20k_40k.py --cfg-options pretrained/backbone-corssformer-s.pth --work-dir output --launcher pytorch" but got the following error message:

Traceback (most recent call last):
  File "train.py", line 152, in <module>
    main()
  File "train.py", line 65, in main
    args = parse_args()
  File "train.py", line 57, in parse_args
    args = parser.parse_args()
  File "C:\Python37\lib\argparse.py", line 1755, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "C:\Python37\lib\argparse.py", line 1787, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "C:\Python37\lib\argparse.py", line 1993, in _parse_known_args
    start_index = consume_optional(start_index)
  File "C:\Python37\lib\argparse.py", line 1933, in consume_optional
    take_action(action, args, option_string)
  File "C:\Python37\lib\argparse.py", line 1861, in take_action
    action(self, namespace, argument_values, option_string)
  File "C:\Python37\lib\site-packages\mmcv\utils\config.py", line 739, in __call__
    key, val = kv.split('=', maxsplit=1)
ValueError: not enough values to unpack (expected 2, got 1)
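
The last frame of this traceback can be reproduced on its own, which suggests that --cfg-options expects key=value pairs rather than a bare path. The following is only a minimal sketch: parse_kv is a hypothetical helper that copies the one line shown in the traceback, and the key name model.pretrained is an assumption about the config layout, not something confirmed by the repository.

```python
# Hypothetical helper reproducing the single line from the traceback
# (mmcv/utils/config.py, line 739). It is not part of mmcv.
def parse_kv(kv):
    key, val = kv.split('=', maxsplit=1)
    return key, val


# An argument containing "=" unpacks into a key and a value.
# "model.pretrained" is an assumed key name, used here for illustration only.
print(parse_kv("model.pretrained=pretrained/backbone-corssformer-s.pth"))

# A bare path has no "=", so split() returns a single element
# and the two-name unpacking raises ValueError.
try:
    parse_kv("pretrained/backbone-corssformer-s.pth")
except ValueError as err:
    print("ValueError:", err)
```

If that assumption about the key name holds, the option would probably need to be written in the form --cfg-options model.pretrained=pretrained/backbone-corssformer-s.pth.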

I also tried running your shell script (dist_train.sh) directly, but that failed as well:

$ /bin/sh E:/project_c/crossformer-debug/segmentation/dist_train.sh
NOTE: Redirects are currently not supported in Windows or MacOs.
C:\Python37\lib\site-packages\torch\distributed\launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

FutureWarning,
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\torch\distributed\run.py", line 564, in determine_local_world_size
    return int(nproc_per_node)
ValueError: invalid literal for int() with base 10: ''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Python37\lib\site-packages\torch\distributed\launch.py", line 193, in <module>
    main()
  File "C:\Python37\lib\site-packages\torch\distributed\launch.py", line 189, in main
    launch(args)
  File "C:\Python37\lib\site-packages\torch\distributed\launch.py", line 174, in launch
    run(args)
  File "C:\Python37\lib\site-packages\torch\distributed\run.py", line 709, in run
    config, cmd, cmd_args = config_from_args(args)
  File "C:\Python37\lib\site-packages\torch\distributed\run.py", line 617, in config_from_args
    nproc_per_node = determine_local_world_size(args.nproc_per_node)
  File "C:\Python37\lib\site-packages\torch\distributed\run.py", line 582, in determine_local_world_size
    raise ValueError(f"Unsupported nproc_per_node value: {nproc_per_node}")
ValueError: Unsupported nproc_per_node value:
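
The empty value at the end of this message suggests that --nproc_per_node received an empty string. The sketch below is a simplified stand-in for the two lines of determine_local_world_size visible in the traceback (the real torch function handles more cases), and it assumes that dist_train.sh forwards its second positional argument as the GPU count, which would be empty when the script is run with no arguments as above.

```python
# Simplified stand-in for torch.distributed.run.determine_local_world_size,
# reduced to the two lines that appear in the traceback.
def determine_local_world_size(nproc_per_node):
    try:
        return int(nproc_per_node)
    except ValueError:
        raise ValueError(f"Unsupported nproc_per_node value: {nproc_per_node}")


# Assumption: the script expands something like --nproc_per_node=$GPUS,
# and $GPUS is empty because no arguments were passed to dist_train.sh.
gpus_from_shell = ""
try:
    determine_local_world_size(gpus_from_shell)
except ValueError as err:
    print("ValueError:", err)

# With an explicit process count the conversion succeeds.
print(determine_local_world_size("1"))
```

If that assumption is right, calling the script with the config path and a GPU count as arguments should at least get past this particular check.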

Could you suggest how to fix these errors?
Thanks.