HikariTJU/LD

subprocess.CalledProcessError: Command '[]' returned non-zero exit status 1.


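Sorry to bother you. I ran into the following problem while trying to reproduce your code.
After installing the required packages, I wanted to check that the code runs. Since I only have one GPU, I changed the last argument to 1:
/tools/dist_train.sh configs/ld/ld_r50_gflv1_r101_fpn_coco_1x.py 1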

The error is as follows:
Traceback (most recent call last):
File "./tools/train.py", line 15, in <module>
from mmdet.apis import set_random_seed, train_detector
File "/home/cs/LD/mmdet/apis/__init__.py", line 1, in <module>
from .inference import (async_inference_detector, inference_detector,
File "/home/cs/LD/mmdet/apis/inference.py", line 10, in <module>
from mmdet.core import get_classes
File "/home/cs/LD/mmdet/core/__init__.py", line 5, in <module>
from .mask import *  # noqa: F401, F403
File "/home/cs/LD/mmdet/core/mask/__init__.py", line 2, in <module>
from .structures import BaseInstanceMasks, BitmapMasks, PolygonMasks
File "/home/cs/LD/mmdet/core/mask/structures.py", line 6, in <module>
import pycocotools.mask as maskUtils
ModuleNotFoundError: No module named 'pycocotools'
Traceback (most recent call last):
File "/home/cs/anaconda3/envs/LD/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/cs/anaconda3/envs/LD/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/cs/anaconda3/envs/LD/lib/python3.7/site-packages/torch/distributed/launch.py", line 263, in <module>
main()
File "/home/cs/anaconda3/envs/LD/lib/python3.7/site-packages/torch/distributed/launch.py", line 259, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/cs/anaconda3/envs/LD/bin/python', '-u', './tools/train.py', '--local_rank=0', 'configs/ld/ld_r50_gflv1_r101_fpn_coco_1x.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

Environment: Ubuntu 20.04

Configuration:
mmcv-full 1.2.7
torch 1.5.1
cuda 11.4

try
pip install -r requirements.txt
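
In the log above, the first traceback ends with ModuleNotFoundError: No module named 'pycocotools'; the second traceback is only the launcher reporting that the child process exited with a non-zero status. A minimal sketch for checking whether the required modules can be found before launching training is shown below. The helper name missing_modules and the list of module names checked are illustrative assumptions, not part of this repository.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment.

    Hypothetical helper for illustration; it only looks the modules up,
    it does not import them.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The module list is an assumption based on the traceback and the
    # configuration listed in this issue.
    missing = missing_modules(["pycocotools", "mmcv", "torch"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All checked modules were found.")
```

If pycocotools shows up as missing, installing the requirements as suggested above would be the first thing to try.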

Also, if you only have one GPU, consider using
python tools/train.py
instead of /tools/dist_train.sh
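
The suggestion above can be sketched as a small helper that picks the launch command from the number of GPUs. The function name build_train_command is hypothetical, and the sketch assumes both scripts are run from the repository root and take the config path as a positional argument, as the commands in this issue suggest.

```python
from typing import List


def build_train_command(config: str, num_gpus: int) -> List[str]:
    """Build the training command for a given GPU count (illustrative only)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        # Single GPU: call the training script directly, without the launcher.
        return ["python", "tools/train.py", config]
    # Multiple GPUs: use the distributed launch script with the GPU count.
    return ["./tools/dist_train.sh", config, str(num_gpus)]


if __name__ == "__main__":
    cmd = build_train_command("configs/ld/ld_r50_gflv1_r101_fpn_coco_1x.py", 1)
    print(" ".join(cmd))
```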

Thank you very much. After replacing /tools/dist_train.sh with python tools/train.py, it worked.