LARC-CMU-SMU/FoodSeg103-Benchmark-v1

TypeError: __init__() got an unexpected keyword argument 'model_name' when trying to train


Dear FoodSeg Team,

Thank you for providing the resources for this great food segmentation tool.
I am trying to run training on my system, but I ran into a problem and would like to ask for assistance.

This is the exact command I am using:
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 --master_port=12901 tools/train.py --config configs/foodnet/SETR_Naive_768x768_80k_base.py --work-dir checkpoints_dir/SETR_Naive --launcher pytorch

This is what I get:

/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmseg/models/builder.py:42: UserWarning: train_cfg and test_cfg is deprecated, please specify them in model
'please specify them in model', UserWarning)
Traceback (most recent call last):
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
return obj_cls(**args)
TypeError: __init__() got an unexpected keyword argument 'model_name'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
return obj_cls(**args)
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmseg/models/segmentors/encoder_decoder.py", line 35, in __init__
self.backbone = builder.build_backbone(backbone)
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmseg/models/builder.py", line 19, in build_backbone
return BACKBONES.build(cfg)
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 210, in build
return self.build_func(*args, **kwargs, registry=self)
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/cnn/builder.py", line 26, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: VisionTransformer: __init__() got an unexpected keyword argument 'model_name'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "tools/train.py", line 167, in <module>
main()
File "tools/train.py", line 136, in main
test_cfg=cfg.get('test_cfg'))
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmseg/models/builder.py", line 48, in build_segmentor
cfg, default_args=dict(train_cfg=train_cfg, test_cfg=test_cfg))
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 210, in build
return self.build_func(*args, **kwargs, registry=self)
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/cnn/builder.py", line 26, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: EncoderDecoder: VisionTransformer: __init__() got an unexpected keyword argument 'model_name'
Traceback (most recent call last):
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
main()
File "/home/cszsolnai/anaconda3/envs/open-mmlab2/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/cszsolnai/anaconda3/envs/open-mmlab2/bin/python', '-u', 'tools/train.py', '--local_rank=0', '--config', 'configs/foodnet/SETR_Naive_768x768_80k_base.py', '--work-dir', 'checkpoints_dir/SETR_Naive', '--launcher', 'pytorch']' returned non-zero exit status 1.

It looks like some of the packages might have incompatible versions, but I am not sure.
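
For readers following the traceback, here is a minimal sketch of how the chained TypeError seems to arise. It is a simplified stand-in, not mmcv's actual code: the `REGISTRY` dict, the toy `VisionTransformer` class, the `try_build` helper, and the `"vit_base"` value are all hypothetical. The idea is that the registry calls the class with every key from the config dictionary, so a key the installed class does not accept raises a TypeError, which is then re-raised with the class name prepended.

```python
# Simplified stand-in for a config-driven registry; NOT mmcv's real implementation.


class VisionTransformer:
    """Hypothetical backbone whose constructor does not accept `model_name`."""

    def __init__(self, img_size=768):
        self.img_size = img_size


# Hypothetical registry mapping a config "type" string to a class.
REGISTRY = {"VisionTransformer": VisionTransformer}


def build_from_cfg(cfg):
    """Instantiate the class named by cfg['type'] with the remaining keys as kwargs."""
    args = dict(cfg)
    obj_cls = REGISTRY[args.pop("type")]
    try:
        return obj_cls(**args)
    except Exception as e:
        # Re-raise with the class name prepended, as seen in the traceback.
        raise type(e)(f"{obj_cls.__name__}: {e}")


def try_build(cfg):
    """Return the error message if building fails, otherwise None."""
    try:
        build_from_cfg(cfg)
        return None
    except TypeError as e:
        return str(e)


# The config passes `model_name` (value is a placeholder), which this class rejects.
message = try_build(dict(type="VisionTransformer", model_name="vit_base", img_size=768))
print(message)
```

If this reading is right, the config file and the installed class definitions are out of sync, which would be consistent with a package version mismatch.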

Here are details of my platform:

Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0: NVIDIA GeForce GTX 1080
CUDA_HOME: /usr/local/cuda-10.1
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.6.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) oneAPI Math Kernel Library Version 2021.3-Product Build 20210617 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.2

Package versions:

mmcv-full=1.3.10
cudatoolkit=10.1

@kanesoban Sorry for the late reply. The mmcv-full version I use is 1.2.6, and version 1.3.10 raises an error on my side as well. This is because the official mmcv repo has been updated; I will update the instructions. By the way, can you check whether the software installed successfully by running:

from mmseg.apis import inference_segmentor, init_segmentor

and that all dependencies in requirements.txt are installed? Since your example script runs without issue on my side, I suspect the package version is the cause. The versions I use are:

pytorch: 1.6.0
mmcv-full: 1.2.6
cudatoolkit: 10.2
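
A quick way to compare an environment against the versions listed above is sketched below. It assumes that mmcv-full is importable as `mmcv` and that both packages expose `__version__`; the helper names `version_matches` and `check_versions` are made up for this example. cudatoolkit is a conda package rather than a Python module, so it is not covered here.

```python
import importlib

# Versions reported as working in this thread.
EXPECTED = {"torch": "1.6.0", "mmcv": "1.2.6"}


def version_matches(installed, expected):
    """Compare version strings, ignoring a local build suffix such as '+cu101'."""
    return installed.split("+")[0] == expected


def check_versions(expected=EXPECTED):
    """Return a dict mapping each module name to a short status string."""
    report = {}
    for module_name, wanted in expected.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            report[module_name] = "not installed"
            continue
        installed = str(getattr(module, "__version__", "unknown"))
        status = "ok" if version_matches(installed, wanted) else "mismatch"
        report[module_name] = f"{status} (installed {installed}, expected {wanted})"
    return report


if __name__ == "__main__":
    for name, status in check_versions().items():
        print(f"{name}: {status}")
```

A "mismatch" line for mmcv would point at the same cause discussed in this thread, in which case installing the listed mmcv-full version is the first thing to try.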