yuantn/MI-AOD

Hi, this may be a small obstacle for many, but what went wrong? Can you help? Thanks

Jackyinuo opened this issue · 3 comments

cycles = [0, 1, 2, 3, 4, 5, 6]
work_directory = './work_dirs/MIAL'
gpu_ids = range(0, 1)

2021-03-16 11:48:19,160 - mmdet - INFO - Set random seed to 666, deterministic: False
2021-03-16 11:48:19,192 - mmdet - INFO - Set random seed to 666, deterministic: False
2021-03-16 11:48:19,539 - mmdet - INFO - load model from: torchvision://resnet50
2021-03-16 11:48:19,698 - mmdet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc.weight, fc.bias

2021-03-16 11:48:24,285 - mmdet - INFO - Start running, host: phzhou@ubuntu-MW51-HP0-00, work_dir: /disk1/huihui/MIAL/work_dirs/MIAL/20210316_114818
2021-03-16 11:48:24,285 - mmdet - INFO - workflow: [('train', 1)], max: 3 epochs
Traceback (most recent call last):
File "./tools/train.py", line 267, in <module>
main()
File "./tools/train.py", line 180, in main
distributed=distributed, validate=(not args.no_validate), timestamp=timestamp, meta=meta)
File "/disk1/huihui/MIAL/mmdet/apis/train.py", line 120, in train_detector
runner.run(data_loaders_L, cfg.workflow, cfg.total_epochs)
File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 122, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 32, in train
**kwargs)
File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 36, in train_step
output = self.module.train_step(*inputs[0], **kwargs[0])
File "/disk1/huihui/MIAL/mmdet/models/detectors/base.py", line 228, in train_step
losses = self(**data)
File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/disk1/huihui/MIAL/mmdet/core/fp16/decorators.py", line 51, in new_func
return old_func(*args, **kwargs)
TypeError: forward() missing 1 required positional argument: 'x'
Traceback (most recent call last):
File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/torch/distributed/launch.py", line 263, in <module>
main()
File "/home/phzhou/anaconda3/envs/mial/lib/python3.7/site-packages/torch/distributed/launch.py", line 259, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/phzhou/anaconda3/envs/mial/bin/python', '-u', './tools/train.py', '--local_rank=0', 'configs/MIAL.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

I also had this problem, and my solution was to make sure that the epoch_based_runner.py file from this repository was copied to the appropriate place. The README contains the command cp -v epoch_based_runner.py ~balabala/runner/

For reference only.
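
The copy step described above can also be scripted. The following is only a minimal sketch, not part of MI-AOD: the helper names `find_mmcv_root` and `install_runner` are hypothetical, and it assumes the installed mmcv package has a `runner` subdirectory, as the paths in the traceback suggest.

```python
import importlib.util
import shutil
from pathlib import Path
from typing import Optional


def find_mmcv_root() -> Optional[Path]:
    """Return the directory of the mmcv package seen by the current
    interpreter, or None if mmcv is not installed in this environment."""
    spec = importlib.util.find_spec("mmcv")
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def install_runner(repo_file: Path, mmcv_root: Path) -> Path:
    """Copy repo_file into <mmcv_root>/runner/, overwriting any file of the
    same name there (hypothetical helper; it mirrors the `cp -v` step)."""
    target_dir = mmcv_root / "runner"
    if not target_dir.is_dir():
        raise FileNotFoundError(f"no runner directory under {mmcv_root}")
    target = target_dir / repo_file.name
    shutil.copy2(repo_file, target)
    return target
```

Run it with the same interpreter that launches training (here, the one in the `mial` conda environment), for example `install_runner(Path("epoch_based_runner.py"), find_mmcv_root())` after checking that `find_mmcv_root()` did not return None. Because the package location is looked up through the active interpreter, the file lands in that environment's site-packages and not in one belonging to another Python version.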

Thank you so much. I had updated to Python 3.8, but the original files were in Python 3.7.

Yes, just as @xiaosa96 mentioned, if you have modified anything in the mmcv package (including but not limited to: updating/re-installing Python, PyTorch, mmdetection, mmcv, mmcv-full, conda environment), you are supposed to copy the epoch_based_runner.py provided in this repository to the mmcv directory again (as described in the installation.md).
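
Since a re-install can silently put the stock file back, a quick check before training may help. This is a rough sketch under assumptions: `runner_is_patched` and `sha256_of` are hypothetical names, and both file paths must be supplied by you for your own checkout and environment.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def runner_is_patched(repo_file: Path, installed_file: Path) -> bool:
    """True if installed_file exists and is byte-identical to repo_file.

    A False result suggests the copy step has to be repeated, e.g. after
    re-installing mmcv or switching to a different Python environment.
    """
    return installed_file.is_file() and sha256_of(repo_file) == sha256_of(installed_file)
```

If it returns False for the epoch_based_runner.py in this repository and the one inside the mmcv runner directory, copy the file again as described in installation.md.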