open-mmlab weights are not loading

Question

haritha91 opened this issue 3 years ago · 3 comments

I tried to run the model on THUMOS14, and loading the weights from open-mmlab://i3d_r50_256p_32x2x1_100e_kinetics400_rgb fails. The error log is attached for reference.

Answer 1 · 2021-10-14T20:39:31.000Z

2021-10-14 20:34:00,980 - vedatad - WARNING - EvalHook is not in modes ['train']
2021-10-14 20:34:00,980 - vedatad - INFO - Loading weights from open-mmlab://i3d_r50_256p_32x2x1_100e_kinetics400_rgb
Traceback (most recent call last):
  File "tools/trainval.py", line 65, in <module>
    main()
  File "tools/trainval.py", line 61, in main
    trainval(cfg, distributed, logger)
  File "/app/vedatad/vedatad/assembler/trainval.py", line 72, in trainval
    looper.load_weights(**cfg.weights)
  File "/app/vedatad/vedacore/loopers/base_looper.py", line 118, in load_weights
    load_weights(model, filepath, map_location, strict, self.logger,
  File "/app/vedatad/vedacore/misc/checkpoint.py", line 278, in load_weights
    state_dict = _load_checkpoint(filepath, map_location)
  File "/app/vedatad/vedacore/misc/checkpoint.py", line 143, in _load_checkpoint
    model_urls = get_open_mmlab_models()
  File "/app/vedatad/vedacore/misc/checkpoint.py", line 34, in get_open_mmlab_models
    model_urls = yaml.load(open(file_path))
TypeError: load() missing 1 required positional argument: 'Loader'
Traceback (most recent call last):
  File "tools/trainval.py", line 65, in <module>
    main()
  File "tools/trainval.py", line 61, in main
    trainval(cfg, distributed, logger)
  File "/app/vedatad/vedatad/assembler/trainval.py", line 72, in trainval
    looper.load_weights(**cfg.weights)
  File "/app/vedatad/vedacore/loopers/base_looper.py", line 118, in load_weights
    load_weights(model, filepath, map_location, strict, self.logger,
  File "/app/vedatad/vedacore/misc/checkpoint.py", line 278, in load_weights
    state_dict = _load_checkpoint(filepath, map_location)
  File "/app/vedatad/vedacore/misc/checkpoint.py", line 143, in _load_checkpoint
    model_urls = get_open_mmlab_models()
  File "/app/vedatad/vedacore/misc/checkpoint.py", line 34, in get_open_mmlab_models
    model_urls = yaml.load(open(file_path))
TypeError: load() missing 1 required positional argument: 'Loader'
Killing subprocess 391
Killing subprocess 392
Traceback (most recent call last):
  File "/home/user/miniconda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/user/miniconda/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/user/miniconda/lib/python3.8/site-packages/torch/distributed/launch.py", line 340, in <module>
    main()
  File "/home/user/miniconda/lib/python3.8/site-packages/torch/distributed/launch.py", line 326, in main
    sigkill_handler(signal.SIGTERM, None) # not coming back
  File "/home/user/miniconda/lib/python3.8/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
    raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/user/miniconda/bin/python', '-u', 'tools/trainval.py', '--local_rank=1', 'configs/trainval/daotad/daotad_i3d_r50_e700_thumos14_rgb.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

Answer 2 · 2021-10-15T08:40:25.000Z

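Found the solution for this. Calling yaml.load() without a Loader argument was deprecated in PyYAML 5.1, and in PyYAML 6.0 the argument became required, which is why the call now raises a TypeError. The Loader should be passed explicitly in vedacore/misc/checkpoint.py (line 34 in the traceback) as follows.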

model_urls = yaml.load(open(file_path), Loader=yaml.FullLoader)
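For anyone who prefers not to pick a Loader class by hand, here is a minimal sketch of an alternative, not the actual vedacore code. It assumes the YAML file is a plain mapping of model names to URLs, so the restricted safe loader is enough. The function name load_model_urls is hypothetical; only yaml.safe_load and open come from real libraries.

```python
import yaml


def load_model_urls(file_path):
    """Read a YAML mapping of model names to URLs (hypothetical helper)."""
    # The context manager closes the file handle once parsing is done.
    with open(file_path, "r") as f:
        # yaml.safe_load(f) is equivalent to
        # yaml.load(f, Loader=yaml.SafeLoader), so it supplies the Loader
        # that PyYAML 6.0 requires.
        model_urls = yaml.safe_load(f)
    # An empty file parses to None; fall back to an empty dict.
    return model_urls or {}
```

With this, the failing line could become model_urls = load_model_urls(file_path). If the file relies on tags that only FullLoader understands, the one-line fix above is the safer choice.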

Answer 3 · 2021-11-03T02:50:06.000Z

Hi, thanks a lot for finding the bug! Did you manage to load the weights into the model? I'm running into key-mismatch problems, as described in #14. Did you see the same thing?

Thanks!