tricktreat/piqn

【multiprocessing】

kk19990709 opened this issue · 2 comments

Traceback (most recent call last):
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/data/kk/GEEK/piqn/piqn.py", line 14, in __train
    trainer.train(train_path=run_args.train_path, valid_path=run_args.valid_path,
  File "/data/kk/GEEK/piqn/piqn/piqn_trainer.py", line 164, in train
    if args.local_rank != -1:
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 612, in to
    return self._apply(convert)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 381, in _apply
    param_applied = fn(param)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 610, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/cuda/__init__.py", line 163, in _lazy_init
    raise RuntimeError(
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

ctx = mp.get_context('fork')

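Please try changing ctx = mp.get_context('fork') to ctx = mp.get_context('spawn'). As the error message says, CUDA cannot be re-initialized in a forked subprocess, so the worker process has to be started with the 'spawn' method.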

Thx!