Multiprocessing error on Windows

Question

sjscotti opened this issue 5 years ago · 2 comments

Hi!
I am eager to experiment with your RBPN code. I downloaded it onto a Windows 10 machine with a GPU and installed Python 3.5, PyTorch 1.0.1, and the Pyflow dependencies (plus any lower-level dependencies that were missing). When I run the eval.py script, I get the following error (I had to interrupt the stalled process at the end):

Namespace(chop_forward=False, data_dir='./Vid4', file_list='foliage.txt', future_frame=True, gpu_mode=True, gpus=1, model='weights/RBPN_4x.pth', model_type='RBPN', nFrames=7, other_dataset=True, output='Results/', residual=False, seed=123, testBatchSize=1, threads=1, upscale_factor=4)
===> Loading datasets
===> Building model RBPN
Pre-trained SR model is loaded.
Namespace(chop_forward=False, data_dir='./Vid4', file_list='foliage.txt', future_frame=True, gpu_mode=True, gpus=1, model='weights/RBPN_4x.pth', model_type='RBPN', nFrames=7, other_dataset=True, output='Results/', residual=False, seed=123, testBatchSize=1, threads=1, upscale_factor=4)
===> Loading datasets
===> Building model RBPN
Pre-trained SR model is loaded.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 106, in spawn_main
exitcode = _main(fd)
File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 115, in _main
prepare(preparation_data)
File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 226, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 278, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Program Files\Python35\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Program Files\Python35\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Program Files\Python35\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\Steve\Downloads\RBPN-PyTorch-master\RBPN-PyTorch-master\eval.py", line 182, in <module>
eval()
File "C:\Users\Steve\Downloads\RBPN-PyTorch-master\RBPN-PyTorch-master\eval.py", line 79, in eval
for batch in testing_data_loader:
File "C:\Program Files\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
return _DataLoaderIter(self)
File "C:\Program Files\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
w.start()
File "C:\Program Files\Python35\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Program Files\Python35\lib\multiprocessing\context.py", line 212, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Python35\lib\multiprocessing\context.py", line 313, in _Popen
return Popen(process_obj)
File "C:\Program Files\Python35\lib\multiprocessing\popen_spawn_win32.py", line 34, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 144, in get_preparation_data
_check_not_importing_main()
File "C:\Program Files\Python35\lib\multiprocessing\spawn.py", line 137, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

Traceback (most recent call last):
File "eval.py", line 182, in <module>
eval()
File "eval.py", line 79, in eval
for batch in testing_data_loader:
File "C:\Program Files\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 631, in __next__
idx, batch = self._get_batch()
File "C:\Program Files\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 610, in _get_batch
return self.data_queue.get()
File "C:\Program Files\Python35\lib\multiprocessing\queues.py", line 94, in get
res = self._recv_bytes()
File "C:\Program Files\Python35\lib\multiprocessing\connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "C:\Program Files\Python35\lib\multiprocessing\connection.py", line 306, in _recv_bytes
[ov.event], False, INFINITE)
KeyboardInterrupt

I am not very familiar with Python, but from what I can tell, this error may be specific to Windows because of the way Windows spawns new processes compared to Linux. See the note under torch.utils.data.DataLoader here:

https://pytorch.org/docs/stable/data.html?highlight=dataloader%20py#torch.utils.data.DataLoader

The suggested correction is to include checks using
if __name__ == '__main__':
in the eval.py code at the appropriate places. Done correctly, the code should then work on both Windows and Linux. I experimented with adding this check at several locations in the code, but could not get a complete run. Can you suggest how to modify the code to work on Windows, or another change that would allow it to run?

Thanks
-Steve

Answer 1 · 2019-08-03T02:09:35.000Z

Hi!
I found that this problem was easy to fix. You just take the couple of lines in eval.py and replace them with this:

##Eval Start!!!!
if __name__ == '__main__':
    eval()

I have another question that I will ask in another thread.
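
For anyone wondering why the guard helps: the first traceback shows spawn.py re-running eval.py in the worker process through runpy.run_path with a run_name other than "__main__". Below is a minimal sketch that imitates this step without starting any processes. The script text, the fake_eval.py file name, and the run_as helper are made up for illustration and are not part of RBPN or PyTorch.

```python
import os
import runpy
import tempfile

# Hypothetical stand-in for eval.py: one unguarded statement, one guarded call.
SCRIPT = '''
events.append("top-level code ran")   # unguarded: runs on every (re-)import

def eval():
    events.append("eval() ran")

if __name__ == "__main__":
    eval()                             # guarded: runs only when started as a script
'''

def run_as(run_name):
    """Execute SCRIPT under the given module name and return what it did."""
    events = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fake_eval.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        # Same call the traceback shows inside multiprocessing/spawn.py.
        runpy.run_path(path, init_globals={"events": events}, run_name=run_name)
    return events

parent = run_as("__main__")     # like running `python eval.py`
child = run_as("__mp_main__")   # like a spawned worker re-importing the script
print(parent)  # ['top-level code ran', 'eval() ran']
print(child)   # ['top-level code ran']
```

Under the worker's module name the guarded call is skipped, so the worker does not try to iterate the DataLoader and start workers of its own. The unguarded top-level code still runs in both cases, which would explain why the Namespace and "Loading datasets" lines appear twice in the log above.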

Answer 2 · 2021-10-21T16:08:02.000Z

Try the following "eval.py":

def eval():
    model.eval()
    count=1
    avg_psnr_predicted = 0.0
    for batch in testing_data_loader:
        input, target, neigbor, flow, bicubic = batch[0], batch[1], batch[2], batch[3], batch[4]

if __name__ == '__main__':	
        
        with torch.no_grad():