smeetrs/deep_avsr

Broken pipe error in Dataloader

smeetrs opened this issue · 3 comments

@lordmartian That change resolved the error; however, I had set numWorkers in the config file to 0 in order to avoid a multiprocessing error that was masking the original one. Setting numWorkers to anything other than 0 still causes this error.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "D:\Users\arunm\anaconda3\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "D:\Users\arunm\anaconda3\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "D:\Users\arunm\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\Users\arunm\PycharmProjects\AV_Speech_Recognition\audio_visual\pretrain.py", line 106, in <module>
    trainingLoss, trainingCER, trainingWER = train(model, pretrainLoader, optimizer, loss_function, device, trainParams)
  File "D:\Users\arunm\PycharmProjects\AV_Speech_Recognition\audio_visual\utils\general.py", line 39, in train
    for batch, (inputBatch, targetBatch, inputLenBatch, targetLenBatch) in enumerate(tqdm(trainLoader, leave=False, desc="Train", ncols=75)):
  File "D:\Users\arunm\anaconda3\lib\site-packages\tqdm\std.py", line 1107, in __iter__
    for obj in iterable:
  File "D:\Users\arunm\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 278, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "D:\Users\arunm\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 682, in __init__
    w.start()
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

Traceback (most recent call last):
  File "D:/Users/arunm/PycharmProjects/AV_Speech_Recognition/audio_visual/pretrain.py", line 106, in <module>
    trainingLoss, trainingCER, trainingWER = train(model, pretrainLoader, optimizer, loss_function, device, trainParams)
  File "D:\Users\arunm\PycharmProjects\AV_Speech_Recognition\audio_visual\utils\general.py", line 39, in train
    for batch, (inputBatch, targetBatch, inputLenBatch, targetLenBatch) in enumerate(tqdm(trainLoader, leave=False, desc="Train", ncols=75)):
  File "D:\Users\arunm\anaconda3\lib\site-packages\tqdm\std.py", line 1107, in __iter__
    for obj in iterable:
  File "D:\Users\arunm\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 278, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "D:\Users\arunm\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 682, in __init__
    w.start()
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\Users\arunm\anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

Originally posted by @arunm95 in #9 (comment)

If you executed the code with num_workers=0 and did not get any error, that means the code itself is correct. If you search for this error online, there are many posts about it. According to those posts, the issue appears when running on Windows. The solution that seems to work for the majority of people is to wrap the script's entry-point code in an if __name__ == "__main__": guard. I do not have access to any Windows machine with a GPU, so even if I make the suggested changes, I will not be able to test them. I am open to any PRs which resolve this issue and are tested on Windows machines.
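To illustrate why the guard matters, here is a minimal sketch (not code from this repository; a plain multiprocessing.Pool stands in for the PyTorch DataLoader workers). On Windows, child processes are started with the "spawn" method, so each worker re-imports the main script; any unguarded top-level code that creates workers would then run again inside each child and try to spawn workers recursively, producing exactly the RuntimeError and BrokenPipeError shown above:

```python
import multiprocessing as mp


def square(x):
    # Trivial stand-in for the real per-worker work (e.g. loading a batch).
    return x * x


def main():
    # Use the "spawn" start method explicitly to emulate Windows behaviour:
    # children re-import this script rather than forking the parent process.
    # Because the pool is only created inside main(), which is only called
    # from behind the __main__ guard below, the re-import is harmless.
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        return pool.map(square, [1, 2, 3])


if __name__ == "__main__":
    # Entry point runs only in the original process, never in spawned workers.
    print(main())
```

The same structure applies to a training script: move the top-level setup and training-loop calls into a function (or under the guard) so that worker processes importing the module do not re-execute them.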

Confirming that this workaround does work. Wrapping the entire top-level code of pretrain.py under if __name__ == '__main__' resolves the issue.

Thanks. I'll make these changes in the next commit.