torchmd/torchmd-net

cannot set num_workers=0

Closed this issue · 0 comments

I sometimes get errors related to persistent workers + pinned memory, e.g.:

dataloader.py 1290 _get_data
raise RuntimeError('Pin memory thread exited unexpectedly')

RuntimeError:
Pin memory thread exited unexpectedly
[rank: 2] Child process with PID 1662142 terminated with code 1. Forcefully terminating all other processes to avoid zombies 🧟

One workaround is to set num_workers=0.
However, when I try this I get a different error:

dataloader.py 254 __init__
raise ValueError('persistent_workers option needs num_workers > 0')

ValueError:
persistent_workers option needs num_workers > 0

The solution is in:
#322
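
For illustration, here is a minimal sketch of one way to avoid the second error: derive persistent_workers from num_workers instead of hard-coding it to True. This is an assumption about a reasonable fix, not necessarily what #322 does. `loader_kwargs` is a hypothetical helper name, and `dataset` in the usage comment is a placeholder.

```python
def loader_kwargs(num_workers: int, pin_memory: bool = True) -> dict:
    """Build DataLoader keyword arguments that stay valid when num_workers=0.

    Hypothetical helper for illustration; it is not part of torchmd-net or PyTorch.
    """
    if num_workers < 0:
        raise ValueError("num_workers must be non-negative")
    return {
        "num_workers": num_workers,
        "pin_memory": pin_memory,
        # DataLoader raises ValueError when persistent_workers=True
        # and num_workers == 0, so only enable it with worker processes.
        "persistent_workers": num_workers > 0,
    }


# Usage (requires PyTorch; `dataset` is a placeholder):
#   from torch.utils.data import DataLoader
#   loader = DataLoader(dataset, batch_size=32, **loader_kwargs(num_workers=0))
```

With num_workers=0 the data is loaded in the main process, so there are no worker processes to keep alive and persistent_workers has to be False.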