Getting RuntimeError

Question

lancioni opened this issue 4 years ago · 3 comments

When I try to reproduce experiments as stated, I invariably get an error:

RuntimeError: received 0 items of ancdata

The error is somewhat erratic, e.g. in train_zinc_subset.py I get it after 15 to 50 epochs in the first run. It seems to have to do with some kind of memory or other resource saturation that I cannot detect.
Other examples in PyTorch Geometric work just fine, so I think it is something specific to the code in himp-gnn.
I am using Python 3.6.9 under WSL2 Ubuntu 18.04.

Answer 1 · 2020-11-19T09:14:11.000Z

Adding torch.multiprocessing.set_sharing_strategy('file_system') at the top of the file is likely to fix this issue, see here. Not exactly sure what's causing it though.
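
For reference, a minimal sketch of what that line looks like at the top of a training script. The availability check and the final print are illustrative additions, not part of the himp-gnn code; the strategy should be set before any data loader starts its worker processes.

```python
import torch.multiprocessing as mp

# Switch how tensors are shared between processes. Do this at the very
# top of the script, before any DataLoader workers are created.
# The membership check is only a guard for platforms where the
# "file_system" strategy is not offered.
if "file_system" in mp.get_all_sharing_strategies():
    mp.set_sharing_strategy("file_system")

# Confirm which strategy is active.
print(mp.get_sharing_strategy())
```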

Answer 2 · 2020-11-19T10:00:41.000Z

Thank you, it's working now!

Answer 3 · 2020-11-20T13:30:25.000Z

Another runtime error, again erratic, now appears when running train_zinc_full.py (after adding torch.multiprocessing.set_sharing_strategy('file_system')):

RuntimeError: unable to open shared memory object </torch_5535_1638182315> in read-write mode

Some discussions around other PyTorch applications suggest it is a memory issue and advise lowering the number of workers. I have no idea whether that is the cause here, though.
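
A minimal sketch of the "lower the number of workers" suggestion, in case it helps anyone test it. The dataset below is a hypothetical stand-in built from random tensors, not the ZINC data, and it uses the plain torch.utils.data.DataLoader; I am assuming the loaders in the training scripts accept the same num_workers argument. Whether this actually removes the error here is untested.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in dataset: 64 random samples with binary labels.
dataset = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))

# num_workers=0 loads every batch in the main process, so no tensors
# are handed between processes through shared memory. If that is too
# slow, try a small value such as 1 or 2 instead.
loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=0)

# Iterate once to check that loading works without worker processes.
num_batches = sum(1 for _ in loader)
print(num_batches)
```

If the error disappears with num_workers=0 and comes back with larger values, that would at least point to the worker processes' use of shared memory rather than to the model code.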