FileNotFoundError: [Errno 2] No such file or directory
shuferhoo opened this issue · 2 comments
I ran Cart_Pole.py with A3C and A2C on Linux and got the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/data/htx/git_study/agents/actor_critic_agents/A2C.py", line 19, in update_shared_model
    new_grads = gradient_updates_queue.get()
  File "/usr/local/lib/python3.6/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/home/htx/.env/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 284, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/local/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/local/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
Have you solved this problem? I also ran into it in another codebase.
The reason for this seems to be explained in the multiprocessing documentation (https://docs.python.org/3.6/library/multiprocessing.html#pipes-and-queues), to quote:
"Warning As mentioned above, if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread), then that process will not terminate until all buffered items have been flushed to the pipe.
This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic then the parent process may hang on exit when it tries to join all its non-daemonic children.
Note that a queue created using a manager does not have this issue. See Programming guidelines."
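To make the quoted warning concrete, here is a minimal sketch of the "consume everything before joining" pattern with a plain multiprocessing queue. The names worker and collect are hypothetical and not taken from the repository, and the "fork" start method is an assumption based on the issue being reported on Linux.

```python
import multiprocessing as mp

# Assumption: the "fork" start method (the default on Linux for Python 3.6).
ctx = mp.get_context("fork")


def worker(queue, worker_id, n_items):
    """Hypothetical producer: put n_items small payloads on the queue."""
    for i in range(n_items):
        queue.put((worker_id, i))


def collect(n_workers=2, n_items=3):
    """Start the producers, drain the queue, and only then join them."""
    queue = ctx.Queue()
    procs = [
        ctx.Process(target=worker, args=(queue, w, n_items))
        for w in range(n_workers)
    ]
    for p in procs:
        p.start()

    # Consume every expected item BEFORE joining. A child that still has
    # buffered items does not terminate until they are flushed to the pipe.
    results = [queue.get(timeout=10) for _ in range(n_workers * n_items)]

    for p in procs:
        p.join()
    return sorted(results)


if __name__ == "__main__":
    print(collect())
```

With larger payloads, swapping the order (join first, get afterwards) is the situation the warning describes and may hang.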
A potential solution is suggested here:
https://stackoverflow.com/questions/45866698/multiprocessing-processes-wont-join
So putting all of this together, the problem is with this bit:
Change line 26 to:
gradient_updates_queue = multiprocessing.Manager().Queue()
Depending on the complexity and volume of the results queue, it might also be prudent to make the same change to line 25.
This seems to fix the problem for A3C. I haven't tried A2C yet, though.
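For anyone who wants to try the manager-backed queue in isolation, here is a rough, self-contained sketch. It is not the repository's code: worker and run are hypothetical names, plain Python lists stand in for the real gradients, and the "fork" start method is assumed because the issue was reported on Linux.

```python
import multiprocessing as mp

# Assumption: the "fork" start method (the default on Linux for Python 3.6).
ctx = mp.get_context("fork")


def worker(gradient_updates_queue, worker_id):
    """Hypothetical worker: send one fake gradient update to the parent."""
    # Plain lists stand in for real gradient tensors in this sketch.
    fake_grads = [worker_id, worker_id * 2]
    gradient_updates_queue.put({"worker": worker_id, "grads": fake_grads})


def run(n_workers=3):
    """Collect one update per worker through a manager-backed queue."""
    with ctx.Manager() as manager:
        # The queue lives in the manager's server process; the workers and
        # the parent only hold proxies to it.
        gradient_updates_queue = manager.Queue()
        procs = [
            ctx.Process(target=worker, args=(gradient_updates_queue, w))
            for w in range(n_workers)
        ]
        for p in procs:
            p.start()
        # Joining before reading is fine here: the items are already held
        # by the manager process, not buffered inside the workers.
        for p in procs:
            p.join()
        updates = [gradient_updates_queue.get(timeout=10) for _ in range(n_workers)]
    return sorted(update["worker"] for update in updates)


if __name__ == "__main__":
    print(run())
```

The trade-off is that every put and get goes through the manager process, so this is likely slower than a plain multiprocessing queue.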