fastmachinelearning/gw-iaas

`BrokenPipeError` in `hermes.stillwater.process.PipelineProcess`

Closing a `hermes.stillwater.process.PipelineProcess` too quickly raises a `BrokenPipeError`. The cause is the `self.in_q.close()` call in the `__exit__` method, which runs into python/cpython#80025. That issue should be fixed and backported as of python/cpython#31913, but I'm not sure most releases have picked up the fix yet, so for now it might be worth inserting a `time.sleep` before the close, as the linked issue suggests, to avoid the error. Once the fix is widely available, I don't think we'll need to clear the queue manually anymore (if we ever did...).
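As a rough sketch of that workaround (not the actual `hermes` code), the snippet below pauses briefly, clears whatever is left in the queue, and only then closes it. The helper name `drain_and_close` and the `grace` parameter are hypothetical, and the right length for the pause is an assumption that would need tuning.

```python
import multiprocessing as mp
import queue
import time


def drain_and_close(q, grace=0.01):
    """Pause, empty ``q``, then close it.

    ``grace`` is a hypothetical knob: the pause gives the queue's
    feeder thread time to flush buffered items into the pipe before
    the pipe is closed.
    """
    time.sleep(grace)

    # Clear anything still sitting in the queue. Items that the feeder
    # thread has not flushed yet may not show up here.
    drained = []
    while True:
        try:
            drained.append(q.get_nowait())
        except queue.Empty:
            break

    q.close()
    q.join_thread()  # wait for the feeder thread to finish
    return drained
```

Something like this would replace the bare `self.in_q.close()` in `__exit__`. On Python 3.8 and later, any `put` or `get` on the queue after this call raises a `ValueError`.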

Incidentally, we're also not closing the `out_q` of `PipelineProcess` objects during `__exit__`. This is probably related to the asynchronous communication between processes: if we're exiting due to an error, we'd like to raise that error rather than attempt a `put` on a closed queue and get a `ValueError` that we don't know whether to catch (because it's a side effect of shutting down) or raise (because it's a real problem). I think it makes sense to close all the queues and then add some logic around:

  1. Why we're exiting (is there an error, and was it raised by this process?)
  2. Why a `get` or a `put` on a queue might raise a `ValueError` (are we stopped now? Or should we always assume that this is due to another process exiting, and trust that users won't manually close these queues by accident?)
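One possible shape for that logic, as a minimal sketch rather than a proposal for the real class: `QueuePair`, `_stop_event`, and the wrapped `put` below are all hypothetical names. It assumes Python 3.8 or later, where a `put` on a closed `multiprocessing.Queue` raises a `ValueError`.

```python
import multiprocessing as mp
import time


class QueuePair:
    """Hypothetical stand-in for the queue handling in a PipelineProcess."""

    def __init__(self):
        self.in_q = mp.Queue()
        self.out_q = mp.Queue()
        self._stop_event = mp.Event()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Record that we are stopping *before* closing anything, so a
        # later ValueError can be attributed to shutdown.
        self._stop_event.set()
        time.sleep(0.01)  # assumed grace period for the feeder threads
        for q in (self.in_q, self.out_q):
            q.close()
        # Returning False lets an in-flight exception propagate.
        return False

    def put(self, item):
        try:
            self.out_q.put(item)
        except ValueError:
            if self._stop_event.is_set():
                return False  # queue closed because we are stopping
            raise  # queue closed unexpectedly: treat as a real problem
        return True
```

Here a `ValueError` is swallowed only when this object has already been asked to stop; a queue closed by anyone else still surfaces as an error. This answers the second question conservatively and leaves the first one (who raised the error) to the caller, since `__exit__` never suppresses exceptions.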