MWATelescope/manta-ray-client

mwa_client hangs under certain conditions

johnsmorgan opened this issue · 2 comments

I run mwa_client as part of a pipeline in the following way. Each mwa_client instance is responsible for downloading a single obsid. I generate a CSV file containing a single line and run manta-ray as follows:
mwa_client -c [csvfile] -d .
mwa_client should then idle until the job finishes. If the file is processed and downloaded, it returns 0 and my pipeline knows to continue; if it returns non-zero, my pipeline knows that something went wrong.

Unfortunately, under certain circumstances mwa_client does not exit once the download is completed but instead hangs indefinitely. I have only observed this when the client receives a message about another ASVO job of mine: either that the other job has been canceled using the online interface, or that the other job has expired (~1 week after scheduling).

This behaviour is not always observed when such messages come through.
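
To make the expected contract concrete, here is a minimal sketch of how a pipeline step could wrap the call so that a hang shows up as a failure instead of blocking forever. It is not part of mwa_client: `run_mwa_client`, `timeout_s` and `client` are hypothetical names used for illustration, the timeout value is an assumption to be tuned per job, and only the `-c` and `-d` flags from the command above are used.

```python
import subprocess
import sys


def run_mwa_client(csv_path, out_dir=".", timeout_s=None, client=("mwa_client",)):
    """Run one download job.

    Returns the process exit code, or None if timeout_s elapsed first.
    `client` is a hypothetical hook so the command can be swapped out in tests.
    """
    cmd = [*client, "-c", csv_path, "-d", out_dir]
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising TimeoutExpired.
        print(f"{cmd[0]} did not exit within {timeout_s} s", file=sys.stderr)
        return None
```

A pipeline would treat a return value of 0 as success and anything else, including None, as a failure to retry or report. Note that a timeout cannot distinguish a hang from a job that is still legitimately queued, so it would need to be set generously.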

I think I have observed this more often under Python 3 than Python 2, though I'm pretty sure I've seen it under both.

I've had my pipeline running unsupervised for almost a month, and this issue does not seem to have occurred during that time, so I'm closing this for now.