simpeg/pydiso

Release the GIL

mrocklin opened this issue · 5 comments

Hi 👋

I was talking to some Dask users and they've experienced some sadness around SimPEG not releasing the GIL during execution. I think that this is the right repository in which to handle this.

For context, Dask workers often run user code (like SimPEG) in a separate thread, while also doing other stuff, like talking to other Dask workers. However, when user code holds onto the GIL for a long time, it stops Dask workers from being able to talk to each other, which makes all of the other workers a little nervous and sad about their friend who suddenly can't say anything.

Fortunately, fixing this by releasing the GIL is really easy, especially if you're using f2py (which I think you are). All you have to do is add an !f2py threadsafe directive, as described in this StackOverflow question. This essentially tells Python that it can do other things while f2py runs some (potentially very long running) Fortran code in the background. You shouldn't do this if your Fortran code affects Python object state (which it almost certainly doesn't).

This wouldn't just help Dask. It would help any user who wants to use this library in many threads, and is generally a kind and cool thing to do in the scientific Python world these days.

Hey, thanks for reaching out to us here.

In summary, most of the operations of the solvers are in fact not thread safe and we've likely been lazily abusing the GIL to create locks where we should be using threading.Lock. Most of this should happen within simpeg/pydiso.

There is a bit of f2py code here for interacting with MUMPS, but this is a very uncommonly used interface (in fact, we do not distribute with it built). I imagine that properly handling the locking issue would require a Cython interface to MUMPS.
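As a rough illustration of the threading.Lock idea mentioned above, here is a minimal Python sketch. FakeSolver and LockedSolver are hypothetical names made up for this example, not part of SimPEG or pydiso. The fake solver only mimics a handle that breaks when two threads enter it at once.

```python
import threading


class FakeSolver:
    """Hypothetical stand-in for a solver handle that is not thread safe."""

    def __init__(self):
        self._busy = False
        self.calls = 0

    def solve(self, rhs):
        # Raise if another thread is already inside solve().
        if self._busy:
            raise RuntimeError("concurrent access to a non-thread-safe solver")
        self._busy = True
        try:
            result = [2.0 * x for x in rhs]  # placeholder for the real work
            self.calls += 1
            return result
        finally:
            self._busy = False


class LockedSolver:
    """Serializes access with an explicit lock instead of relying on the GIL."""

    def __init__(self, solver):
        self._solver = solver
        self._lock = threading.Lock()

    def solve(self, rhs):
        # Only one thread at a time may touch the underlying solver.
        with self._lock:
            return self._solver.solve(rhs)


def run_threads(n_threads=8, n_calls=200):
    """Hammer one shared solver from several threads; return (calls, errors)."""
    inner = FakeSolver()
    solver = LockedSolver(inner)
    errors = []

    def work():
        try:
            for _ in range(n_calls):
                solver.solve([1.0, 2.0])
        except RuntimeError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return inner.calls, errors
```

With the explicit lock in place, correctness no longer depends on the GIL being held, which is what would make it safe to release the GIL around the expensive native call.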

In summary, most of the operations of the solvers are in fact not thread safe and we've likely been lazily abusing the GIL to create locks where we should be using threading.Lock

Hrm, that's atypical. Is there some example code I can look at in order to learn more? My guess is that, even if there is shared state, perhaps there are large sections where we can release the GIL. It looks like pydiso is mostly using Cython. Is that right? If so, perhaps there are a few strategic locations where we can add the with nogil: context manager to unblock things during larger sections. We could even do something like with lock, nogil: if that would make things feel safer. Holding onto a lock is fine as long as it's not the global interpreter lock 🙂.
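The "hold a lock, but not the GIL" idea can be sketched in plain Python. This is only an illustration under stated assumptions: time.sleep stands in for a long native call (it releases the GIL while waiting, much as a Cython section inside with nogil: would), and long_native_call, run_demo, and the heartbeat thread are hypothetical names invented for the example.

```python
import threading
import time


def long_native_call(seconds):
    # Stand-in for a long solver call. time.sleep releases the GIL while it
    # waits, which is the behavior a nogil section would give native code.
    time.sleep(seconds)


def run_demo(work_seconds=0.2, tick=0.01):
    """Return how many heartbeats ran, including while the solver lock was held."""
    solver_lock = threading.Lock()
    stop = threading.Event()
    ticks = []

    def heartbeat():
        # Plays the role of a worker's communication thread.
        while not stop.is_set():
            ticks.append(time.monotonic())
            time.sleep(tick)

    def solve():
        # The solver lock is held for the whole call, but the GIL is not.
        with solver_lock:
            long_native_call(work_seconds)

    hb = threading.Thread(target=heartbeat)
    worker = threading.Thread(target=solve)
    hb.start()
    worker.start()
    worker.join()
    stop.set()
    hb.join()
    return len(ticks)
```

The heartbeat keeps ticking throughout the call because only the solver-specific lock is held. Other threads that want the same solver would wait, but unrelated threads keep running.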

Yeah, pydiso is using Cython to interface with the Intel MKL PARDISO solver.
I believe this issue is more relevant there and we should probably transfer the issue to that repo.

Sounds good to me. I don't have permissions to do that, but I suspect that you (if you have rights to both repositories) do.

Maybe this would work? #7