bgribble/mfp

Busy patches will eventually lock up

Closed this issue · 2 comments

This is a good one.

I created a pretty large performance patch with a [mix8bus 8] and several of each of the utility patches from the bgribble/mfp-patches repo. It worked fine for about 15 minutes, then locked up.

It turns out, after some probing, that at some point we get a rapid explosion of new threads, caused by a logjam of RPC requests that never complete.

A few findings from the debugging so far:

  • In a working patch of any significant size, the bulk of the RPC traffic appears to be dsp_response objects from the [snap~] objects that feed level meters. There are 18 meters in my performance patch, each updating at about 10 Hz, so roughly 180 requests/sec just to draw meters.

I've found 2 problems so far:

  • A memory leak in RPCHost where Request objects, once added to rpc_host.pending, are never removed. That's pretty bad and would eventually kill any long-running program, but I don't think it's the immediate problem (see the sketch after this list).

  • A race (I believe) in Processor._send which can cause a single processor to deadlock. This cascades to lock up the whole patch, since every incoming dsp_response is handled by a separate worker thread, so the entire pool ends up blocked waiting on the one stuck processor.
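
For reference, here's roughly the shape of the fix for the pending-request leak. This is a minimal sketch, not mfp's actual code; the names (Request, RPCHost.pending, handle_response, request_id) are assumptions about the internals. The key point is that the Request has to be popped out of pending once its response arrives:

    import threading

    class Request:
        # Hypothetical stand-in for mfp's Request, just enough to show the leak.
        def __init__(self, request_id, payload):
            self.request_id = request_id
            self.payload = payload
            self.response = None
            self.done = threading.Event()

    class RPCHost:
        # Sketch of the pending-request bookkeeping only.
        def __init__(self):
            self.pending = {}            # request_id -> Request
            self.lock = threading.Lock()

        def submit(self, req):
            with self.lock:
                self.pending[req.request_id] = req

        def handle_response(self, request_id, response):
            # The leak: without this pop(), every completed Request stays
            # in self.pending for the life of the process.
            with self.lock:
                req = self.pending.pop(request_id, None)
            if req is not None:
                req.response = response
                req.done.set()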

There were a couple of races happening. The killer was in the Request constructor, where a sequential ID was assigned to each request but the read/increment of the shared counter was not reentrant :(
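
The fix for that one is just to serialize the read/increment of the shared counter. A minimal sketch of the bug and the corrected version, with hypothetical names rather than the real Request constructor:

    import threading

    class Request:
        _next_id = 0
        _id_lock = threading.Lock()

        def __init__(self, payload):
            # Buggy version (roughly what was there):
            #   self.request_id = Request._next_id
            #   Request._next_id += 1
            # Two threads can read the same counter value and get duplicate
            # IDs, so one request's response "completes" the wrong Request
            # and the real waiter blocks forever.
            with Request._id_lock:
                self.request_id = Request._next_id
                Request._next_id += 1
            self.payload = payload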

There's still a small memory leak somewhere in mfpdsp, but this commit fixes the problems described in this ticket.