sourcegraph/conc

Do we want a channel-based parallel processor?

sudhirj opened this issue · 2 comments

I've written https://github.com/sudhirj/cirque

It basically takes a processing function and gives you an input and an output channel. Inputs sent to the input channel are processed in parallel, and the results are sent in order on the output channel. Very similar to Stream, but built on a channel API.
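For readers who want the shape spelled out, here is a minimal sketch of a processor like the one described. It is not cirque's or conc's actual code: `orderedParallel`, its `workers` parameter, and the internal `pending` queue are all hypothetical names chosen for illustration, and the concurrency limit is only approximate.

```go
package main

import "fmt"

// orderedParallel is a hypothetical helper (an assumption for illustration,
// not the cirque or conc API). It returns an input and an output channel;
// fn runs on inputs in parallel and results come out in input order.
func orderedParallel[I, O any](workers int, fn func(I) O) (chan<- I, <-chan O) {
	in := make(chan I)
	out := make(chan O)
	// pending holds one single-use result slot per input, in arrival order.
	// Its capacity roughly bounds how many inputs are in flight at once.
	pending := make(chan chan O, workers)

	// Dispatcher: queue a result slot for each input, then compute into it.
	go func() {
		defer close(pending)
		for v := range in {
			slot := make(chan O, 1)
			pending <- slot
			go func(v I) { slot <- fn(v) }(v)
		}
	}()

	// Emitter: read the slots in arrival order, so output order matches input order.
	go func() {
		defer close(out)
		for slot := range pending {
			out <- <-slot
		}
	}()

	return in, out
}

func main() {
	in, out := orderedParallel(4, func(n int) int { return n * n })

	// The caller has to feed the input from a separate goroutine
	// and must remember to close it.
	go func() {
		defer close(in)
		for i := 1; i <= 5; i++ {
			in <- i
		}
	}()

	for r := range out {
		fmt.Println(r) // prints 1, 4, 9, 16, 25 on separate lines
	}
}
```

Note that the sketch only terminates cleanly if the caller closes the input channel and drains the output channel.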

If we want to roll this into conc and base it on Stream, I can raise a PR.

I generally shy away from using channels in a public API because they're easy to misuse. In the case of conc, I think it would be difficult to provide a channel-based API that can still maintain the same guarantees about scoping and cleanup.

Let's take the API of cirque as an example, where both an input and an output channel are returned:

  • How do we propagate panics? The only way we know the stream has completed is by waiting for the output channel to be closed.
  • How do we avoid requiring the user to spawn a separate goroutine to write to the input channel?
  • How do we guarantee that the input channel will always be closed? If the calling goroutine panics and the user didn't defer close(input), the goroutines in the stream will block forever and leak.
  • What if the caller returns early without consuming all messages from the output channel? Will the spawned goroutines block forever?
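To make the contrast concrete, here is a rough sketch of a scoped, callback-based shape that sidesteps the questions above. It is not conc's implementation: `mapOrdered` and its parameters are hypothetical, it takes a slice rather than a stream, it does not bound concurrency, and it emits results only after all work has finished. The point is only that the function owns every goroutine it starts, does not return until they have exited, and re-raises a worker panic in the caller.

```go
package main

import (
	"fmt"
	"sync"
)

// mapOrdered is a hypothetical scoped helper (an assumption for illustration,
// not conc's API). It runs fn over inputs in parallel, then calls emit with
// each result in input order. No channel is exposed, so there is nothing for
// the caller to forget to close or drain.
func mapOrdered[I, O any](inputs []I, fn func(I) O, emit func(O)) {
	results := make([]O, len(inputs))
	var (
		wg       sync.WaitGroup
		once     sync.Once
		panicked any // first panic value recovered from a worker, if any
	)

	for i, v := range inputs {
		wg.Add(1)
		go func(i int, v I) {
			defer wg.Done()
			defer func() {
				// Capture a worker panic instead of crashing the process.
				if r := recover(); r != nil {
					once.Do(func() { panicked = r })
				}
			}()
			results[i] = fn(v)
		}(i, v)
	}

	// Every goroutine has exited once Wait returns, so nothing can leak.
	wg.Wait()

	// Propagate a worker panic to the caller's goroutine.
	if panicked != nil {
		panic(panicked)
	}
	for _, r := range results {
		emit(r)
	}
}

func main() {
	mapOrdered([]int{1, 2, 3}, func(n int) int { return n * n }, func(r int) {
		fmt.Println(r) // prints 1, 4, 9 on separate lines
	})
}
```

The trade-off is flexibility: the caller can no longer feed inputs incrementally or consume results with `select`, which is what a channel API offers.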

I'd like to keep the API of conc as hard to misuse as possible, and channels make that difficult. That said, even if difficult, it might not be impossible, so I'm willing to entertain proposals 🙂

That sounds reasonable, yeah, I don't think a channel API fits those constraints.