sbt/sbt-bintray

Plugin creates an unreasonable amount of "New I/O worker #NN" threads

retronym opened this issue · 6 comments

Discussion continuing from sbt/sbt#3962

sbt-bintray uses dispatch, which uses AsyncHttpClient, which uses Netty. The combination of usage pattern and defaults within this stack seems to result in a large number of backgrounds threads (~150 in the sample build I looked at sbt/zinc) laying in wait for some HTTP traffic to mediate. I

This is undesirable on a few counts: each of these threads consumes a little memory for the stack and they make it a little harder to scroll through thread dumps of SBT.

My expectation is that the plugin should not consume resources until it first does some work, at which point it should only create a number of threads in the range of 0-2x the CPU count. Ideally, these threads would be removed from the pool after a period of inactivity.

Okay, this seems likely to come down to a semantic change in s.c.concurrent.TrieMap between 2.10 and 2.12, introduced in scala/scala#4319. The thunk passed to getOrElseUpdate can be called more than under concurrent calls, which seems to be exactly what happens.

We can add some manual synchronization around the call to cachedRepo.getOrElseUpdate to avoid this problem. That will bring us down the default of 16 threads (which is pretty high IMO but much better than 153!)

For sbt-pgp, I fixed the problem by migrating to Gigahorse - sbt/sbt-pgp@11963a2

Let me know if there's anything I can do from the Dispatch side. I've seen this kind of issue before, but we don't really layer much special on top of AHC, so I suspect the root bug is there. I can't see what version of Dispatch you're using, but our latest is 0.14 and AHC has been doing a lot of house cleaning. You might also get some boost by upgrading.

The default behaviour in dispatch is to create a new AHC client for val http = Http(). That creates 2x$numcpus threads until http.shutdown() is called.

The obvious thing a client might do here is share a Http instance across the application. The current docs for Dispatch didn't tell me if this Http is thread safe or not. I know it wasn't in dispatch-classic, but it looks plausibly thread safe now. AHC itself threadsafe under the proviso that the user take responsibility for not using a non-threadsafe instance of a callback function across threads. I had trouble following through the code in dispatch to see if this was a problem or a not.

So the action is to determine and document thread-safety, and at the same time document the lifecycle of the threads and the need to call shutdown(). Side node: It's a bit non-standard to call that method shutdown rather than implementing Closable.

The other awkward part about the dispatch API is in customizing the configuration of AHC while retaining any defaults that dispatch starts with. AFAICT you can do only do with:

val customHttp = {
  val prototypeHttp = Http()
  try {
    http.configure(builder => builder.setIOThreadMultiplier(1))
  } finally prototypeHttp.shutdown()
}

That's clunky and wasteful (it creates threads just to destroy them)

Perhaps it makes sense to expose InternalDefaults.client (currently only publicly available as the default argument of Http) as Http.defaultClient.

I'll reply to each point to the best of my ability without having the code open in front of me this moment...

The default behaviour in dispatch is to create a new AHC client for val http = Http(). That creates 2x$numcpus threads until http.shutdown() is called.

The obvious thing a client might do here is share a Http instance across the application. The current docs for Dispatch didn't tell me if this Http is thread safe or not. I know it wasn't in dispatch-classic, but it looks plausibly thread safe now. AHC itself threadsafe under the proviso that the user take responsibility for not using a non-threadsafe instance of a callback function across threads. I had trouble following through the code in dispatch to see if this was a problem or a not.

So the action is to determine and document thread-safety, and at the same time document the lifecycle of the threads and the need to call shutdown(). Side node: It's a bit non-standard to call that method shutdown rather than implementing Closable.

I can do some digging to determine if we permit this. I suspect this behavior is just a carry-over from the dispatch-classic days, but I will admit that it does make me a bit leery for two Http instances to share some hidden state between them... at the very least, though, there's a good case for making the thread behavior of a particular instance easier to tweak.

The other awkward part about the dispatch API is in customizing the configuration of AHC while retaining any defaults that dispatch starts with.

I fixed this wart a few months ago in 0.13.x. Once you upgrade to that version or higher, Http.withConfiguration will do what you want. It'll pass in the default builder and you can tweak it to your heart's content before we spin up the actual Http instance.