After IO.blocking(...), execution stays on blocking thread pool
THeinemann opened this issue · 3 comments
Hi cats-effect team,
while debugging a program, I found that when some IOs are executed on the blocking thread pool, the IOs that follow them in the chain (via flatMap) are also executed on the blocking thread pool.
For example:
```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import scala.concurrent.duration.DurationInt

{
  for {
    _ <- IO {
      println((0, Thread.currentThread()))
    }
    _ <- IO.blocking {
      println((1, Thread.currentThread()))
    }
    _ <- IO {
      println((2, Thread.currentThread()))
    }
    _ <- IO.sleep(2.millis)
    _ <- IO {
      println((3, Thread.currentThread()))
    }
  } yield ()
}.unsafeRunSync()
```
Prints out this for me:
```
(0,Thread[io-compute-2,5,main])
(1,Thread[io-compute-blocker-2,5,main])
(2,Thread[io-compute-blocker-2,5,main])
(3,Thread[io-compute-5,5,main])
```
I would have expected the third line (numbered 2) to also print a thread name from the compute pool (i.e., one without blocker in its name).
This seems unexpected to me, especially considering the example in https://typelevel.org/cats-effect/docs/thread-model#blocking (which prints "current pool" after the blocking execution has finished, implying that execution switches back to the previous pool).
I tested it with cats-effect 3.5.3 and 3.5.4.
Note that when using `interruptible` instead of `blocking`, the example works as I expected.
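For comparison, here is a minimal sketch (assuming cats-effect 3.5.x and its default runtime) of the `interruptible` variant, capturing the thread name inside the blocking region and immediately after it:

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global

// Sketch (cats-effect 3.5.x default runtime assumed): record which
// thread the interruptible region runs on, and which thread the
// following IO stage runs on.
val program =
  for {
    inside <- IO.interruptible(Thread.currentThread().getName)
    after  <- IO(Thread.currentThread().getName)
  } yield (inside, after)

val (inside, after) = program.unsafeRunSync()
println(s"inside: $inside") // a blocker thread
println(s"after:  $after")
```

In my testing, `after` is a compute-pool thread (no blocker in its name), unlike the `blocking` case above.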
I think it works as intended. Sometimes the fiber doesn't switch thread pools, for optimization reasons: https://typelevel.org/cats-effect/docs/faq#why-is-my-io-running-on-a-blocking-thread
This is indeed working as intended. :) Basically there's a tradeoff here: if we proactively and immediately shift back from the blocker to the compute worker, we can guarantee you never see compute work on the blocking subpool, but that cost might be paid in vain if you just go right back to blocking again (which commonly happens). So the pool is tuned to optimize this common case of repeated blocking actions (separated by flatMaps and a few maps) by leaving the work on the blocker. This does result in some contention in the kernel-level scheduler, but practical tests suggest the cost of that contention is lower than the savings of avoiding the unnecessary context shift (and subsequent shift back).
In the worst-case scenario, where you have a single blocking action and then don't block again for a while, the pool will shift the fiber back at the latest when you hit the auto-cede boundary (by default, once every 1024 IO stages), but it's likely you'll be shifted back before then by hitting some sort of asynchronous suspension (like the sleep in your example).
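To illustrate the auto-cede boundary (a sketch on my part; 1024 stages is the documented default), chaining well over 1024 trivial stages after the blocking action should migrate the fiber back even with no `sleep` or other asynchronous suspension in between:

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global

// Sketch: 4096 trivial flatMap stages, comfortably past the default
// auto-cede boundary of 1024 stages.
val manyStages: IO[Unit] =
  (1 to 4096).foldLeft(IO.unit)((io, _) => io.flatMap(_ => IO.unit))

val program =
  for {
    onBlocker <- IO.blocking(Thread.currentThread().getName)
    _         <- manyStages
    later     <- IO(Thread.currentThread().getName)
  } yield (onBlocker, later)

val (onBlocker, later) = program.unsafeRunSync()
println(s"blocking ran on:   $onBlocker")
println(s"4096 stages later: $later")
```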
Thanks for the explanations!