ruby-concurrency/concurrent-ruby

Remove configuration of global pools

pitr-ch opened this issue · 10 comments

  • keep them lazy initialized
  • rename: task -> short-non-blocking, operation -> long-blocking
  • add more global executors like an immediate executor
  • remove configuration object, move global pools to Concurrent space
  • possibly improve parsing of :executor option
  • to also handle agent two pools
  • to understand :immediate too
  • etc.
  • what to do with logging?

@chrisseaton @jdantonio, I've tried to capture the result of the discussion.

I like the renames. I think task / operation could also be the other way around, so the new names are better.

I don't know anything about logging, but a general hook mechanism might come in very handy in the future. Perhaps abstract hooking from logging. Then we could have things like concurrency profiling.

@chrisseaton there is simple general logging added, I needed something for actors. It's just a lambda taking level, progname, message, &block so it can be hooked to any logging lib. See https://github.com/ruby-concurrency/concurrent-ruby/blob/master/lib/concurrent/logging.rb.

What type of hooks do you have in mind?

Something that gets a notification when an actor is created, when a message is sent, when it's processed. That sort of thing. But I don't really know much about actor models so it was just an off-hand remark really.

It was a good remark I am logging level very similar events on DEBUG level. It would be better to make it general not just for logger as you suggested. TODO made.

  • 👍 to the suggested names for the global thread pools.
  • I'm not opposed to adding more global executors, but I wonder how often they would get used. If a user needs an executor with different characteristics they can create one and pass it into the appropriate abstraction. Is having a global ImmediateExecutor that much more convenient than creating on at runtime?
  • Similar to the renaming, I'm fine moving the global thread pools up to the Concurrent namespace.
  • I'm happy to add more robust parsing of the :executor option. That's why I created the parsing function in the first place. As a Rubyist I love simple-yet-expressive options like :immediate.
  • Logging is something that I am always wary of. Global logging capabilities tend to spread through the application like a virus and create hidden side effects (Rails being the perfect example). The approach taken for Actor is nice because it is very unintrusive. I'm not sure that integrated logging makes as much sense for many of the simpler abstractions, however.

I mentioned ImmediateExecutor because it is a good candidate since all of it's instances are same it's basically singleton. We can safe few GC cycles (I know it's a micro optimization). As you suggest it probably does not make sense for any other executor.

We could also rename all pools to executor for consistency as well.

logging: agree, fyi I've added few debug log calls also to other parts of the gem to log otherwise ignored exceptions. (Look for log method calls.)

We should determine what configuration should be chosen for global pools.

Currently we have:

    def new_task_pool
      Concurrent::ThreadPoolExecutor.new(
          min_threads:     [2, Concurrent.processor_count].max,
          max_threads:     [20, Concurrent.processor_count * 15].max,
          idletime:        2 * 60, # 2 minutes
          max_queue:       0, # unlimited
          overflow_policy: :abort # raise an exception
      )
    end

    def new_operation_pool
      Concurrent::ThreadPoolExecutor.new(
          min_threads:     [2, Concurrent.processor_count].max,
          max_threads:     [2, Concurrent.processor_count].max,
          idletime:        10 * 60, # 10 minutes
          max_queue:       [20, Concurrent.processor_count * 15].max,
          overflow_policy: :abort # raise an exception
      )
    end

Since the operation_pool is intended for longer and blocking tasks it may deadlock easily, e.g. when all threads are waiting for an event scheduled for execution in queue and there is no free thread. I would increase the max_threads of the operation pool. What do you think and what is your experience?

I am thinking about following for the new configuration:

module Concurrent

  # executors do not allocate the threads immediately so they can be constants
  # all thread pools are configured to never reject the job
  # TODO optional auto termination
  module Executors

    IMMEDIATE_EXECUTOR = ImmediateExecutor.new

    # Only non-blocking and short tasks can go into this pool, otherwise it can starve or deadlock
    FAST_EXECUTOR      = Concurrent::FixedThreadPool.new(
        [2, Concurrent.processor_count].max,
        idletime:  60, # 1 minute same as Java pool default
        max_queue: 0 # unlimited
    )

    # IO and blocking jobs should be executed on this pool
    IO_EXECUTOR        = Concurrent::ThreadPoolExecutor.new(
        min_threads: [2, Concurrent.processor_count].max,
        max_threads: Concurrent.processor_count * 100,
        idletime:    60, # 1 minute same as Java pool default
        max_queue:   0 # unlimited
    )

    def executor(which)
      case which
      when :immediate, :immediately
        IMMEDIATE_EXECUTOR
      when :fast
        FAST_EXECUTOR
      when :io
        IO_EXECUTOR
      when Executor
        which
      else
        raise TypeError
      end
    end
  end

  extend Executors
end

What do you think?

Thank you @pitr-ch for this. The original configuration was very arbitrary based on a few internet searches I did. It was not based on actual use of the gem. Now that we're growing a community of users I agree that we should evolve the global configuration to support them better. I am comfortable with this configuration.

Addressed in PR #255.