ruby-concurrency/concurrent-ruby

Memory leak in Concurrent::Future

leoarnold opened this issue · 2 comments

* Operating system:                linux
* Ruby implementation:             MRI 2.6.6, 2.7.4, 3.1.2
* `concurrent-ruby` version:       1.1.10
* `concurrent-ruby-ext` installed: no
* `concurrent-ruby-edge` used:     no

When passing a block to `Concurrent::Future.execute` and waiting for the result, the Future itself can be garbage collected, but the block's return value seems to stay in memory forever, as the following script demonstrates:

# frozen_string_literal: true

require 'bundler/inline'

gemfile(true) do
  source 'https://rubygems.org'

  gem 'concurrent-ruby', '1.1.10'
end

class Thing; end

def block_while_things_in_memory(title)
  i = 0

  until ObjectSpace.each_object(Thing).count == 0
    i += 1
    puts "#{title} - #{i}. GC - #{ObjectSpace.each_object(Thing).count} Things, #{ObjectSpace.each_object(Concurrent::Future).count} Futures"
    GC.start
  end

  puts "#{title}: No more Things in memory"
end

block_while_things_in_memory('Initial cleanup')

Concurrent::Future.execute do
  Thing.new
  nil
end.wait

block_while_things_in_memory('When block returns nil')

Concurrent::Future.execute do
  Thing.new
end.wait

block_while_things_in_memory('When block returns a Thing')

When running the script on several different versions of MRI, I always get something like this never-ending output:

Fetching gem metadata from https://rubygems.org/..
Resolving dependencies...
Using bundler 2.3.7
Using concurrent-ruby 1.1.10
Initial cleanup: No more Things in memory
When block returns nil - 1. GC - 1 Things, 1 Futures
When block returns nil: No more Things in memory
When block returns a Thing - 1. GC - 1 Things, 1 Futures
When block returns a Thing - 2. GC - 1 Things, 0 Futures
When block returns a Thing - 3. GC - 1 Things, 0 Futures
When block returns a Thing - 4. GC - 1 Things, 0 Futures
When block returns a Thing - 5. GC - 1 Things, 0 Futures
When block returns a Thing - 6. GC - 1 Things, 0 Futures
When block returns a Thing - 7. GC - 1 Things, 0 Futures
When block returns a Thing - 8. GC - 1 Things, 0 Futures
When block returns a Thing - 9. GC - 1 Things, 0 Futures
When block returns a Thing - 10. GC - 1 Things, 0 Futures
When block returns a Thing - 11. GC - 1 Things, 0 Futures
When block returns a Thing - 12. GC - 1 Things, 0 Futures
When block returns a Thing - 13. GC - 1 Things, 0 Futures
When block returns a Thing - 14. GC - 1 Things, 0 Futures
When block returns a Thing - 15. GC - 1 Things, 0 Futures
When block returns a Thing - 16. GC - 1 Things, 0 Futures
When block returns a Thing - 17. GC - 1 Things, 0 Futures
When block returns a Thing - 18. GC - 1 Things, 0 Futures
When block returns a Thing - 19. GC - 1 Things, 0 Futures
When block returns a Thing - 20. GC - 1 Things, 0 Futures
When block returns a Thing - 21. GC - 1 Things, 0 Futures
When block returns a Thing - 22. GC - 1 Things, 0 Futures
When block returns a Thing - 23. GC - 1 Things, 0 Futures
When block returns a Thing - 24. GC - 1 Things, 0 Futures
When block returns a Thing - 25. GC - 1 Things, 0 Futures
When block returns a Thing - 26. GC - 1 Things, 0 Futures
When block returns a Thing - 27. GC - 1 Things, 0 Futures
When block returns a Thing - 28. GC - 1 Things, 0 Futures
When block returns a Thing - 29. GC - 1 Things, 0 Futures
...

It turns out that much more than just the return value of the block is leaking:

# frozen_string_literal: true

require 'bundler/inline'

gemfile(true) do
  source 'https://rubygems.org'

  gem 'concurrent-ruby', '1.1.10'
  gem 'memory_profiler', '~> 1'
end

class Thing; end

def report(title, &block)
  puts title

  pp MemoryProfiler.report(&block).retained_memory_by_class
end


report('Warmup') do
  Concurrent::Future.execute { Thing.new }.wait
end

report('When waiting for the Future') do
  Concurrent::Future.execute { Thing.new }.wait
end

report('When waiting for the Future and actively dereferencing it') do
  x = Concurrent::Future.execute { Thing.new }.wait
  x = nil
end

This yields the output:

Fetching gem metadata from https://rubygems.org/..
Resolving dependencies...
Using bundler 2.3.19
Using memory_profiler 1.0.0
Using concurrent-ruby 1.1.10
Warmup
[{:data=>"Thread", :count=>1048992},
 {:data=>"Array", :count=>240},
 {:data=>"Concurrent::CachedThreadPool", :count=>216},
 {:data=>"Thread::Mutex", :count=>216},
 {:data=>"Thread::ConditionVariable", :count=>192},
 {:data=>"Concurrent::Event", :count=>144},
 {:data=>"Proc", :count=>80},
 {:data=>"String", :count=>80},
 {:data=>"Thread::Queue", :count=>76},
 {:data=>"Concurrent::RubyThreadPoolExecutor::Worker", :count=>40},
 {:data=>"Thing", :count=>40}]
When waiting for the Future
[{:data=>"Array", :count=>80}, {:data=>"Thing", :count=>40}]
When waiting for the Future and actively dereferencing it
[{:data=>"Array", :count=>80}, {:data=>"Thing", :count=>40}]

Maybe this "leaking" of the Array objects is intentional, i.e. a pool of sub-threads is kept in a thread-local variable, which is then garbage collected only when the parent thread is garbage collected 🤔
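
One way to narrow this down might be to ask the VM which live objects still reference the leftover `Thing`. The following is only a minimal sketch using the standard `objspace` extension. `find_referrers` is a hypothetical helper written for this issue, not part of concurrent-ruby, and the `leftover`/`holder` pair merely stands in for whatever actually retains the object:

```ruby
require 'objspace'

class Thing; end

# Hypothetical helper (not part of concurrent-ruby): returns every live
# heap object that directly references +target+.
def find_referrers(target)
  # Snapshot the heap first, so temporary arrays created while scanning
  # are not themselves reported as referrers.
  candidates = ObjectSpace.each_object.to_a

  candidates.select do |candidate|
    # Skip the target itself and the snapshot array (which references everything).
    next false if candidate.equal?(target) || candidate.equal?(candidates)

    reachable = ObjectSpace.reachable_objects_from(candidate) || []
    reachable.any? { |ref| target.equal?(ref) }
  end
end

# Stand-in for the real situation: an Array keeps a Thing alive.
leftover = Thing.new
holder = [leftover]

find_referrers(leftover).each { |ref| puts ref.class }
```

In the reproduction script, the same helper could be called on `ObjectSpace.each_object(Thing).first` after the Future has been collected. References that exist only in VM-internal roots may not show up as ordinary objects, so an empty result would not prove that nothing holds the `Thing`.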