rtomayko/tilt

Segmentation Fault during rails assets:precompile

Closed this issue · 13 comments

This past week I've been noticing random segmentation faults during the assets:precompile stage of my Heroku deploys. I don't yet know the underlying cause, but I know that if I pin tilt at 1.3.3, the problem goes away. I couldn't find anyone else having the same problem; then I realized that 1.3.4 was released just a couple of weeks ago.

I'm using ruby 2.0.0p0, and a few asset gems:

group :assets do
  gem 'jquery-rails'
  gem 'sass-rails'
  gem 'font-awesome-sass-rails' 
  gem 'bootstrap-sass', '~> 2.3'
  gem 'bourbon'
  gem 'neat'
  gem 'uglifier'
  gem 'coffee-script'
end

I get random segmentation faults during rake assets:precompile that look something like this:

/Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/uglifier-1.3.0/lib/uglifier.rb:65: [BUG] Segmentation fault
ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-darwin12.2.0]

c:0050 p:0055 s:0230 e:000225 METHOD /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/uglifier-1.3.0/lib/uglifier.rb:65
c:0049 p:0011 s:0221 e:000220 METHOD /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/actionpack-3.2.12/lib/sprockets/compressors.rb:74
c:0048 p:0010 s:0217 e:000216 BLOCK  /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/sprockets-2.2.2/lib/sprockets/processing.rb:265 [FINISH]
c:0047 p:---- s:0213 e:000212 CFUNC  :call
c:0046 p:0016 s:0208 e:000207 METHOD /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/sprockets-2.2.2/lib/sprockets/processor.rb:29
c:0045 p:0034 s:0203 e:000202 METHOD /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/tilt-1.3.4/lib/tilt/template.rb:77
c:0044 p:0025 s:0197 E:000c50 BLOCK  /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/sprockets-2.2.2/lib/sprockets/context.rb:193 [FINISH]

I know this isn't a very helpful issue report, but I'm hoping more people will run into this problem and find this issue.

This is very interesting.

We got a segfault on Travis on 2.0.0p0 here: https://travis-ci.org/rtomayko/tilt/jobs/5479138

But a few minutes later (after I pushed a CHANGELOG change) it was green: https://travis-ci.org/rtomayko/tilt/jobs/5479184

uglifier.rb:65 doesn't make me any wiser either.

@edlebert Could you dump the rest of the report (with the C level backtrace)?

The reason this could have started in Tilt 1.3.4 is that we now always compile templates to a method. In 1.3.3 we did a plain instance_eval on the first render and only invoked the compiled method on subsequent renderings, while in 1.3.4 we never instance_eval. I'd guess the segfault can be triggered in 1.3.3 as well (e.g. it may be possible to hit it while your Sinatra app is running).
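To make the difference concrete, here is a minimal sketch of the two evaluation strategies described above. It is a simplified illustration, not Tilt's actual code: `render_with_instance_eval`, `render_with_compiled_method`, and the method name `__sketch_template` are hypothetical names chosen for this example.

```ruby
# Simplified illustration only; Tilt's real implementation differs.

# Strategy described for the first render in 1.3.3:
# evaluate the generated Ruby source directly against the scope object.
def render_with_instance_eval(scope, source)
  scope.instance_eval(source)
end

# Strategy described for 1.3.4: define a method on Object, keep it as an
# UnboundMethod, remove it from Object again, then bind it to the scope.
def render_with_compiled_method(scope, source)
  Object.class_eval("def __sketch_template; #{source} end")
  unbound = Object.instance_method(:__sketch_template)
  Object.class_eval { remove_method(:__sketch_template) }
  unbound.bind(scope).call
end

# Usage: `source` stands in for the Ruby code a template engine would generate.
scope  = Struct.new(:name).new("world")
source = '"Hello #{name}"'
puts render_with_instance_eval(scope, source)   # prints "Hello world"
puts render_with_compiled_method(scope, source) # prints "Hello world"
```

Both paths produce the same string; the second one goes through method definition and removal on every compile, which is the part that changed between the two releases.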

I've opened an issue over at bugs.ruby-lang.org related to the other segfault: http://bugs.ruby-lang.org/issues/8100

I did a bunch of trial and error on this and found that it only happened if I did some file IO during JavaScript precompiling in Rails. Specifically, I was exporting a bunch of ActiveSupport::TimeZone data to JavaScript. I found that even if I had a js.erb file that contained only this line (note that no JavaScript code is actually being generated here), I would get the segmentation fault during rake assets:precompile:

<% ActiveSupport::TimeZone.all %>

As a workaround, I had a hunch that if I pre-loaded the timezone data into memory before the asset precompile, I wouldn't get a segfault. So I created a rails initializer that simply pre-reads all the timezone data:

ActiveSupport::TimeZone.all

And presto, no more segmentation faults :)

rkh commented

Might be related to the segfault we sometimes see in sinatra-contrib's content_for implementation on 2.0. @zzak is investigating.

Below is the smallest and most isolated script I've so far been able to produce that suffers from the segfault. I've found that as long as Tilt is rendering anything, it can segfault. I've observed that the more you do during rendering (e.g. loading files), the higher the odds of a segfault.

require 'tilt'
run(proc do
  body = Tilt['str'].new{'Hello world #{test = ["1", "2"] + ["3"] }'}.render
  [200, {}, [body]]
end)

Segfaults are very rare under this test, though, due to the simplicity of the script; very little happens while rendering. Out of 1,162,202 HTTP requests at ~500 requests per second, I got 37 segfaults. I've done the same test at 1,003,356 requests without Tilt, just using ERB, and didn't get any segfaults. Only a few timeouts.

So it does indeed seem to be related to something Tilt is doing. It also appears to be time bound. If I throw 40 threads at the problem, even though I can push through more requests per unit of time, it seems to give the same segfault rate. Weird. I'm not sure how to proceed narrowing it down further. I could spend hours stripping down Tilt, but it may not reveal anything.
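For a sense of scale, the reported figures can be turned into a rate. This is only back-of-the-envelope arithmetic on the numbers quoted above, assuming the ~500 requests per second held for the whole run; `segfault_stats` is a hypothetical helper name.

```ruby
# Derives rough rates from the figures reported in the comment above.
# The throughput is approximate, so treat the results as estimates.
def segfault_stats(requests, segfaults, requests_per_second)
  requests_per_segfault = requests.to_f / segfaults
  {
    requests_per_segfault: requests_per_segfault,
    seconds_per_segfault: requests_per_segfault / requests_per_second,
    total_minutes: requests / requests_per_second.to_f / 60
  }
end

stats = segfault_stats(1_162_202, 37, 500)
puts format('about one segfault per %d requests', stats[:requests_per_segfault].round)
puts format('about one segfault every %.0f seconds', stats[:seconds_per_segfault])
puts format('run length of roughly %.0f minutes', stats[:total_minutes])
```

Under that assumption the run works out to roughly one segfault per 31,000 requests, or about one per minute of load.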

How are you running the test? Can you show us the config.ru? What server do you use?

Magnus Holm

(Emailed reply to Tom Wardrop's comment above, dated Fri, Mar 22, 2013 at 7:28 AM.)

I've been able to reduce it to this code which segfaults in 2.0.0-p0. Testing trunk now.

class Fail
  def render(scope = Object.new)
    compiled_method.bind(scope).call
  end

  def compiled_method
    @compiled_method ||= compile_template_method
  end

  def source
    "Hello world".inspect
  end

  def compile_template_method
    method_name = "__tilt_#{Thread.current.object_id.abs}"
    Object.class_eval("def #{method_name}; #{source} end")
    unbind_compiled_method(method_name)
  end

  def unbind_compiled_method(method_name)
    method = Object.instance_method(method_name)
    Object.class_eval { remove_method(method_name) }
    method
  end
end

loop do
  Fail.new.render
end
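Because the loop above never terminates on an interpreter that does not crash, one way to test a candidate Ruby build is to run a bounded version in a child process and look at how the child exits. The sketch below is an assumption about how one might do that, not something used in this thread; `run_in_child` is a hypothetical helper. It relies on the child being killed by a signal when the interpreter aborts, which is typically what happens after a `[BUG]` report.

```ruby
require 'rbconfig'

# Runs a Ruby snippet in a separate interpreter and classifies how it ended.
# Hypothetical helper for illustration; not part of Tilt.
def run_in_child(code)
  pid = Process.spawn(RbConfig.ruby, '-e', code,
                      out: File::NULL, err: File::NULL)
  _, status = Process.wait2(pid)
  return :signaled if status.signaled? # killed by a signal, e.g. after a crash
  status.success? ? :ok : :failed      # normal exit, zero or non-zero
end

# Usage with a trivial snippet; a bounded repro (for example the loop above
# rewritten as `100_000.times { Fail.new.render }`) would go here instead.
puts run_in_child('exit 0')
```

Running the child several times and counting `:signaled` results gives a crude crash rate without taking down the process that drives the test.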

@judofyr that code block is the config.ru :)

I did intend to post the test script though, sorry about that. Here it is:

require 'peach'
require 'net/http'

results = {}
10_000_000.times.peach(12) do
  key = begin
    Net::HTTP.get_response(URI('http://localhost:3000/')).code
  rescue => e
    e.class
  end
  results[key] ||= 0
  results[key] += 1
  STDOUT.write "\r#{results.inspect}"
end
puts

The peach gem provides the peach method which parallelises the requests; 12 parallel threads are allowed in this case.
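For readers without the gem, the same idea can be sketched with the standard library only. This is not how peach is implemented; `parallel_tally` is a hypothetical helper that runs a block over the items on a fixed number of threads and counts the results under a mutex, so concurrent updates to the hash do not race.

```ruby
# Hypothetical stand-in for illustration; not the peach gem's implementation.
# Items must not be nil or false, since a falsy pop ends a worker's loop.
def parallel_tally(items, thread_count)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # pop returns nil once the queue is drained (Ruby 2.3+)

  results = Hash.new(0)
  lock = Mutex.new

  workers = Array.new(thread_count) do
    Thread.new do
      while (item = queue.pop)
        key = begin
          yield item
        rescue StandardError => e
          e.class # tally failures by exception class, as the script above does
        end
        lock.synchronize { results[key] += 1 }
      end
    end
  end

  workers.each(&:join)
  results
end

# Usage with a stand-in block; a real run would issue the HTTP request instead.
p parallel_tally(1..10, 4) { |i| i.even? ? :even : :odd }
```

The sketch queues every item up front, so for millions of requests a counter-based loop per thread would use less memory; the tallying logic stays the same.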

@judofyr By the way, nice work on reducing it down. That's awesome. In hindsight, I have no idea why I didn't try looping Tilt within a single process. I guess I assumed the whole stack played a part and went with the brute-force over HTTP approach.

zzak commented

@judofyr Yup that fixed it for me too!!