Segmentation Fault during rails assets:precompile
This past week I've been noticing random segmentation faults during the assets:precompile stage of my Heroku deploys. I don't yet know the underlying cause, but I do know that if I freeze Tilt at 1.3.3, the problem goes away. I couldn't find anyone else having the same problem, but then I realized that 1.3.4 was cut just a couple of weeks ago.
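For reference, pinning the gem back in the Gemfile looks like this (using the version mentioned above):
gem 'tilt', '1.3.3'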
I'm using Ruby 2.0.0p0 and a few asset gems:
group :assets do
gem 'jquery-rails'
gem 'sass-rails'
gem 'font-awesome-sass-rails'
gem 'bootstrap-sass', '~> 2.3'
gem 'bourbon'
gem 'neat'
gem 'uglifier'
gem 'coffee-script'
end
I get random segmentation faults during rake assets:precompile that look something like this:
/Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/uglifier-1.3.0/lib/uglifier.rb:65: [BUG] Segmentation fault
ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-darwin12.2.0]
c:0050 p:0055 s:0230 e:000225 METHOD /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/uglifier-1.3.0/lib/uglifier.rb:65
c:0049 p:0011 s:0221 e:000220 METHOD /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/actionpack-3.2.12/lib/sprockets/compressors.rb:74
c:0048 p:0010 s:0217 e:000216 BLOCK /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/sprockets-2.2.2/lib/sprockets/processing.rb:265 [FINISH]
c:0047 p:---- s:0213 e:000212 CFUNC :call
c:0046 p:0016 s:0208 e:000207 METHOD /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/sprockets-2.2.2/lib/sprockets/processor.rb:29
c:0045 p:0034 s:0203 e:000202 METHOD /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/tilt-1.3.4/lib/tilt/template.rb:77
c:0044 p:0025 s:0197 E:000c50 BLOCK /Users/edlebert/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/sprockets-2.2.2/lib/sprockets/context.rb:193 [FINISH]
I know this isn't a very helpful issue report, but I'm hoping more people will run into this problem and find this issue.
This is very interesting.
We got a segfault on Travis on 2.0.0p0 here: https://travis-ci.org/rtomayko/tilt/jobs/5479138
But a few minutes later (after I pushed a CHANGELOG change) it's green: https://travis-ci.org/rtomayko/tilt/jobs/5479184
uglifier.rb:65 doesn't make me any wiser either.
The reason this could have started in Tilt 1.3.4 is that we now always compile templates to a method. In 1.3.3 we did a plain instance_eval first and then invoked the compiled method on subsequent renderings, while in 1.3.4 we never instance_eval. I'd guess that even 1.3.3 might be able to hit the segfault (if, e.g., it's possible to trigger it while your Sinatra app is running).
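Roughly, the difference between the two code paths looks like this (a simplified sketch, not the actual Tilt source; names are illustrative):
source = '"Hello world"'
scope  = Object.new

# Tilt 1.3.3 style: the first render evaluates the template source directly
# on the scope object.
scope.instance_eval(source)

# Tilt 1.3.4 style: every render compiles the source into a method on Object,
# grabs it as an UnboundMethod, removes it again, then binds it to the scope.
Object.class_eval("def __tilt_example; #{source}; end")
compiled = Object.instance_method(:__tilt_example)
Object.class_eval { remove_method(:__tilt_example) }
compiled.bind(scope).call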
I've opened an issue over at bugs.ruby-lang.org related to the other segfault: http://bugs.ruby-lang.org/issues/8100
I did a bunch of trial-and-error work on this and found that it only happened if I did some file IO during JavaScript precompilation in Rails. Specifically, I was exporting a bunch of ActiveSupport::TimeZone data to JavaScript. I found that even if I had a js.erb file containing only the following line (note that no JavaScript code is actually being generated here), I would get the segmentation fault during rake assets:precompile:
<% ActiveSupport::TimeZone.all %>
As a workaround, I had a hunch that if I pre-loaded the time zone data into memory before the asset precompile, I wouldn't get a segfault. So I created a Rails initializer that simply pre-reads all the time zone data:
ActiveSupport::TimeZone.all
And presto, no more segmentation faults :)
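If anyone wants to try the same workaround, the initializer can be as small as this (the file name is just an example):
# config/initializers/preload_time_zones.rb
# Warm up ActiveSupport's time zone data before assets are compiled, so the
# lazy file IO that seems to trigger the crash never happens mid-render.
ActiveSupport::TimeZone.all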
Might be related to the segfault we sometimes see in sinatra-contrib's content_for implementation on 2.0. @zzak is investigating.
Below is the smallest and most isolated script I've so far been able to produce that suffers from the segfault. I've found that as long as Tilt is rendering anything, it can segfault. I've observed that the more you do during rendering (e.g. loading files), the higher the odds of a segfault.
require 'tilt'

run(proc do
  body = Tilt['str'].new { 'Hello world #{test = ["1", "2"] + ["3"] }' }.render
  [200, {}, [body]]
end)
Segfaults are very rare under this test, though, due to the simplicity of the script; very little happens while rendering. Out of 1,162,202 HTTP requests at ~500 requests per second, I got 37 segfaults. I ran the same test for 1,003,356 requests without Tilt, using plain ERB, and didn't get any segfaults, only a few timeouts.
So it does indeed seem to be related to something Tilt is doing. It also appears to be time-bound: if I throw 40 threads at the problem, even though I can push through more requests per unit of time, I seem to get the same segfault rate. Weird. I'm not sure how to narrow it down further. I could spend hours stripping down Tilt, but it may not reveal anything.
How are you running the test? Can you show us the config.ru? What server do you use?
I've been able to reduce it to this code which segfaults in 2.0.0-p0. Testing trunk now.
class Fail
  def render(scope = Object.new)
    compiled_method.bind(scope).call
  end

  def compiled_method
    @compiled_method ||= compile_template_method
  end

  def source
    "Hello world".inspect
  end

  def compile_template_method
    method_name = "__tilt_#{Thread.current.object_id.abs}"
    Object.class_eval("def #{method_name}; #{source} end")
    unbind_compiled_method(method_name)
  end

  def unbind_compiled_method(method_name)
    method = Object.instance_method(method_name)
    Object.class_eval { remove_method(method_name) }
    method
  end
end

loop do
  Fail.new.render
end
@judofyr that code block is the config.ru :)
I did intend to post the test script though, sorry about that. Here it is:
require 'peach'
require 'net/http'

results = {}

10_000_000.times.peach(12) do
  key = begin
    Net::HTTP.get_response(URI('http://localhost:3000/')).code
  rescue => e
    e.class
  end
  results[key] ||= 0
  results[key] += 1
  STDOUT.write "\r#{results.inspect}"
end
puts
The peach gem provides the peach method, which parallelises the requests; 12 parallel threads are allowed in this case.
@judofyr By the way, nice work on reducing it down. That's awesome. In hindsight, I have no idea why I didn't try looping Tilt within a single process. I guess I assumed the whole stack played a part and went with the brute-force over-HTTP approach.
This has now been fixed in Ruby's trunk: https://bugs.ruby-lang.org/projects/ruby-trunk/repository/revisions/39919