Shopify/rotoscope

Methods defined via blocks produce incorrect call-locations

Closed this issue · 2 comments

Introduction

As identified in #38.

When using dynamically defined methods via define_method and define_singleton_method, the call stack is inconsistent compared with how predefined Ruby methods look.

Failing test here:

def test_dynamic_methods_in_blacklist
skip <<-FAILING_TEST_CASE
Return events for dynamically created methods (define_method, define_singleton_method)
do not have the correct stack frame information (the call of a dynamically defined method
is correctly treated as a Ruby :call, but its return must be treated as a :c_return)
FAILING_TEST_CASE
contents = rotoscope_trace(blacklist: [MONADIFY_PATH]) { Example.apply("my value!") }
assert_equal [
{ event: "call", entity: "Example", method_name: "apply", method_level: "class", filepath: "/rotoscope_test.rb", lineno: -1 },
{ event: "call", entity: "Example", method_name: "monad", method_level: "class", filepath: "/rotoscope_test.rb", lineno: -1 },
{ event: "return", entity: "Example", method_name: "monad", method_level: "class", filepath: "/rotoscope_test.rb", lineno: -1 },
{ event: "return", entity: "Example", method_name: "apply", method_level: "class", filepath: "/rotoscope_test.rb", lineno: -1 },
], parse_and_normalize(contents)
end

Problem

On :call events, the frame points to where the method is received in Ruby, i.e. the block passed to define_method or define_singleton_method. This is the expected behaviour for Ruby invocations:

[TracePoint] has the unfortunate side effect of pointing to the callee of the method (i.e. where the method is received), rather than the caller.

But, when the :return event fires, instead of pointing to the end of the method definition like a normal ruby method, it points one frame up the stack, to where the original caller occurred. Since we assume Ruby methods point one level deeper, Rotoscope pops the frame one level further, and points to where the caller's caller came from. This means the call and return events won't match up.

Solution?

I tried to figure out how to get around this one (by using Ruby's bitmasks to determine if it was a dynamically defined method, etc.), but I've been unsuccessful thus far.

This problem is currently avoidable via flatten: true, since unmatched returns are ignored.

The problem isn't with define_method or define_singleton_method, since using something like define_method(:foo, instance_method(:bar)) doesn't cause this problem. The problem is with methods defined using a block.

I was curious about how they are implemented in these cases, and found that for normal methods (VM_METHOD_TYPE_ISEQ) that trace is a vm instruction, which you can see by disassembling a method

irb(main):001:0> def foo; end
=> :foo
irb(main):002:0> method(:foo)
=> #<Method: Object#foo>
irb(main):003:0> puts RubyVM::InstructionSequence.disasm(method(:foo))
== disasm: #<ISeq:foo@(irb)>============================================
0000 trace            8                                               (   1)
0002 putnil
0003 trace            16                                              (   1)
0005 leave

where the constants 8 and 16 are the values of RUBY_EVENT_CALL and RUBY_EVENT_RETURN. Since it is just another in the method body, it is executed before stack frame is popped.

In contrast, a VM_METHOD_TYPE_BMETHOD method is invoked using invoke_bmethod which pushes a frame using the block's instruction sequence, executes the RUBY_EVENT_CALL event hook, calls vm_exec then executes the RUBY_EVENT_RETURN event hook. Since the block's instruction sequence ends with a leave instruction that pops the frame, the RUBY_EVENT_RETURN event hook is executed with the caller stack frame on top of the stack.

I don't know of a clean way to distinguish between these types of methods. However, this seems like an inconsistency that should be fixed upstream in ruby so we don't have to try to workaround this by reaching into ruby internals.

I don't know of a clean way to distinguish between these types of methods.

Actually, this could be done using RubyVM::InstructionSequence.of(method_object)&.label&.start_with?("block in <"). That would allocate a RubyVM::InstructionSequence object, which would be slow to do for every method call, but maybe we could just do that when we get a return that doesn't match the top of rotoscope's call stack. Once we confirm that the inconsistency is because the method is a block method, then we can use the location information from the call to correct the return information.