japaric/cargo-call-stack

Supporting dynamic dispatch

japaric opened this issue · 0 comments

Background information

This tool builds the call graph from LLVM IR.

Direct function calls look like this in LLVM IR:

  call fastcc void @_ZN6direct3foo17h2eeb2dcd0a346d49E(), !dbg !112

From the IR we know the name of the callee.

And indirect function calls look like this:

  %4 = tail call i32 %3({}* nonnull %0) #1, !dbg !354

The IR carries little information about the callee. We can't even tell whether the call goes through a trait object or a function pointer, so the tool can't reason about indirect function calls.
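For reference, here is a minimal sketch of Rust source that would typically produce both kinds of indirect call. All names (`Shape`, `Square`, `Rect`, `one`, `two`, `demo`) are made up for illustration and do not come from the tool or from rustc. Note that the optimizer may devirtualize simple cases like these, so the resulting IR is not guaranteed to contain an indirect call.

```rust
// Hypothetical trait with two implementors; either one may sit behind `&dyn Shape`.
trait Shape {
    fn area(&self) -> i32;
}

struct Square(i32);
struct Rect(i32, i32);

impl Shape for Square {
    fn area(&self) -> i32 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> i32 {
        self.0 * self.1
    }
}

fn one() -> i32 {
    1
}

fn two() -> i32 {
    2
}

// Dynamic dispatch: the callee address is loaded from the vtable at run time.
fn via_trait_object(shape: &dyn Shape) -> i32 {
    shape.area()
}

// Function pointer: the callee is whatever address `f` holds at run time.
fn via_fn_pointer(f: fn() -> i32) -> i32 {
    f()
}

// Exercises both call styles with every possible callee.
fn demo() -> [i32; 4] {
    [
        via_trait_object(&Square(3)),
        via_trait_object(&Rect(2, 5)),
        via_fn_pointer(one),
        via_fn_pointer(two),
    ]
}
```

In both helper functions the source names a type (`dyn Shape`, `fn() -> i32`) but not a concrete callee, which is exactly the information that is lost by the time the call reaches LLVM IR.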

Proposed rustc changes

Our proposal is to make rustc include type information as call-site metadata in the generated LLVM IR.

When trait objects are used the call site shall include metadata that specifies the name (path) of the trait and the name of the method being called.

; let x: &dyn path::to::Trait = ..;
; let y = x.method_name();
  %4 = tail call i32 %3({}* nonnull %0) #1, !dbg !354, !rust !123

!123 = !{!"path::to::Trait::method_name"}

When function pointers are used the call site shall include metadata that specifies the type signature of the function pointer.

; let x: fn() -> i32 = ..;
; let y = x();
  %4 = tail call i32 %3() #1, !dbg !354, !rust !234

!234 = !{!"fn() -> i32"}

Additionally, the definition of all trait methods should include metadata that specifies the name (path) of the trait and the name of the method.

; impl path::to::Trait for Type { fn method_name() { .. }}
define internal i32 @name() unnamed_addr #1 !dbg !216, !rust !123 {
  ..
}

!123 = !{!"path::to::Trait::method_name"}

The definition of every function shall also include metadata that specifies its type signature.

define internal i32 @name() unnamed_addr #1 !dbg !216, !rust !234 {
  ..
}

!234 = !{!"fn() -> i32"}

These metadata changes could be provided behind an unstable compiler flag (e.g. -Z emit-extra-metadata).
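To illustrate how a tool could consume this metadata, here is a rough sketch that resolves `!rust !N` references in textual IR to their strings. It assumes the proposed `!rust` metadata kind exists (it is only a proposal here) and that each metadata node sits on its own line in the `!N = !{!"..."}` form shown above. The function names are hypothetical, and the line-based string matching stands in for a real IR parser.

```rust
use std::collections::HashMap;

/// Returns the id `N` of a `!rust !N` attachment on an IR line, if present.
fn rust_metadata_id(line: &str) -> Option<u32> {
    let rest = line.split("!rust !").nth(1)?;
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    digits.parse::<u32>().ok()
}

/// Parses a node definition of the form `!N = !{!"text"}` into `(N, text)`.
fn parse_metadata_def(line: &str) -> Option<(u32, String)> {
    let (lhs, rhs) = line.trim().split_once(" = ")?;
    let id = lhs.strip_prefix('!')?.parse::<u32>().ok()?;
    let text = rhs.strip_prefix("!{!\"")?.strip_suffix("\"}")?;
    Some((id, text.to_string()))
}

/// Resolves every `!rust !N` attachment in `ir` to the string stored in node `N`.
/// Attachments whose node is not defined in `ir` are skipped.
fn resolve_rust_metadata(ir: &str) -> Vec<String> {
    // First pass: collect all single-string metadata nodes.
    let defs: HashMap<u32, String> = ir.lines().filter_map(parse_metadata_def).collect();
    // Second pass: look up each attachment in source order.
    ir.lines()
        .filter_map(rust_metadata_id)
        .filter_map(|id| defs.get(&id).cloned())
        .collect()
}
```

The two-pass design is used because metadata nodes usually appear after the instructions that reference them.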

Changes in the tool

For each trait method implementation kept in the final binary, the tool will insert an edge to that symbol from a node named after the trait method, for example dyn path::to::Trait::method_name.

  "dyn Trait::method" -> "<Type1 as Trait>::method"
  "dyn Trait::method" -> "<Type2 as Trait>::method"

The dyn nodes will be considered to have a local stack usage of 0 bytes, and each method call made through a trait object will be connected to the dyn node that matches its call-site metadata.
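A minimal sketch of how a zero-cost dyn node behaves in a stack analysis follows. It assumes that worst-case stack usage is computed as a node's local usage plus the maximum over its callees, and that the graph is acyclic. The `Node` type, the function names, and the byte counts are all invented for the example.

```rust
use std::collections::HashMap;

/// One call-graph node: its own stack frame size and the nodes it may call.
struct Node {
    local_stack: u64,
    callees: Vec<&'static str>,
}

/// Worst-case stack usage starting at `name`: local usage plus the deepest callee.
/// Assumes the graph is acyclic; a cycle would recurse forever.
fn max_stack(graph: &HashMap<&'static str, Node>, name: &str) -> u64 {
    let node = &graph[name];
    let deepest = node
        .callees
        .iter()
        .map(|callee| max_stack(graph, callee))
        .max()
        .unwrap_or(0);
    node.local_stack + deepest
}

/// Builds a tiny graph where `main` calls a method through a trait object.
fn example_graph() -> HashMap<&'static str, Node> {
    let mut graph = HashMap::new();
    graph.insert(
        "main",
        Node { local_stack: 8, callees: vec!["dyn Trait::method"] },
    );
    // The dyn node uses no stack itself; it only fans out to every implementation.
    graph.insert(
        "dyn Trait::method",
        Node {
            local_stack: 0,
            callees: vec!["<Type1 as Trait>::method", "<Type2 as Trait>::method"],
        },
    );
    graph.insert(
        "<Type1 as Trait>::method",
        Node { local_stack: 16, callees: vec![] },
    );
    graph.insert(
        "<Type2 as Trait>::method",
        Node { local_stack: 32, callees: vec![] },
    );
    graph
}
```

With these made-up numbers, `max_stack(&example_graph(), "main")` evaluates to 40: the 8 bytes of `main` plus the larger of the two implementations, with the dyn node contributing nothing.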

For each function definition kept in the final binary, the tool will insert an edge to that symbol from a node named after its type signature, e.g. fn() -> i32.

  "fn() -> i32" -> "foo"
  "fn() -> i32" -> "bar"

The fn nodes will be considered to have a local stack usage of 0 bytes, and each function call made through a function pointer will be connected to the fn node that matches its call-site metadata.
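The fn-node edges could be derived by grouping function definitions by signature, roughly as sketched below. The function name `fn_pointer_edges` and the input shape, a list of `(symbol, signature)` pairs, are assumptions made for the example, as is the extra `baz` entry.

```rust
use std::collections::BTreeMap;

/// Groups `(symbol, signature)` pairs into `signature -> [symbols]` edges,
/// so each fn node points at every function that has that signature.
fn fn_pointer_edges<'a>(defs: &[(&'a str, &'a str)]) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut edges: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(symbol, signature) in defs {
        edges.entry(signature).or_default().push(symbol);
    }
    edges
}

/// Renders the edges in the same `"from" -> "to"` form used above.
fn render_edges(edges: &BTreeMap<&str, Vec<&str>>) -> Vec<String> {
    let mut lines = Vec::new();
    for (signature, symbols) in edges {
        for symbol in symbols {
            lines.push(format!("\"{}\" -> \"{}\"", signature, symbol));
        }
    }
    lines
}
```

Grouping by signature alone over-approximates: every function with a matching signature is treated as a possible callee, whether or not its address is ever taken.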