simple function which should specialise causes data to be copied on stack
dexterlb opened this issue · 2 comments
dexterlb commented
Hello,
I'm sorry if this is not an impala issue. I have the following code:
struct Data {
    item: i32,
}
extern fn foo(arr: &[Data], i: i32) -> () {
    let f = @ |i| arr(i);
    cpu_prefetch(&f(i) as &u8, 0, 3, 1)
}
extern fn bar(arr: &[Data], i: i32) -> () {
    cpu_prefetch(&arr(i) as &u8, 0, 3, 1)
}
Which yields the following result (latest commits on mem2reg branch of impala and thorin):
0000000000001130 <foo>:
1130: movslq %esi,%rax
1133: mov (%rdi,%rax,4),%eax
1136: mov %eax,-0x8(%rsp)
113a: prefetcht0 -0x8(%rsp)
113f: retq
0000000000001140 <bar>:
1140: movslq %esi,%rax
1143: prefetcht0 (%rdi,%rax,4)
1147: retq
1148: nopl 0x0(%rax,%rax,1)
I would expect f to be inlined and the code for both functions to be the same. Instead, the data is copied to the stack before the prefetch, which makes the prefetch useless.
richardmembarth commented
Did you try on master? mem2reg has been merged into master.
madmann91 commented
This is an issue in your code. f is of type fn (i32) -> Data, so it returns the element by value. In foo, you are taking the address of that return value, which is a temporary on the stack. The proper way of doing this is the following:
extern fn foo(arr: &[Data], i: i32) -> () {
    let f = @ |i| &arr(i);
    cpu_prefetch(f(i) as &u8, 0, 3, 1)
}
Note that f now returns the address of the array element at index i, so the prefetch targets the array itself rather than a stack copy.
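The same distinction can be reproduced outside Impala. The following is a minimal sketch in Rust, not Impala, and it assumes only that the two languages agree on this point: a function that returns a value hands back a copy, while one that returns a reference hands back the element's own address. The helper names by_value, by_ref and addresses are hypothetical and exist only for this illustration.

```rust
#[derive(Clone, Copy)]
struct Data {
    item: i32,
}

// Returns a copy of the element, analogous to `|i| arr(i)` in the original foo.
fn by_value(arr: &[Data], i: usize) -> Data {
    arr[i]
}

// Returns a reference to the element, analogous to `|i| &arr(i)` in the fix.
fn by_ref(arr: &[Data], i: usize) -> &Data {
    &arr[i]
}

// Collects three addresses as integers so they can be compared:
// the element itself, a local copy of it, and the reference from by_ref.
fn addresses(arr: &[Data], i: usize) -> (usize, usize, usize) {
    let elem = &arr[i] as *const Data as usize;
    // The copy lives in this function's stack frame, not in the array.
    let copy = by_value(arr, i);
    let tmp = &copy as *const Data as usize;
    let via_ref = by_ref(arr, i) as *const Data as usize;
    (elem, tmp, via_ref)
}
```

Calling addresses on any non-empty slice gives an elem equal to via_ref and a tmp that differs from both. That mirrors the disassembly above: the by-value version prefetches a stack slot, while the by-reference version prefetches the array element.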