Inefficient code generation for named tuple returns
Closed this issue · 1 comments
The following "mktuple()" functions produce very different code. Notice the repeated and unproductive make_tuple(...)
invocations for each of the offsets for the named variants. Large named tuples cause unnecessary compile and runtime overhead. With bytes elements, this likely involves many unnecessary memory allocations and memcpy
calls.
module Test;
function named_mktuple(p: point): tuple<x: uint8, y: uint8, z: uint8> {
return (p.x, p.y, p.z);
}
function unnamed_mktuple(p: point): tuple<uint8, uint8, uint8> {
return (p.x, p.y, p.z);
}
public type point = unit {
x: uint8;
y: uint8;
z: uint8;
on %done {
print named_mktuple(self);
print unnamed_mktuple(self);
}
};
static auto __hlt::Test::unnamed_mktuple(const ::hilti::rt::ValueReference<__hlt::Test::point>& p)
-> std::tuple<::hilti::rt::integer::safe<uint8_t>, ::hilti::rt::integer::safe<uint8_t>,
::hilti::rt::integer::safe<uint8_t>> {
::hilti::rt::detail::checkStack();
__location__("tuple-return.spicy:8:3");
return std::make_tuple(::hilti::rt::optional::value((*p).x), ::hilti::rt::optional::value((*p).y),
::hilti::rt::optional::value((*p).z));
}
static auto __hlt::Test::named_mktuple(const ::hilti::rt::ValueReference<__hlt::Test::point>& p)
-> std::tuple<::hilti::rt::integer::safe<uint8_t>, ::hilti::rt::integer::safe<uint8_t>,
::hilti::rt::integer::safe<uint8_t>> {
::hilti::rt::detail::checkStack();
__location__("tuple-return.spicy:4:3");
return std::make_tuple(std::get<0>(std::make_tuple(::hilti::rt::optional::value((*p).x),
::hilti::rt::optional::value((*p).y),
::hilti::rt::optional::value((*p).z))),
std::get<1>(std::make_tuple(::hilti::rt::optional::value((*p).x),
::hilti::rt::optional::value((*p).y),
::hilti::rt::optional::value((*p).z))),
std::get<2>(std::make_tuple(::hilti::rt::optional::value((*p).x),
::hilti::rt::optional::value((*p).y),
::hilti::rt::optional::value((*p).z))));
}
Thanks for the ticket! This is due to the codegen currently not handling temporary tuple efficiently (which for function return values is probably a main use case),
spicy/hilti/toolchain/src/compiler/codegen/coercions.cc
Lines 166 to 170 in 19de277
As a workaround one could declare a local variable with the tuple with individual fields casted to the correct type and return that. This would avoid the overhead of creating each tuple element from a full tuple and instead create it from a single value.