NVIDIA/stdexec

`default_domain::sender_transform()` applied to an rvalue should return an xvalue instead of a prvalue

lewissbaker opened this issue · 0 comments

The specification in P2300 says that sender_transform() should be expression equivalent to std::forward<Sndr>(sndr) if the tag_of_t<Sndr> doesn't have its own sender_transform() overload.

However, the current implementation is forcibly returning a prvalue that requires move-constructing the input sender.

// Called without the environment during eager customization
template <class _Sender>
STDEXEC_ATTRIBUTE((always_inline))
decltype(auto)
transform_sender(_Sender&& __sndr) const
noexcept(__domain::__is_nothrow_transform_sender<_Sender>()) {
// Look for a legacy customization for the given tag, and if found, apply it.
if constexpr (__callable<__sexpr_apply_t, _Sender, __domain::__legacy_customization>) {
return stdexec::__sexpr_apply(
static_cast<_Sender&&>(__sndr), __domain::__legacy_customization());
} else if constexpr (__domain::__has_default_transform_sender<_Sender>) {
return tag_of_t<_Sender>().transform_sender(static_cast<_Sender&&>(__sndr));
} else {
return static_cast<_Sender>(static_cast<_Sender&&>(__sndr));
}
}
// Called with an environment during lazy customization
template <class _Sender, class _Env>
STDEXEC_ATTRIBUTE((always_inline))
decltype(auto)
transform_sender(_Sender&& __sndr, const _Env& __env) const
noexcept(__domain::__is_nothrow_transform_sender<_Sender, _Env>()) {
if constexpr (__domain::__has_default_transform_sender<_Sender, _Env>) {
return tag_of_t<_Sender>().transform_sender(static_cast<_Sender&&>(__sndr), __env);
} else {
return static_cast<_Sender>(static_cast<_Sender&&>(__sndr));
}
}

I think we need to change the line as follows so it returns an xvalue when passed an rvalue

-return static_cast<_Sender>(static_cast<_Sender&&>(__sndr));
+return static_cast<_Sender&&>(__sndr);

We may also want to update some algorithms to avoid making copies / take advantage of copy-elision in cases when we are using the default-domain which has a no-op eager sender_transform.

e.g. see finally() algorithm implementation:
https://github.com/NVIDIA/stdexec/blob/4e573c3617a045c7b58ca007df373f6721f839dd/include/exec/finally.hpp#L306-314

This first constructs the default finally-sender using __make_sexpr<finally_t> which necessarily moves the input senders into this sender. But then because it passes through transform_sender, we can't take advantage of copy-elision and so in cases where the domain does not have a transform for the input sender, we end up having to do a move when returning from operator() anyway.

I think to get around this extra copy you'd need to do something like:

if constexpr (domain_has_customization_for<Domain, SenderType>) {
  return stdexec::transform_sender(domain, __make_sexpr<CPO>(...), env);
} else {
  return __make_sexpr<CPO>(...);
}

Alternatively, we could have transform_sender() take a lambda that produces the sender and then we can do copy-elision on return path through both transform_sender and operator().