NVIDIA/stdexec

nvexec::stream_context not working for any_sender_of based recursion with nvc++

weilewei opened this issue · 4 comments

Dear stdexec developers,

I am trying to run a recursive factorial algorithm on a GPU scheduler using the nvc++ compiler (23.1), and it does not work. With the scheduler from static_thread_pool, the same code works. Do you have any suggestions for how to write a recursion like the factorial algorithm and run it on a GPU? Many thanks in advance!

To build the recursion, I used any_sender_of, but I got an nvc++ internal compiler error:

NVC++-F-0000-Internal compiler error. process_acc_put_dinit: unexpected datatype

example code:

// [[file:../../../async_control.org::*Simple Recursion][Simple Recursion:1]]
#include <cassert>
#include <stdexec/execution.hpp>
#include <exec/static_thread_pool.hpp>
#include <exec/any_sender_of.hpp>
#include <iostream>

#include <nvexec/stream_context.cuh>

template <class... Ts>
using any_sender_of = typename exec::any_receiver_ref<
    stdexec::completion_signatures<Ts...>>::template any_sender<>;
// Simple Recursion:1 ends here

// [[file:../../../async_control.org::*Simple Recursion][Simple Recursion:2]]
using any_int_sender =
    any_sender_of<stdexec::set_value_t(int),
                  stdexec::set_stopped_t(),
                  stdexec::set_error_t(std::exception_ptr)>;

auto fac(int n) -> any_int_sender {
    std::cout << "factorial of " << n << "\n";
    return stdexec::just(n - 1)
        | stdexec::let_value([](int k) { return (k == 0) ? stdexec::just(1) : fac(k); })
        | stdexec::then([n](int k) { return k * n; });
}
// Simple Recursion:2 ends here

// [[file:../../../async_control.org::*Simple Recursion][Simple Recursion:3]]
int main() {
    // CPU based scheduler works
    // exec::static_thread_pool pool(8);
    // stdexec::scheduler auto sch = pool.get_scheduler();

    // GPU based scheduler does not work
    nvexec::stream_context stream_ctx{};
    stdexec::scheduler auto sch = stream_ctx.get_scheduler();

    stdexec::sender auto begin = stdexec::schedule(sch);
// Simple Recursion:3 ends here

// [[file:../../../async_control.org::*Simple Recursion][Simple Recursion:4]]
    int                  k = 10;
    stdexec::sender auto factorial =
        begin
        | stdexec::then([=]() { return k; })
        | stdexec::let_value([](int k) { return fac(k); });

    std::cout << "factorial built\n\n";

    auto [i] = stdexec::sync_wait(std::move(factorial)).value();
    std::cout << "factorial " << k << " = " << i << '\n';
// Simple Recursion:4 ends here

// [[file:../../../async_control.org::*Simple Recursion][Simple Recursion:5]]
}
// Simple Recursion:5 ends here
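For readers wondering why the example goes through any_sender_of at all: a recursive function needs a single, nameable return type, and each sender adaptor produces a different concrete type, so the recursion has to return a type-erased wrapper. The sketch below shows the same shape using only the standard library. It is an analogy, not stdexec code: `lazy_int` and `fac_lazy` are hypothetical names, and `std::function<int()>` merely stands in for the erased sender type.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a type-erased sender: a deferred
// computation that yields an int when invoked.
using lazy_int = std::function<int()>;

// Every recursion level returns the same erased type, so the return
// type of fac_lazy is well-formed even though it refers to itself.
lazy_int fac_lazy(int n) {
    // Like the original fac, this sketch assumes n >= 1.
    assert(n >= 1);
    // Mirrors: just(n - 1) | let_value(...) | then(k * n)
    return [n]() -> int {
        int k = n - 1;
        // Both branches are converted to lazy_int, playing the role
        // of the conversion to any_int_sender in the original lambda.
        lazy_int next = (k == 0) ? lazy_int([] { return 1; }) : fac_lazy(k);
        return next() * n;
    };
}
```

Calling `fac_lazy(10)` only builds the deferred computation; invoking the result, as in `fac_lazy(10)()`, evaluates the chain and yields 3628800. This says nothing about whether such a chain can run on a GPU stream; it only illustrates the role of type erasure in the recursion.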

error message:
wwei@login24:~/src/test-factorial/build> make VERBOSE=1
/global/common/software/nersc/pm-2022q4/spack/linux-sles15-zen/cmake-3.24.3-k5msymx/bin/cmake -S/global/homes/w/wwei/src/test-factorial -B/global/homes/w/wwei/src/test-factorial/build --check-build-system CMakeFiles/Makefile.cmake 0
/global/common/software/nersc/pm-2022q4/spack/linux-sles15-zen/cmake-3.24.3-k5msymx/bin/cmake -E cmake_progress_start /global/homes/w/wwei/src/test-factorial/build/CMakeFiles /global/homes/w/wwei/src/test-factorial/build//CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/global/u2/w/wwei/src/test-factorial/build'
make -f CMakeFiles/factorial.dir/build.make CMakeFiles/factorial.dir/depend
make[2]: Entering directory '/global/u2/w/wwei/src/test-factorial/build'
cd /global/homes/w/wwei/src/test-factorial/build && /global/common/software/nersc/pm-2022q4/spack/linux-sles15-zen/cmake-3.24.3-k5msymx/bin/cmake -E cmake_depends "Unix Makefiles" /global/homes/w/wwei/src/test-factorial /global/homes/w/wwei/src/test-factorial /global/homes/w/wwei/src/test-factorial/build /global/homes/w/wwei/src/test-factorial/build /global/homes/w/wwei/src/test-factorial/build/CMakeFiles/factorial.dir/DependInfo.cmake --color=
Dependencies file "CMakeFiles/factorial.dir/factorial.cpp.o.d" is newer than depends file "/global/homes/w/wwei/src/test-factorial/build/CMakeFiles/factorial.dir/compiler_depend.internal".
Consolidate compiler generated dependencies of target factorial
make[2]: Leaving directory '/global/u2/w/wwei/src/test-factorial/build'
make -f CMakeFiles/factorial.dir/build.make CMakeFiles/factorial.dir/build
make[2]: Entering directory '/global/u2/w/wwei/src/test-factorial/build'
[ 50%] Building CXX object CMakeFiles/factorial.dir/factorial.cpp.o
/opt/nvidia/hpc_sdk/Linux_x86_64/23.1/compilers/bin/nvc++ -I/global/homes/w/wwei/src/test-factorial/build/_deps/stdexec-src/include --experimental-stdpar -stdpar=gpu --gcc-toolchain=/opt/cray/pe/gcc/12.2.0/bin/ -pthread -g -O0 -std=gnu++20 -MD -MT CMakeFiles/factorial.dir/factorial.cpp.o -MF CMakeFiles/factorial.dir/factorial.cpp.o.d -o CMakeFiles/factorial.dir/factorial.cpp.o -c /global/homes/w/wwei/src/test-factorial/factorial.cpp
NVC++-F-0000-Internal compiler error. process_acc_put_dinit: unexpected datatype 4525 (/global/homes/w/wwei/src/test-factorial/factorial.cpp)
NVC++/x86-64 Linux 23.1-0: compilation aborted
make[2]: *** [CMakeFiles/factorial.dir/build.make:76: CMakeFiles/factorial.dir/factorial.cpp.o] Error 2
make[2]: Leaving directory '/global/u2/w/wwei/src/test-factorial/build'
make[1]: *** [CMakeFiles/Makefile2:135: CMakeFiles/factorial.dir/all] Error 2
make[1]: Leaving directory '/global/u2/w/wwei/src/test-factorial/build'
make: *** [Makefile:156: all] Error 2

Looks like you've run into a bug in the nvc++ compiler. Sorry about that. I'll try to reduce it and report it to the HPC compiler team. Then maybe I'll see about finding a work-around.

Oh wait, good news. It looks like the bug is fixed in more recent versions of the compiler. This shows it working with 23.5: https://godbolt.org/z/v4nvKsa77

Hello again. I installed nvhpc/23.5 and ran the same code from the godbolt link, but got the following error. Any thoughts? Thanks.

wwei@nid001345:~/src/test-factorial/build> nvc++ --version

nvc++ 23.5-0 64-bit target on x86-64 Linux -tp zen3
NVIDIA Compilers and Tools
Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
wwei@nid001345:~/src/test-factorial/build> nvc++ -g -Minfo -std=c++20 --experimental-stdpar -stdpar --gcc-toolchain=/opt/cray/pe/gcc/12.2.0/bin/ test.cpp -o test
"/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/let_xxx.cuh", line 124: error: no instance of function template "stdexec::__connect::connect_t::operator()" matches the argument list
argument types are: (result_sender_t, nvexec::_strm::propagate_receiver_t<nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>>::__t)
object type is: const stdexec::__connect::connect_t
return stdexec::connect(
^
detected during:
instantiation of "void nvexec::_strm::let_xxx::tag_invoke(_Tag, nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<stdexec::__id<nvexec::_strm::sync_wait::sync_wait_t::receiver_t<std::remove_reference<stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__let::let_value_t, lambda ->any_int_sender>, stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__then::then_t, lambda ->int>, stdexec::__tag_invoke::tag_invoke_result_t<stdexec::__schedule::schedule_t, nvexec::_strm::stream_scheduler &> &>> &>::type>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple<std::decay<std::enable_if<true, int>::type>::type>>::__t &&, _As &&...) noexcept [with _Tag=stdexec::__receivers::set_value_t, _As=<int &>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
instantiation of "auto stdexec::__tag_invoke::tag_invoke_t::operator()(_Tag, _Args &&...) const->stdexec::__tag_invoke::tag_invoke_result_t<_Tag, _Args...> [with _Tag=stdexec::__receivers::set_value_t, _Args=<nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple>::__t, int &>]" at line 365 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/execution.hpp"
instantiation of "void stdexec::__receivers::set_value_t::operator()(_Receiver &&, _As &&...) const noexcept [with _Receiver=nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple>::__t, _As=<int &>]" at line 419 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/common.cuh"
instantiation of "void nvexec::strm::operation_state_base::__t::propagate_completion_signal(Tag, As &&...) noexcept [with OuterReceiverId=nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple>, Tag=stdexec::__receivers::set_value_t, As=<int &>]" at line 76 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/then.cuh"
instantiation of "void nvexec::_strm::then::tag_invoke(stdexec::__receivers::set_value_t, nvexec::_strm::then::receiver_t<4UL, nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<stdexec::__id<nvexec::_strm::sync_wait::sync_wait_t::receiver_t<std::remove_reference<stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__let::let_value_t, lambda ->any_int_sender>, stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__then::then_t, lambda ->int>, stdexec::__tag_invoke::tag_invoke_result_t<stdexec::__schedule::schedule_t, nvexec::_strm::stream_scheduler &> &>> &>::type>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple<std::decay<std::enable_if<true, int>::type>::type>>::__t::__id, lambda ->int>::__t &&, As &&...) noexcept [with As=<>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
[ 11 instantiation contexts not shown ]
instantiation of "void stdexec::__start::start_t::operator()(_Op &) const noexcept [with _Op=nvexec::_strm::let_xxx::__operation<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t>]" at line 501 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/common.cuh"
instantiation of class "nvexec::strm::operation_state<CvrefSenderId, InnerReceiverId, OuterReceiverId>::__t [with CvrefSenderId=nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, InnerReceiverId=nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, OuterReceiverId=nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
instantiation of "auto stdexec::__tag_invoke::tag_invoke_t::operator()(_Tag, _Args &&...) const->stdexec::__tag_invoke::tag_invoke_result_t<_Tag, _Args...> [with _Tag=stdexec::__start::start_t, _Args=<nvexec::strm::operation_state<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>::__t &>]" at line 1224 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/execution.hpp"
instantiation of "void stdexec::__start::start_t::operator()(_Op &) const noexcept [with _Op=nvexec::strm::operation_state<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>::__t]" at line 138 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/sync_wait.cuh"
instantiation of "auto nvexec::_strm::sync_wait::sync_wait_t::operator()(nvexec::_strm::context_state_t, Sender &&) const->std::optional<nvexec::_strm::sync_wait::sync_wait_result_t> [with Sender=nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>::__t]" at line 257 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream_context.cuh"

"/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/__detail/__meta.hpp", line 70: error: class "stdexec::__mdefer<stdexec::__qstdexec::__call_result_, lambda ->>" has no member "__t"
using __t = typename _T::__t;
^
detected during:
instantiation of type "stdexec::__t<stdexec::__mdefer<stdexec::__qstdexec::__call_result_, lambda ->>>" at line 548
instantiation of type "stdexec::__call_result_t<lambda ->>" at line 569
instantiation of class "stdexec::__conv<_Fn> [with _Fn=lambda ->]" at line 128 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/let_xxx.cuh"
instantiation of "void nvexec::_strm::let_xxx::tag_invoke(_Tag, nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<stdexec::__id<nvexec::_strm::sync_wait::sync_wait_t::receiver_t<std::remove_reference<stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__let::let_value_t, lambda ->any_int_sender>, stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__then::then_t, lambda ->int>, stdexec::__tag_invoke::tag_invoke_result_t<stdexec::__schedule::schedule_t, nvexec::_strm::stream_scheduler &> &>> &>::type>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple<std::decay<std::enable_if<true, int>::type>::type>>::__t &&, _As &&...) noexcept [with _Tag=stdexec::__receivers::set_value_t, _As=<int &>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
instantiation of "auto stdexec::__tag_invoke::tag_invoke_t::operator()(_Tag, _Args &&...) const->stdexec::__tag_invoke::tag_invoke_result_t<_Tag, _Args...> [with _Tag=stdexec::__receivers::set_value_t, _Args=<nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple>::__t, int &>]" at line 365 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/execution.hpp"
[ 14 instantiation contexts not shown ]
instantiation of "void stdexec::__start::start_t::operator()(_Op &) const noexcept [with _Op=nvexec::_strm::let_xxx::__operation<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t>]" at line 501 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/common.cuh"
instantiation of class "nvexec::strm::operation_state<CvrefSenderId, InnerReceiverId, OuterReceiverId>::__t [with CvrefSenderId=nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, InnerReceiverId=nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, OuterReceiverId=nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
instantiation of "auto stdexec::__tag_invoke::tag_invoke_t::operator()(_Tag, _Args &&...) const->stdexec::__tag_invoke::tag_invoke_result_t<_Tag, _Args...> [with _Tag=stdexec::__start::start_t, _Args=<nvexec::strm::operation_state<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>::__t &>]" at line 1224 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/execution.hpp"
instantiation of "void stdexec::__start::start_t::operator()(_Op &) const noexcept [with _Op=nvexec::strm::operation_state<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>::__t]" at line 138 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/sync_wait.cuh"
instantiation of "auto nvexec::_strm::sync_wait::sync_wait_t::operator()(nvexec::_strm::context_state_t, Sender &&) const->std::optional<nvexec::_strm::sync_wait::sync_wait_result_t> [with Sender=nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>::__t]" at line 257 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream_context.cuh"

"/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/let_xxx.cuh", line 122: error: no instance of overloaded function "std::variant<_Types...>::emplace [with _Types=<std::monostate, exec::__any::__operation<exec::__any::__sender<stdexec::completion_signatures<stdexec::__receivers::set_value_t (int), stdexec::__receivers::set_stopped_t (), stdexec::__receivers::set_error_t (std::__exception_ptr::exception_ptr)>, stdexec::__types<>, stdexec::__types<>>::__t, nvexec::_strm::propagate_receiver_t<nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>>::__t, stdexec::__types<>>::__t>]" matches the argument list
argument types are: (stdexec::__conv<lambda ->>)
object type is: std::variant<std::monostate, exec::__any::__operation<exec::__any::__sender<stdexec::completion_signatures<stdexec::__receivers::set_value_t (int), stdexec::__receivers::set_stopped_t (), stdexec::__receivers::set_error_t (std::__exception_ptr::exception_ptr)>, stdexec::__types<>, stdexec::__types<>>::__t, nvexec::_strm::propagate_receiver_t<nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>>::__t, stdexec::__types<>>::__t>
auto& __op = __self._op_state->_op_state3.template emplace<op_state_t>(
^
detected during:
instantiation of "void nvexec::_strm::let_xxx::tag_invoke(_Tag, nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<stdexec::__id<nvexec::_strm::sync_wait::sync_wait_t::receiver_t<std::remove_reference<stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__let::let_value_t, lambda ->any_int_sender>, stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__then::then_t, lambda ->int>, stdexec::__tag_invoke::tag_invoke_result_t<stdexec::__schedule::schedule_t, nvexec::_strm::stream_scheduler &> &>> &>::type>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple<std::decay<std::enable_if<true, int>::type>::type>>::__t &&, _As &&...) noexcept [with _Tag=stdexec::__receivers::set_value_t, _As=<int &>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
instantiation of "auto stdexec::__tag_invoke::tag_invoke_t::operator()(_Tag, _Args &&...) const->stdexec::__tag_invoke::tag_invoke_result_t<_Tag, _Args...> [with _Tag=stdexec::__receivers::set_value_t, _Args=<nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple>::__t, int &>]" at line 365 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/execution.hpp"
instantiation of "void stdexec::__receivers::set_value_t::operator()(_Receiver &&, _As &&...) const noexcept [with _Receiver=nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple>::__t, _As=<int &>]" at line 419 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/common.cuh"
instantiation of "void nvexec::strm::operation_state_base::__t::propagate_completion_signal(Tag, As &&...) noexcept [with OuterReceiverId=nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple>, Tag=stdexec::__receivers::set_value_t, As=<int &>]" at line 76 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/then.cuh"
instantiation of "void nvexec::_strm::then::tag_invoke(stdexec::__receivers::set_value_t, nvexec::_strm::then::receiver_t<4UL, nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<stdexec::__id<nvexec::_strm::sync_wait::sync_wait_t::receiver_t<std::remove_reference<stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__let::let_value_t, lambda ->any_int_sender>, stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__then::then_t, lambda ->int>, stdexec::__tag_invoke::tag_invoke_result_t<stdexec::__schedule::schedule_t, nvexec::_strm::stream_scheduler &> &>> &>::type>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple<std::decay<std::enable_if<true, int>::type>::type>>::__t::__id, lambda ->int>::__t &&, As &&...) noexcept [with As=<>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
[ 11 instantiation contexts not shown ]
instantiation of "void stdexec::__start::start_t::operator()(_Op &) const noexcept [with _Op=nvexec::_strm::let_xxx::__operation<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t>]" at line 501 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/common.cuh"
instantiation of class "nvexec::strm::operation_state<CvrefSenderId, InnerReceiverId, OuterReceiverId>::__t [with CvrefSenderId=nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, InnerReceiverId=nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, OuterReceiverId=nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
instantiation of "auto stdexec::__tag_invoke::tag_invoke_t::operator()(_Tag, _Args &&...) const->stdexec::__tag_invoke::tag_invoke_result_t<_Tag, _Args...> [with _Tag=stdexec::__start::start_t, _Args=<nvexec::strm::operation_state<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>::__t &>]" at line 1224 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/execution.hpp"
instantiation of "void stdexec::__start::start_t::operator()(_Op &) const noexcept [with _Op=nvexec::strm::operation_state<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>::__t]" at line 138 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/sync_wait.cuh"
instantiation of "auto nvexec::_strm::sync_wait::sync_wait_t::operator()(nvexec::_strm::context_state_t, Sender &&) const->std::optional<nvexec::_strm::sync_wait::sync_wait_result_t> [with Sender=nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>::__t]" at line 257 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream_context.cuh"

"/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/let_xxx.cuh", line 129: error: no instance of function template "stdexec::__start::start_t::operator()" matches the argument list
argument types are: ()
object type is: const stdexec::__start::start_t
stdexec::start(__op);
^
detected during:
instantiation of "void nvexec::_strm::let_xxx::tag_invoke(_Tag, nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<stdexec::__id<nvexec::_strm::sync_wait::sync_wait_t::receiver_t<std::remove_reference<stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__let::let_value_t, lambda ->any_int_sender>, stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__then::then_t, lambda ->int>, stdexec::__tag_invoke::tag_invoke_result_t<stdexec::__schedule::schedule_t, nvexec::_strm::stream_scheduler &> &>> &>::type>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple<std::decay<std::enable_if<true, int>::type>::type>>::__t &&, _As &&...) noexcept [with _Tag=stdexec::__receivers::set_value_t, _As=<int &>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
instantiation of "auto stdexec::__tag_invoke::tag_invoke_t::operator()(_Tag, _Args &&...) const->stdexec::__tag_invoke::tag_invoke_result_t<_Tag, _Args...> [with _Tag=stdexec::__receivers::set_value_t, _Args=<nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple>::__t, int &>]" at line 365 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/execution.hpp"
instantiation of "void stdexec::__receivers::set_value_t::operator()(_Receiver &&, _As &&...) const noexcept [with _Receiver=nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple>::__t, _As=<int &>]" at line 419 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/common.cuh"
instantiation of "void nvexec::strm::operation_state_base::__t::propagate_completion_signal(Tag, As &&...) noexcept [with OuterReceiverId=nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple>, Tag=stdexec::__receivers::set_value_t, As=<int &>]" at line 76 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/then.cuh"
instantiation of "void nvexec::_strm::then::tag_invoke(stdexec::__receivers::set_value_t, nvexec::_strm::then::receiver_t<4UL, nvexec::_strm::let_xxx::_receiver<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<stdexec::__id<nvexec::_strm::sync_wait::sync_wait_t::receiver_t<std::remove_reference<stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__let::let_value_t, lambda ->any_int_sender>, stdexec::__call_result_t<stdexec::__closure::__binder_back<stdexec::__then::then_t, lambda ->int>, stdexec::__tag_invoke::tag_invoke_result_t<stdexec::__schedule::schedule_t, nvexec::_strm::stream_scheduler &> &>> &>::type>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t, std::tuple<std::decay<std::enable_if<true, int>::type>::type>>::__t::__id, lambda ->int>::__t &&, As &&...) noexcept [with As=<>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
[ 11 instantiation contexts not shown ]
instantiation of "void stdexec::__start::start_t::operator()(_Op &) const noexcept [with _Op=nvexec::_strm::let_xxx::__operation<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, lambda ->any_int_sender, stdexec::__receivers::set_value_t>]" at line 501 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/common.cuh"
instantiation of class "nvexec::strm::operation_state<CvrefSenderId, InnerReceiverId, OuterReceiverId>::__t [with CvrefSenderId=nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, InnerReceiverId=nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, OuterReceiverId=nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>]" at line 106 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/functional.hpp"
instantiation of "auto stdexec::__tag_invoke::tag_invoke_t::operator()(_Tag, _Args &&...) const->stdexec::__tag_invoke::tag_invoke_result_t<_Tag, _Args...> [with _Tag=stdexec::__start::start_t, _Args=<nvexec::strm::operation_state<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>::__t &>]" at line 1224 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/stdexec/execution.hpp"
instantiation of "void stdexec::__start::start_t::operator()(_Op &) const noexcept [with _Op=nvexec::strm::operation_state<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>, nvexec::_strm::propagate_receiver_t<nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>, nvexec::_strm::sync_wait::receiver_t<nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>>>::__t]" at line 138 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream/sync_wait.cuh"
instantiation of "auto nvexec::_strm::sync_wait::sync_wait_t::operator()(nvexec::_strm::context_state_t, Sender &&) const->std::optional<nvexec::_strm::sync_wait::sync_wait_result_t> [with Sender=nvexec::_strm::let_sender_t<nvexec::_strm::then_sender_t<nvexec::strm::stream_scheduler::sender::__t::__id, lambda ->int>, lambda ->any_int_sender, stdexec::_Xstdexec::__receivers::set_value_t::_T>::__t]" at line 257 of "/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/include-stdexec/experimental/nvexec/stream_context.cuh"

4 errors detected in the compilation of "test.cpp".

If I pull the latest stdexec from GitHub instead of using the copy bundled with the compiler, I get a different error:

wwei@nid001345:~/src/test-factorial/build> nvc++ --version

nvc++ 23.5-0 64-bit target on x86-64 Linux -tp zen3 
NVIDIA Compilers and Tools
Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
wwei@nid001345:~/src/test-factorial/build> gcc --version
gcc (GCC) 12.2.0 20220819 (HPE)
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

wwei@nid001345:~/src/test-factorial/build> CXX=nvc++ CC=gcc cmake -DCMAKE_CXX_FLAGS="--experimental-stdpar -stdpar=gpu --gcc-toolchain=/opt/cray/pe/gcc/12.2.0/bin/" -DCMAKE_BUILD_TYPE=Debug ..
-- The C compiler identification is GNU 12.2.0
-- The CXX compiler identification is NVHPC 23.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/cray/pe/gcc/12.2.0/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/bin/nvc++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Downloading CPM.cmake to /global/homes/w/wwei/src/test-factorial/build/cmake/CPM_0.34.0.cmake
-- CPM: adding package stdexec@ (main)
-- System           : Linux-5.14.21-150400.24.46_12.0.73-cray_shasta_c
-- System name      : Linux
-- System ver       : 5.14.21-150400.24.46_12.0.73-cray_shasta_c
-- 
-- Library ver      : 0.8.0
-- Build date       : 2023-08-08
-- Build year       : 2023
-- 
CMake Warning (dev) at build/cmake/CPM_0.35.6.cmake:37 (message):
  CPM: stdexec: A dependency is using a more recent CPM version (0.35.6) than
  the current project (0.34.0).  It is recommended to upgrade CPM to the most
  recent version.  See https://github.com/cpm-cmake/CPM.cmake for more
  information.
Call Stack (most recent call first):
  build/_deps/rapids-cmake-src/rapids-cmake/cpm/detail/download.cmake:85 (include)
  build/_deps/rapids-cmake-src/rapids-cmake/cpm/init.cmake:65 (rapids_cpm_download)
  build/_deps/stdexec-src/CMakeLists.txt:82 (rapids_cpm_init)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- CPM: stdexec: adding package Catch2@2.13.6 (2.13.6)
CMake Warning (dev) at /global/common/software/nersc/pm-2022q4/spack/linux-sles15-zen/cmake-3.24.3-k5msymx/share/cmake-3.24/Modules/FetchContent.cmake:1267 (message):
  The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
  not set.  The policy's OLD behavior will be used.  When using a URL
  download, the timestamps of extracted files should preferably be that of
  the time of extraction, otherwise code that depends on the extracted
  contents might not be rebuilt if the URL changes.  The OLD behavior
  preserves the timestamps from the archive instead, but this is usually not
  what you want.  Update your project to the NEW behavior or specify the
  DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this
  robustness issue.
Call Stack (most recent call first):
  build/cmake/CPM_0.34.0.cmake:780 (FetchContent_Declare)
  build/cmake/CPM_0.34.0.cmake:667 (cpm_declare_fetch)
  build/cmake/CPM_0.34.0.cmake:262 (CPMAddPackage)
  build/_deps/rapids-cmake-src/rapids-cmake/cpm/find.cmake:167 (CPMFindPackage)
  build/_deps/stdexec-src/CMakeLists.txt:88 (rapids_cpm_find)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Configuring done
-- Generating done
-- Build files have been written to: /global/homes/w/wwei/src/test-factorial/build
wwei@nid001345:~/src/test-factorial/build> make VERBOSE=1
/global/common/software/nersc/pm-2022q4/spack/linux-sles15-zen/cmake-3.24.3-k5msymx/bin/cmake -S/global/homes/w/wwei/src/test-factorial -B/global/homes/w/wwei/src/test-factorial/build --check-build-system CMakeFiles/Makefile.cmake 0
/global/common/software/nersc/pm-2022q4/spack/linux-sles15-zen/cmake-3.24.3-k5msymx/bin/cmake -E cmake_progress_start /global/homes/w/wwei/src/test-factorial/build/CMakeFiles /global/homes/w/wwei/src/test-factorial/build//CMakeFiles/progress.marks
make  -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/global/u2/w/wwei/src/test-factorial/build'
make  -f CMakeFiles/factorial.dir/build.make CMakeFiles/factorial.dir/depend
make[2]: Entering directory '/global/u2/w/wwei/src/test-factorial/build'
cd /global/homes/w/wwei/src/test-factorial/build && /global/common/software/nersc/pm-2022q4/spack/linux-sles15-zen/cmake-3.24.3-k5msymx/bin/cmake -E cmake_depends "Unix Makefiles" /global/homes/w/wwei/src/test-factorial /global/homes/w/wwei/src/test-factorial /global/homes/w/wwei/src/test-factorial/build /global/homes/w/wwei/src/test-factorial/build /global/homes/w/wwei/src/test-factorial/build/CMakeFiles/factorial.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/global/u2/w/wwei/src/test-factorial/build'
make  -f CMakeFiles/factorial.dir/build.make CMakeFiles/factorial.dir/build
make[2]: Entering directory '/global/u2/w/wwei/src/test-factorial/build'
[ 50%] Building CXX object CMakeFiles/factorial.dir/factorial.cpp.o
/pscratch/sd/w/wwei/nvhpc_23_5/Linux_x86_64/23.5/compilers/bin/nvc++  -I/global/homes/w/wwei/src/test-factorial/build/_deps/stdexec-src/include --experimental-stdpar -stdpar=gpu --gcc-toolchain=/opt/cray/pe/gcc/12.2.0/bin/ --experimental-stdpar -stdpar --gcc-toolchain=/opt/cray/pe/gcc/12.2.0/bin/ -pthread -g -O0 -std=gnu++20 -MD -MT CMakeFiles/factorial.dir/factorial.cpp.o -MF CMakeFiles/factorial.dir/factorial.cpp.o.d -o CMakeFiles/factorial.dir/factorial.cpp.o -c /global/homes/w/wwei/src/test-factorial/factorial.cpp
"/global/homes/w/wwei/src/test-factorial/build/_deps/stdexec-src/include/exec/any_sender_of.hpp", line 456: error: global or namespace scope variables such as "exec::__any::__null_storage_vtbl [with _ParentVTable=exec::__any::__sender<stdexec::completion_signatures<stdexec::__receivers::set_value_t (int), stdexec::__receivers::set_stopped_t (), stdexec::__receivers::set_error_t (std::__exception_ptr::exception_ptr *)>, stdexec::__types<>, stdexec::__types<>>::__vtable, _StorageCPOs=<exec::__any::__delete_t (void (*)() noexcept), exec::__any::__move_construct_t (void (*)(exec::__any::__storage<exec::__any::__sender<stdexec::completion_signatures<stdexec::__receivers::set_value_t (int), stdexec::__receivers::set_stopped_t (), stdexec::__receivers::set_error_t (std::__exception_ptr::exception_ptr *)>, stdexec::__types<>, stdexec::__types<>>::__vtable, std::allocator<std::byte>, false, 16UL, 24UL>::__t &&) noexcept)>]" (declared at line 222) cannot be accessed from device code
            function "exec::__any::__storage<_Vtable, _Allocator, _Copyable, _Alignment, _InlineSize>::__t::__reset [with _Vtable=exec::__any::__sender<stdexec::completion_signatures<stdexec::__receivers::set_value_t (int), stdexec::__receivers::set_stopped_t (), stdexec::__receivers::set_error_t (std::__exception_ptr::exception_ptr *)>, stdexec::__types<>, stdexec::__types<>>::__vtable, _Allocator=std::allocator<std::byte>, _Copyable=false, _Alignment=16UL, _InlineSize=24UL]" is implicitly a device function because it is called from device function "exec::__any::__storage<_Vtable, _Allocator, _Copyable, _Alignment, _InlineSize>::__t::~__t [with _Vtable=exec::__any::__sender<stdexec::completion_signatures<stdexec::__receivers::set_value_t (int), stdexec::__receivers::set_stopped_t (), stdexec::__receivers::set_error_t (std::__exception_ptr::exception_ptr *)>, stdexec::__types<>, stdexec::__types<>>::__vtable, _Allocator=std::allocator<std::byte>, _Copyable=false, _Alignment=16UL, _InlineSize=24UL]" (declared at line 449)
          __vtable_ = __default_storage_vtable((__vtable_t*) nullptr);
                      ^

1 error detected in the compilation of "/global/homes/w/wwei/src/test-factorial/factorial.cpp".
make[2]: *** [CMakeFiles/factorial.dir/build.make:76: CMakeFiles/factorial.dir/factorial.cpp.o] Error 2
make[2]: Leaving directory '/global/u2/w/wwei/src/test-factorial/build'
make[1]: *** [CMakeFiles/Makefile2:135: CMakeFiles/factorial.dir/all] Error 2
make[1]: Leaving directory '/global/u2/w/wwei/src/test-factorial/build'
make: *** [Makefile:156: all] Error 2
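
To make the second error easier to read, here is a minimal host-only sketch of the pattern the diagnostic appears to describe: a type-erased storage whose `reset()` (called from its destructor) re-points itself at a namespace-scope "null" vtable variable. All names below (`vtable`, `null_vtable`, `storage`, `demo_tag`) are hypothetical stand-ins and not stdexec's actual implementation. The assumption is only that `__null_storage_vtbl` plays a similar role in `any_sender_of.hpp`.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; NOT stdexec's real types.
struct vtable {
    void (*destroy)(void*) noexcept;
};

inline void noop_destroy(void*) noexcept {}

// A namespace-scope variable template: one "null" vtable object per tag type.
// In host-only C++ this is fine. The nvc++ message above complains when a
// function that gets compiled for the device refers to a variable like this.
template <class Tag>
inline constexpr vtable null_vtable{&noop_destroy};

struct demo_tag {};

template <class Tag>
class storage {
    const vtable* vtbl_ = &null_vtable<Tag>;
    void* object_ = nullptr;

public:
    storage() = default;
    storage(void* obj, const vtable* vt) noexcept : vtbl_(vt), object_(obj) {}
    storage(const storage&) = delete;
    storage& operator=(const storage&) = delete;

    // Destroys the held object (if any) and falls back to the null vtable.
    void reset() noexcept {
        vtbl_->destroy(object_);
        object_ = nullptr;
        vtbl_ = &null_vtable<Tag>;  // reference to the namespace-scope variable
    }

    bool empty() const noexcept { return vtbl_ == &null_vtable<Tag>; }

    // The destructor calls reset(), so wherever the destructor is needed,
    // reset() and its reference to null_vtable are needed too.
    ~storage() { reset(); }
};
```

If the destructor of such a type is pulled into device code (here, presumably because the `any_int_sender` is produced inside the `let_value` callback that runs on the stream scheduler), `reset()` becomes a device function as well, and the host variable it touches is no longer reachable.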