kokkos/stdBLAS

Infinite recursion on missing implementation

Closed this issue · 6 comments

@mhoemmen @crtrott @fnrizzi

If an algorithm is called with execution policy that has identical mapping (i.e. execpolicy_mapper returns same type as it gets) - like kokkos_exec<ExecSpace> - but implementation (or that particular overload) is missing, the dispatcher enters infinite recursion. It results in SEGFAULT crash in Debug builds or hangs in Release builds (e.g. see the hanging tests cancelled after 25m).

This happens because is_custom_<ALG>_avail checks mistake the dispatcher routine for requested custom implementation (as - with id policy mapping - their function signatures are identical).

One possible solution would be to have is_custom_<ALG>_avail checks detect that the dispatcher routine is detected as the custom function like:

std::is_same_v< std::experimental::linalg::triangular_matrix_left_product,  // the dispatcher
                    /* ADL namespace :: */ triangular_matrix_left_product > // detected custom implementation

For example:

template <class Exec, class A_t, class Tr_t, class DiagSt_t, class C_t>
struct is_custom_triang_mat_left_product_with_update_avail<
  Exec, A_t, Tr_t, DiagSt_t, C_t,
  std::enable_if_t<
    std::is_void_v<
      decltype(triangular_matrix_left_product
        (std::declval<Exec>(),
        std::declval<A_t>(),
        std::declval<Tr_t>(),
        std::declval<DiagSt_t>(),
        std::declval<C_t>()))>

    // check if custom implementation is different than the dispatcher to prevent inf recursion
    && !std::is_same_v<
      decltype(std::experimental::linalg::triangular_matrix_left_product // the dispatcher
        (std::declval<Exec>(),
        std::declval<A_t>(),
        std::declval<Tr_t>(),
        std::declval<DiagSt_t>(),
        std::declval<C_t>())),
      decltype(triangular_matrix_left_product // ADL-detected custom implementation
        (std::declval<Exec>(),
        std::declval<A_t>(),
        std::declval<Tr_t>(),
        std::declval<DiagSt_t>(),
        std::declval<C_t>()))>

    && !linalg::impl::is_inline_exec_v<Exec>>
>: std::true_type{};

@crtrott just to clarify that Mikolaj and I talked about this, and decided to open the issue but even if you find it useful, we will not address it right away because we first want to complete the Kokkos implementations to satisfy the contract. We can follow up on this for sure later

It looks like this was fixed, so I'll close the issue -- please reopen if it still exists. Thanks!

@mhoemmen I don't quite understand the scenario that had infinite recursion yet, but I have a case with identical mapping, and the implementation is available, which works without this fix, and doesn't work with this fix.

@youyu3 Please feel free to reopen if this is still broken; thanks!

@mhoemmen @youyu3 @fnrizzi

I've created PR #249 to revert PR #222.

As pointed out by @srinivasyadav18 on #248 (probably first mentioned by @youyu3 above) the implementation is completely incorrect, because that decltype formulation proposed above puts function's return type into comparison instead of its full signature. And even that wouldn't work, since compared functions do have same the same type/signature - they just lay in different namespaces.

I can't see any good implementation of the concept based on comparing functions yet, because even if there's a good way to compare functors, TPL implementations can have arbitrary template parameters which don't seem to be derivable from function parameter types available in is_custom_<ALG>_avail...