kokkos/stdBLAS

Namespace issue for template specialization with nvc++

KeithBallard opened this issue · 2 comments

Greetings,

I came across a small bug, which may very well be in the compiler. I am evaluating nvc++ with std::linalg + std::mdspan + std::par, and I have found that nvc++ rejects a template that is first declared in an anonymous namespace and later specialized outside that anonymous namespace.

Details:

  • File: blas3_matrix_product.hpp
  • is_custom_matrix_product_avail is declared in an anonymous namespace (std::experimental::_p1673_version_0::linalg::)
  • It is specialized later in the file outside the anonymous namespace (std::experimental::_p1673_version_0::linalg)
  • I have tested that the error occurs with nvc++ 23.5-0
  • The error does not occur with g++/gcc 11.3

Possible resolutions:

  1. Remove the anonymous namespace altogether
  2. Move the specialization inside the anonymous namespace
  3. Declare the anonymous namespace within an ifdef guard that detects nvc++
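For illustration, here is a minimal sketch of what options 1 and 2 look like. It does not reproduce the actual code in blas3_matrix_product.hpp: the namespace demo, the tag types default_exec and custom_exec, and the trait names is_custom_product_avail and is_custom_sum_avail are all hypothetical stand-ins. In both arrangements the primary template and its specialization sit in the same namespace, so they should not depend on how a given compiler treats specializations written outside an anonymous namespace.

```cpp
#include <type_traits>

namespace demo {  // hypothetical namespace, not the real std::experimental one

// Hypothetical execution-policy tags used only for this sketch.
struct default_exec {};
struct custom_exec {};

// Option 1: no anonymous namespace at all. The primary template and its
// specialization are both declared directly in namespace demo.
template <class Exec, class = void>
struct is_custom_product_avail : std::false_type {};

template <>
struct is_custom_product_avail<custom_exec> : std::true_type {};

// Option 2: keep the anonymous namespace, but place the specialization
// inside it, next to the primary template.
namespace {

template <class Exec, class = void>
struct is_custom_sum_avail : std::false_type {};

template <>
struct is_custom_sum_avail<custom_exec> : std::true_type {};

}  // anonymous namespace

}  // namespace demo

// Compile-time checks: the specialization is picked up for custom_exec,
// and the primary template is used for everything else.
static_assert(demo::is_custom_product_avail<demo::custom_exec>::value,
              "option 1: specialization selected");
static_assert(!demo::is_custom_product_avail<demo::default_exec>::value,
              "option 1: primary template selected");
static_assert(demo::is_custom_sum_avail<demo::custom_exec>::value,
              "option 2: specialization selected");
static_assert(!demo::is_custom_sum_avail<demo::default_exec>::value,
              "option 2: primary template selected");
```

One trade-off between the two: with option 1 the trait gets ordinary external linkage, while option 2 keeps it local to each translation unit that includes the header.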

I opted for option 1 for now, for which I will submit a pull request. However, the developers may have reasons for the anonymous namespace that I do not readily perceive.

PS: fantastic project! I love the proposal and am very grateful to the contributors for working to make this a reality. As a C++ developer of simulation software, I find that this project greatly improves expressiveness without giving up much performance (if any, in many cases). I truly hope the standards committee accepts it sooner rather than later.

@KeithBallard Greetings and thanks for your interest, and for testing with nvc++! FYI, recent versions of NVIDIA's HPC SDK have a cuBLAS-accelerated std::linalg implementation that should work with nvc++. If you have a chance to try it out, we would welcome your feedback!

> However, the developers may have reasons for the anonymous namespace that I do not readily perceive.

It's not actually an anonymous namespace; it's an inline namespace. Correction: I actually read your PR :-) Thanks for submitting this! I'd like to investigate a bit more to make sure this isn't actually an nvc++ bug, but the workaround should still be useful.

I've merged PR #259, thus closing this issue. Please feel free to reopen or file a new issue if needed. Thanks for your contribution! :-)