stan-dev/math

stan::math and std::complex

WardBrian opened this issue · 7 comments

Description

Currently, Stan's complex number support is entirely built on std::complex, including autodiff, which
uses std::complex<stan::math::var>.

This is, unfortunately, unspecified behavior in the C++ spec [26.4.2]:

The effect of instantiating the template complex for any type other than float, double, or long double is unspecified. The specializations complex<float>, complex<double>, and complex<long double> are literal types

For a reminder on what "unspecified behavior" means:

unspecified behavior - the behavior of the program varies between implementations, and the conforming implementation is not required to document the effects of each behavior. Each unspecified behavior results in one of a set of valid results.

Essentially, unspecified behavior is the same as "implementation-defined behavior" but without the requirement that implementations document what they are doing. This is also often taken to mean there are no backwards compatibility guarantees on any specific unspecified behavior.

This creates both a maintenance burden (each new libstdc++/libc++ release can create arbitrary amounts of work for our developers) and a stability hazard (the idea that "Stan X.Y will continue to work a year from now, without needing to update to Stan X.Z" is false as things stand today)

Problems

Recent versions of clang/libstdc++ have made changes which they are fully within their rights to do by the spec, but have broken Stan builds.

  • In libstdc++16, they changed the definition of log(complex) from complex<T>(log(abs(x)), arg(x)); to complex<T>(std::log(std::abs(x)), std::arg(x));. This broke argument dependent lookup for this function.
    A similar change broke operator* for our complex types.
    This lead to stan-dev/cmdstan#1158, which was the reason we needed a 2.32.1 release.
    @andrjohns provided the fix in #2892
  • In libstdc++17, a similar change was made to fabs, which necessitated to #2991
  • In libstdc++19, the internal structure of pow was rewritten such that several overloads lead to a static assert failing if the type passed was not arithmetic: #3106

What to do

This is less clear to me.

Option 1 - walk on egg shells

So far, all of the issues that have arisen from this have been due to argument dependent lookup breaking for these types. We can fix that by being much more explicit, as we did in #2892 and #2991. This requires auditing the existing usages, which probably requires a fair amount of C++ expertise to understand how the calls are being resolved.

Option 2 - our own type

We could rather trivially define our own stan::math::complex<T> type. We could make it assignable from std::complex<double>, and I think be off to the races? I believe the complex linear algebra we use in Eigen all support a template argument for the complex type, rather than assuming std::complex.
This would require a fair amount of boilerplate to actually do any math on it, and in the case of double we may lose out on some of the optimizations that having the type built in to the language grants, but we'd own it.

syclik commented

@WardBrian, thanks for writing this down! I think relying on undefined behavior according to the C++ specification leads directly to this.

Thank you for laying out the two options. They seem like the reasonable set of options; I can't think of another option to consider.


Tradeoffs Between Option 1 and 2

In general, I think Option 1 is better than Option 2 only under these conditions:

  • the footprint of the undefined behavior is small (in the sense that we can support it as C++ compilers and libraries change their stance on the undefined behavior)
  • how we think it should work is consistent
  • we can set up tests for the undefined behavior that will trigger when behavior we're relying on changes
  • we are able to selectively override behavior to what we need

I think Option 2 is better than Option 1 here, even if we have to write tests for almost all the behavior we rely on.

I prefer option 2, since it eventually gets us to an island of stability. The downside is it kind of needs to be done monolithically/all at once, which is a lot of work

Option 1 seems like we could eventually reach some kind of stable state where we're not using any ADL at all (we're testing against new clang/gcc versions to catch breaks early), but then the compilers could design to break something else too.

Latest issue that arose from this is #3106

In the issue we opened against LLVM, maintainer said: llvm/llvm-project#109858 (comment)

Not sure whether the "fix" would be preferred because the standard says that instantiating std::complex<NonFloatingPoint> has unspecified effects...

It appears in this instance they're willing to entertain the idea of changing their signatures to un-break our use case, but this makes me even more convinced it is worth us defining our own complex struct.

With the upcoming major version bump, we could define a stan::math::complex as a alias for std::complex and update our code such that you have to use this, and then later change it to a bespoke struct without that being an (API) break, I think? It would still be an ABI change, but I'm not sure stan-math has been ABI stable for a while

@SteveBronder?

Other autodiff libraries seem to have hit this themselves:

https://20k.github.io/c++/2024/05/18/forward-backward-differentiation.html#sidebar-stdcomplex-is-completely-unusable-here

This makes the interesting note that var<std::complex<double>> would be totally fine by the standard. I think this is a viable option, and it seems to be what e.g. the autodiff library recommends:

https://autodiff.github.io/tutorials/#derivatives-of-a-single-variable-function-using-a-custom-scalar-complex

I'd like us to look into var_value<complex>

I'd like us to look into var_value<complex>

Can we write a complex_for_t<T> template that expands to stan::math::var_value<complex<double>> if T is var and std::complex<T> otherwise? I think that would make the compiler have an easier time