[stdlib_linalg] Issue with `maxval(abs(B-eye(k))` where `B` is a complex matrix.

Question

[stdlib_linalg] Issue with `maxval(abs(B-eye(k))` where `B` is a complex matrix.

loiseaujc opened this issue 10 months ago · 6 comments

Description

I have a surprising issue when combining fortran intrinsic functions maxval and abs with some functions from stdlib.
Consider the code below.

program main
  ! > Import stdlib_linalg stuff.
  use stdlib_linalg_constants, only: sp, ilp
  use stdlib_linalg, only: eye, qr
  implicit none

  ! > Local variables.
  integer(ilp), parameter :: k = 256
  complex(sp) :: A(1:k, 1:k), B(1:k, 1:k), C(1:k, 1:k)
  complex(sp) :: Q(1:k, 1:k), R(1:k, 1:k)
  real(sp) :: alpha, beta
  integer(ilp) :: i, j

  ! > Initialize A.
  A = 0.0_sp
  do j = 1, k
    do i = 1, k
      call random_number(alpha); call random_number(beta)
      A(i, j) = cmplx(alpha, beta, kind=sp)
    enddo
  enddo

  ! > Perform QR decomposition.
  call qr(A, Q, R)

  ! > Gram matrix.
  B = matmul( transpose(conjg(Q)), Q)

  ! > Difference matrix.
  C = B - eye(k)

  write(*, *) "maxval(abs(B - eye(k))) = ", maxval(abs(B-eye(k)))
  write(*, *) "maxval(abs(C)) = ", maxval(abs(C))

end program main

This is an extremely simplified version of what we're doing in a library we're working on where the call to qr followed by B = matmul(transpose(conjg(Q)), Q) basically emulates a complicated function that should return an orthonormal set of vectors. In our unit tests, we then simply use maxval(abs(B - eye(k))) which should be of the order of the machine precision to ensure that B is indeed the Gram matrix of an orthonormal set of vectors.

When compiling this code with fpm run --compiler "gfortran", here is the output

 maxval(abs(B - eye(k))) =    1.0000000000000000     
 maxval(abs(C)) =    1.31130253E-06

While the second print statement is correct, the first one is not.
For comparison, if I compile with fpm run --compiler "ifx", I get the expected output

 maxval(abs(B - eye(k))) =  9.536743164062500E-007
 maxval(abs(C)) =  9.5367432E-07

I'm running with gfortran 13.3 btw. To me, the error comes from gfortran (possible type conversion) rather than stdlib but I just wanted to raise the issue here to see if anyone can reproduce before reporting this possible gfortran bug.

Expected Behaviour

The two statements maxval(abs(B - eye(k))) and maxval(abs(C)) should return the same value.

Version of stdlib

latest

Platform and Architecture

Ubuntu 22.04, Linux ,gfortran 13.3

Additional Information

No response

Answer 1 · 2025-01-20T13:50:39.000Z

@perazz, @jalvesz, @jvdp1 : any ideas ?

Answer 2 · 2025-01-20T17:01:21.000Z

@loiseaujc I copy-pasted your mwe into Compiler Explorer and changed the gfortran versions, I saw that this error happens with version 13.1, not with 13.2, comes back with gfortran 14.1. I suspect this to be related with temporary arrays being created in the first case. I would say this is a compiler bug and not an issue with stdlib per-se. Would have to check with ifort/ifx to cross check.

Answer 3 · 2025-01-20T17:14:12.000Z

It would seem like a bug with gfortran, because printing B directly returns the correct result at the max location, but when you print maxval, it returns a wrong value: compiler explorer.

A slightly deeper look highlights that if the single precision version of eye is used, the error goes away:

maxval(abs(B-eye(k,mold=1.0)))

so it looks like gfortran is mismatching something in the conversion (default eye is real(real64) since #902, was previously integer(int8))

Answer 4 · 2025-01-21T07:41:08.000Z

All our tests failed since shortly before Christmas despite not changing a single line in our CI. I had not seen #902, but I suppose this is the culprit. I'll fix our tests with the mold argument for the moment and will try to find some time over the next couple of days to have report the issue to gfortran.

Thanks.

Answer 5 · 2025-01-21T08:49:21.000Z

It really looks unrelated to stdlib: Compiler explorer. It seems the issue is only present if complex(sp) - real(dp), works with any other precision combinations. I will go ahead and post this example on the gcc bugzilla.

Now, @loiseaujc if you find a safe way (i.e., changing the default return type for eye) that this error goes away, I think it is worth implementing. Most users won't know that gfortran has an issue, and may be in trouble.

Answer 6 · 2025-01-21T11:00:59.000Z

It really looks unrelated to stdlib: Compiler explorer. It seems the issue is only present if complex(sp) - real(dp), works with any other precision combinations. I will go ahead and post this example on the gcc bugzilla.

Yep, came to the same conclusion indeed.

Now, @loiseaujc if you find a safe way (i.e., changing the default return type for eye) that this error goes away, I think it is worth implementing. Most users won't know that gfortran has an issue, and may be in trouble.

At the moment, I simply do B - eye(k, mold=1.0_sp) to force eye to return a single precision matrix. Works like a charm, albeit a tiny bit more verbose. Surprising thing though is that if I do norm2(abs(B - eye(k)), it works like a charm, even if eye(k) return a real(dp) and B is complex(sp). So it really seems that it is the combination maxval / abs / complex(sp) - real(dp).