RuntimeError: Lapack_SVD(): dgelss failed

Question

Closed this issue 3 years ago · 12 comments

I'm new to Python and have been using pyEDM to carry out CCM and S-Map analyses. The CCM step runs well, but the SMap function raises a runtime error for only some species pairs:
RuntimeError: Lapack_SVD(): dgelss failed
My data do not include NAs.
smap_sp1 = pyEDM.SMap(dataFrame=df, lib="1 100", pred="1 100", columns=sp1, target=sp2, E=int(best_embed_sp1), theta=int(best_theta_sp1), showPlot=False)
My data structure is as follows: the index contains date values, and the sp. columns contain normalized count data.
index       sp.A    sp.B    sp.C    sp.D    sp.E
1991-06-10  -0.04   -0.122  -0.064  -0.118  -0.0242
1991-06-19  -0.04   -0.256  -0.064  -0.110  -0.0121
I really need some help to solve this issue.

Answer 1 · 2022-03-07T01:09:38.000Z

Thank you for reporting this issue.

Please specify the platform and version.

Version can be shown as:

>>> import pyEDM
>>> pyEDM.__version__
'1.10.2.0'

Did you install from the PyPI pyEDM repository?

Answer 2 · 2022-03-07T06:48:03.000Z

Dear Sugihara lab,

Python 3.9.10 (main, Jan 31 2022, 06:20:15) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux

>>> pyEDM.__version__
'1.10.1.0'

I'm running the code on a server, and pyEDM was probably installed from the PyPI pyEDM repository.

Thank you so much for taking the time to resolve this matter.

Answer 3 · 2022-03-07T11:05:29.000Z

This probably means that LAPACK is not installed or is not in the path.

On Ubuntu you can look for LAPACK like so:

dpkg -L liblapack3

On Redhat/CentOS, something like this should work:

sudo yum list installed | grep lapack

This seems a bit strange, as a workable linux environment with numpy or scipy would have blas/lapack.

>>> import numpy.distutils.system_info as sysinfo
>>> sysinfo.get_info('lapack')
{'libraries': ['lapack', 'lapack'], 'library_dirs': ['/usr/lib/x86_64-linux-gnu'], 'language': 'f77'}

> ls /usr/lib/x86_64-linux-gnu/liblapack*
/usr/lib/x86_64-linux-gnu/liblapack.a      /usr/lib/x86_64-linux-gnu/liblapack.so
/usr/lib/x86_64-linux-gnu/liblapack_pic.a  /usr/lib/x86_64-linux-gnu/liblapack.so.3
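
A quick way to confirm from Python that a working LAPACK is reachable is to run a small SVD-based solve through NumPy. This is only a sketch: it exercises the LAPACK that NumPy links against, which may not be the same library pyEDM was built with, and it avoids `numpy.distutils`, which is deprecated in recent NumPy releases. The helper name `lapack_smoke_test` is made up for illustration.

```python
import numpy as np

def lapack_smoke_test():
    """Run a tiny SVD-based least-squares solve through NumPy's LAPACK."""
    # A well-conditioned 3x2 system with a known exact solution x = [1, 2].
    A = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
    b = A @ np.array([1.0, 2.0])
    x, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)
    return x, rank, singular_values

x, rank, s = lapack_smoke_test()
print("solution:", x, "rank:", rank, "singular values:", s)
# np.show_config() prints the BLAS/LAPACK build information for this NumPy.
```

If this runs and recovers the known solution, a failure inside pyEDM is less likely to be a missing-library problem.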

Answer 4 · 2022-03-09T05:34:27.000Z

I tried the same code in a Jupyter notebook, and the same error appears. The SMap code runs well between some species columns but gives the Lapack_SVD(): dgelss failed error only when run between some species. So I doubt it's a problem with the installation. Below is the error when I ran the code in the Jupyter notebook:

RuntimeError                              Traceback (most recent call last)
/var/folders/c4/bf2z4j757b9ccchl21h9hdq80000gn/T/ipykernel_14542/1152999121.py in <module>
     40 theta=int(best_theta_sp1)
     41 '''
---> 42 smap_sp1 = pyEDM.SMap(dataFrame=df, lib="1 100", pred="1 100", columns=sp1, target=sp2, E=int(best_embed_sp1), theta=int(best_theta_sp1), showPlot=False)
     43 coef_sp1 = smap_sp1['coefficients']
     44 interaction_stg_sp1 = coef_sp1[coef_sp1.columns[-1]].values

~/opt/anaconda3/lib/python3.9/site-packages/pyEDM/CoreEDM.py in SMap(pathIn, dataFile, dataFrame, pathOut, predictFile, lib, pred, E, Tp, knn, tau, theta, exclusionRadius, columns, target, smapFile, jacobians, solver, embedded, verbose, const_pred, showPlot, validLib, generateSteps, parameterList)
    217     # D is a Python dict from pybind11 < cppEDM SMap:
    218     # { "predictions" : {}, "coefficients" : {}, ["parameters" : {}] }
--> 219     D = pyBindEDM.SMap( pathIn,
    220                         dataFile,
    221                         DF,

RuntimeError: Lapack_SVD(): dgelss failed.

Answer 5 · 2022-03-09T11:05:54.000Z

Good news: it isn't an installation problem.

Any NaNs in the data? There are checks in the code to detect them, since LAPACK doesn't handle them, but perhaps they are not catching all instances.

It seems the function call dgelss() from LAPACK is returning an error, and that raises an error with the message: Lapack_SVD(): dgelss failed. The returned error code is not reported in the error message. This needs to be fixed.

If the problem is not NaN in the data, the problem may be ill-posed. Here is the error code description from LAPACK; I presume you are hitting the INFO > 0 case.

          INFO is INTEGER
          = 0:  successful exit
          < 0:  if INFO = -i, the i-th argument had an illegal value.
          > 0:  the algorithm for computing the SVD failed to converge;
                if INFO = i, i off-diagonal elements of an intermediate
                bidiagonal form did not converge to zero.

If that is the case, we can check this with an independent solution code.
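
One possible form of such an independent check is to hand a least-squares system to a different SVD-based solver and look at the rank and singular values it reports. The sketch below uses `numpy.linalg.lstsq` on made-up matrices; `check_lstsq` and the example data are illustrative and not part of pyEDM, and the real system would have to be built from the embedding that SMap uses.

```python
import numpy as np

def check_lstsq(A, b):
    """Solve min ||A x - b|| and report diagnostics for ill-conditioning."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    if not (np.isfinite(A).all() and np.isfinite(b).all()):
        raise ValueError("A or b contains NaN or inf")
    x, _, rank, s = np.linalg.lstsq(A, b, rcond=None)
    # Ratio of largest to smallest singular value; inf if the smallest is 0.
    cond = np.inf if s[-1] == 0 else s[0] / s[-1]
    return {"x": x, "rank": int(rank), "n_cols": A.shape[1], "cond": cond}

rng = np.random.default_rng(0)
good = rng.normal(size=(50, 3))
# Duplicate a column so the system is rank deficient by construction.
bad = np.column_stack([good[:, 0], good[:, 0], good[:, 1]])
b = good @ np.array([1.0, -2.0, 0.5])

print(check_lstsq(good, b)["rank"])  # full column rank
print(check_lstsq(bad, b)["rank"])   # fewer than the number of columns
```

A reported rank below `n_cols`, or a very large `cond`, would point to an ill-conditioned system rather than a software fault.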

Answer 6 · 2022-03-14T01:02:49.000Z

Could I get some help with the independent solution? How should I set one up? I've tried the code with separate species pairs. Even though the pyEDM.EmbedDimension and pyEDM.CCM functions work well, pyEDM.SMap gives the error.

Answer 7 · 2022-03-14T18:47:08.000Z

It sounds like the data (embedding and target) are producing an ill-conditioned problem for the SVD.

What value of theta are you using? Setting theta = 0 applies uniform weights instead of exponentially decaying ones.
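
The effect of theta on the weights can be sketched as follows, assuming the usual S-Map weighting w_i = exp(-theta * d_i / d_mean), where d_i is the distance from the prediction point to library point i and d_mean is the mean of those distances. The function name `smap_weights` is hypothetical, and the code is an illustration rather than pyEDM's implementation.

```python
import numpy as np

def smap_weights(distances, theta):
    """Exponential localisation weights; theta = 0 gives all ones."""
    d = np.asarray(distances, dtype=float)
    if theta == 0:
        # Global linear map: every library point counts equally.
        return np.ones_like(d)
    d_mean = d.mean()
    if d_mean == 0:
        # All library points coincide with the prediction point,
        # so d / d_mean is undefined and the weights cannot be localised.
        raise ValueError("mean distance is zero; weights are undefined")
    return np.exp(-theta * d / d_mean)

d = [0.1, 0.5, 1.0, 2.0]
print(smap_weights(d, 0))    # uniform weights
print(smap_weights(d, 3.0))  # nearby points dominate as theta grows
```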

I wonder if using the Ridge regression solver with regularization would help?

Here's an example you should be able to follow in your code.

>>> from pyEDM import *
>>> from sklearn.linear_model import Ridge
>>> solver = Ridge( alpha = 0.5 )
>>> sm = SMap( dataFrame = sampleData['circle'], lib = "1 100", pred = "101 198", embedded = True, E = 2, theta = 3.14, columns = "x y", target = "x", showPlot = True, solver = solver )
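
To see why regularization might help with an ill-conditioned system, the sketch below compares the condition number of the normal-equations matrix A^T A with that of A^T A + alpha I, which is the matrix a ridge solution inverts (ignoring the intercept handling that scikit-learn adds). The data and the function name `ridge_solve` are invented for illustration.

```python
import numpy as np

def ridge_solve(A, b, alpha):
    """Closed-form ridge solution x = (A^T A + alpha I)^-1 A^T b."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    gram = A.T @ A + alpha * np.eye(A.shape[1])
    return np.linalg.solve(gram, A.T @ b)

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
# Second column is almost a copy of the first: nearly collinear predictors.
A = np.column_stack([x1, x1 + 1e-8 * rng.normal(size=100)])
b = x1 + 0.1 * rng.normal(size=100)

cond_plain = np.linalg.cond(A.T @ A)
cond_ridge = np.linalg.cond(A.T @ A + 0.5 * np.eye(2))
print(cond_plain)                    # very large
print(cond_ridge)                    # modest
print(ridge_solve(A, b, alpha=0.5))  # finite coefficients
```

Regularization only bounds the conditioning of the linear solve; it cannot add information that the data do not contain.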

Answer 8 · 2022-03-22T01:12:27.000Z

Sorry for taking a long time to respond.
We have been using the pyEDM.PredictNonlinear function and took the theta value at maximum rho. theta is not equal to 0.
I also tried the Ridge regression solver, but it did not resolve the error.
Later I removed the theta parameter from the pyEDM.SMap call, and then the code ran for all species, although the output graphs were quite different from those with theta optimized. I wonder what kind of theta values could give rise to such an error in Lapack_SVD(): dgelss.

Thank you, as always, for your assistance.

Answer 9 · 2022-03-23T06:01:50.000Z

Thanks for the feedback.

Using PredictNonlinear() with no theta argument will use a default set of theta values:

ThetaValues( { 0.01, 0.1, 0.3, 0.5, 0.75, 1, 1.5, 2, 3, 4, 5, 6, 7, 8, 9 } )

calling SMap for each theta. You suggest this works for the data in question.

Calling SMap with no theta argument defaults to theta = 0, the global linear map. You suggest this also works for the data in question.

This seems to suggest that something else, not the value of theta, could be the problem?

Perhaps we should look at your function call parameters? I'm willing to look at the data as time allows.
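
One way to narrow this down is to call the solver once per theta and record which values raise. The sketch below is generic: `scan_thetas` and `fake_fit` are hypothetical names, and `fake_fit` merely stands in for a call such as `pyEDM.SMap(..., theta=theta)` so that the example runs without the data in question.

```python
def scan_thetas(fit, thetas):
    """Call fit(theta) for each theta; collect results and error messages."""
    ok, failed = {}, {}
    for theta in thetas:
        try:
            ok[theta] = fit(theta)
        except RuntimeError as err:
            failed[theta] = str(err)
    return ok, failed

def fake_fit(theta):
    # Stand-in for pyEDM.SMap(dataFrame=df, ..., theta=theta);
    # it fails for theta > 0 purely to illustrate the bookkeeping.
    if theta > 0:
        raise RuntimeError("Lapack_SVD(): dgelss failed.")
    return {"theta": theta}

ok, failed = scan_thetas(fake_fit, [0, 0.01, 0.1, 1, 2])
print("succeeded:", sorted(ok))
print("failed:", sorted(failed))
```

Replacing `fake_fit` with a small wrapper around the real call, using the same lib, pred, E, columns and target as the failing run, would show whether the failure depends on theta at all.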

Answer 10 · 2022-03-24T02:01:55.000Z

I'm truly grateful for your help and feedback. Is there a way I could send you the pipeline code I am using and a sample data set that causes the error, other than through this public forum?

Answer 11 · 2022-03-24T16:33:56.000Z

Thanks for the data and code. Almost certainly this is an ill-posed problem for the LAPACK dgelss solver.

>>> from pyEDM import *
>>> from pandas import read_csv
>>> df = read_csv('TestRun.csv')
>>> df.iloc[:, 1:4].quantile( [0.1, 0.25, 0.5, 0.75, 0.9], 'rows' )
        Sp_001    Sp_002
0.10 -0.048507 -0.122024
0.25 -0.048507 -0.122024
0.50 -0.048507 -0.122024
0.75 -0.048507 -0.122024
0.90 -0.048507 -0.122024

You are trying to solve a static problem with constant values. This works with theta = 0 since it is a multiple linear regression with unity weights.

I don't think SMap is the right tool to use on data with 1600 out of 1700 values as constant.

If you really insist, the computational problem is that the library is full of constant, equidistant points since you only use the first 100 points. Indeed:

>>> SMap( dataFrame = df.iloc[:, 1:4], columns = 'Sp_001', target = 'Sp_002', E = 5, 
          theta = 2.2, lib = '1 100', pred = '101 105' )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpark/.local/lib/python3.8/site-packages/pyEDM/CoreEDM.py", line 219, in SMap
    D = pyBindEDM.SMap( pathIn,
RuntimeError: Lapack_SVD(): dgelss failed.
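
The degeneracy can be reproduced without pyEDM. The sketch below builds a time-delay embedding from a constant series (an assumption standing in for the first 100 rows of the data, using the constant value from the quantile table above) and shows that every embedded point is identical, so all distances are zero and the regression matrix with an intercept column has rank 1. The function name `delay_embed` is illustrative and is not a pyEDM call.

```python
import numpy as np

def delay_embed(x, E, tau=1):
    """Rows are [x(t), x(t - tau), ..., x(t - (E - 1) * tau)]."""
    x = np.asarray(x, dtype=float)
    start = (E - 1) * tau
    cols = [x[start - j * tau : len(x) - j * tau] for j in range(E)]
    return np.column_stack(cols)

series = np.full(100, -0.048507)   # constant, like the library rows
emb = delay_embed(series, E=5)     # 96 rows, 5 columns
design = np.column_stack([np.ones(len(emb)), emb])

# Distance from the first embedded point to every other embedded point.
dists = np.linalg.norm(emb - emb[0], axis=1)
print(emb.shape)
print(dists.max())                    # all distances are zero
print(np.linalg.matrix_rank(design))  # far fewer than the 6 columns
```

With zero distances the localisation weights carry no information, and a rank-1 design matrix cannot determine six coefficients, which is consistent with the solver failing for theta > 0 on this library.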

With a library that covers all the data, there is enough variance to prevent singular explosion, but I doubt the results are meaningful:

>>> SMap( dataFrame = df.iloc[:, 1:4], columns = 'Sp_001', target = 'Sp_002', E = 5, 
          theta = 2.2, lib = '1 1700', pred = '101 105' )
{'predictions':         index  Observations  Predictions  Pred_Variance
0  1993/07/19     -0.122024          NaN            NaN
1  1993/07/27     -0.122024     0.002114       1.016464
2  1993/07/28     -0.122024     0.002114       1.016464
3  1993/08/09     -0.122024     0.002114       1.016464
4  1993/08/18     -0.122024     0.002114       1.016464
5  1993/08/30     -0.122024     0.002114       1.016464, 

'coefficients':         index        C0  ...  ∂Sp_001(t-3)/∂Sp_002  ∂Sp_001(t-4)/∂Sp_002
0  1993/07/19       NaN  ...                   NaN                   NaN
1  1993/07/27  0.002089  ...             -0.000101             -0.000101
2  1993/07/28  0.002089  ...             -0.000101             -0.000101
3  1993/08/09  0.002089  ...             -0.000101             -0.000101
4  1993/08/18  0.002089  ...             -0.000101             -0.000101
5  1993/08/30  0.002089  ...             -0.000101             -0.000101

[6 rows x 7 columns]}

Answer 12 · 2022-10-11T09:17:06.000Z

There are no NaNs in the data. However, the data set contains species counts that were normalized to zero mean and unit SD. Hence the data have both positive and negative values, but no zeros.
