linalg.pinv() instead of linalg.inv()?
eatdust opened this issue · 3 comments
Hello there,
I am usually using getCorrelatedVariable2DPlots() to obtain a list of the most correlated parameters to be plotted. Sometimes, parameters are fully correlated (especially derived ones), and I am accordingly warned:
WARNING:root:Parameters are 100% correlated: eps2, eps3
a 2D plot on these parameters would be a line, but crashes instead due to the inversion of a singular matrix.
Traceback (most recent call last):
File "...getdist/mcsamples.py", line 1720, in get2DDensityGridData
Cinv = np.linalg.inv(np.array([[ry ** 2, rx * ry * corr], [rx * ry * corr, rx ** 2]]))
File "<__array_function__ internals>", line 5, in inv
File "/usr/lib64/python3.8/site-packages/numpy/linalg/linalg.py", line 546, in inv
ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
File "/usr/lib64/python3.8/site-packages/numpy/linalg/linalg.py", line 88, in _raise_linalgerror_singular
raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix
I have tried to replace all calls to inv() by the Moore-Penrose inverse, pinv(), and this actually works fine. In spite of the warning, the 2D plot does not crash and it displays the expected 100% correlated line. Would it make sense to default to pinv?
https://numpy.org/doc/stable/reference/generated/numpy.linalg.pinv.html?highlight=pinv
This doesn't usually actually crash for me, I wonder if they changed something.
I think rather than using pinv, could just threshold corr here to have an absolution value bounded by max_corr_2D, as it does when estimating the bin density? [Of course you never actually get a line out of these smoothed KDE estimates anyway]
yes, that works too. I'll push a merge request for that and the other issue I am going to open soon.
Damned, I was still getting a crash, very rarely though, even after the fix applied. The reason being that, sometimes, round-off errors give a correlation value that can be 1 + epsilon :(
print(corr)
1.0000000000000002
This condition is actually too strong:
if corr == 1:
logging.warning('Parameters are 100%% correlated: %s, %s', parx.name, pary.name)
corr = self.max_corr_2D
I propose instead something like
if abs(corr - 1.0) <= 1e-6
What do you think?