DrTimothyAldenDavis/SuiteSparse

cholmod_solve2() with Bset reports a too-sparse Xset and thus an incomplete solution (edit: this is not a bug, but feature of Xset / Bset)

Opened this issue · 10 comments

Hi. This is a clearer discussion of #891.

I'm solving a very small linear system JtJ x = b where b = [5,0,0]. I do this twice with a cholmod_solve2(CHOLMOD_A) call: once without Bset (telling CHOLMOD to use the whole dense b vector) and again with Bset (telling CHOLMOD that the first element of b is the only non-zero element).

Without Bset this works properly: in the attached sample this produces a vector x with all non-zero entries. But with Bset the reported vector x is only correct in the first two entries; the reported Xset vector says that x is sparse with the 3rd entry 0. But this is wrong: all entries should be non-zero.

Now the reproducer. Here's a tiny Python program to densely solve this problem as a reference:

#!/usr/bin/python3

import sys
import numpy as np
import numpysane as nps

J = np.array((
    (1,2,0),
    (0,3,7),
    (5,9,0),
    (1,1,0),
    ),
    dtype=float)
bt = np.array((5,0,0),
              dtype=float)

JtJ = nps.matmult(J.T,J)
print(np.linalg.solve(JtJ,bt))

This reports the correct solution:

[ 23.88888889 -13.33333333   5.71428571]

Here's a C program solving the same problem with cholmod_solve2(): cholmod-solve2-with-bset.c.txt

I build it like this:

gcc -I/usr/include/suitesparse cholmod-solve2-with-bset.c -o cholmod-solve2-with-bset -lcholmod

I run it without arguments to solve with a dense b (no Bset):

dima@shorty:/tmp$ ./cholmod-solve2-with-bset
No argument given: solving WITHOUT Bset

CHOLMOD dense:   X:  3-by-1, 
  leading dimension 3, nzmax 3, real, double
  col 0:
         0: 23.889 
         1: -13.333 
         2: 5.7143 
  OK

Note that it computed the correct answer. And I run it again with an argument, to solve with a sparse b:

dima@shorty:/tmp$ ./cholmod-solve2-with-bset xxx
Argument given: solving WITH Bset

CHOLMOD dense:   X:  3-by-1, 
  leading dimension 3, nzmax 3, real, double
  col 0:
         0: 23.889 
         1: -13.333 
         2: 3.9525e-323 
  OK


CHOLMOD sparse:  Xset:  3-by-1, nz 2, up/lo.
  nzmax 3, unsorted, packed, 
  scalar types: int32_t, pattern, double
  col 0: nz 2 start 0 end 2:
         0:
         1:
  nnz on diagonal: 1
  OK

Note that the last value of 5.7143 is missing.

This is on Debian with libcholmod5=1:7.8.3+dfsg-2

Thanks.

This is not a bug, but a design feature in how Bset and Xset work. With Bset and Xset, CHOLMOD is giving you a requested subset of the solution, not the whole solution. That's the purpose of Xset and Bset.

Xset is not saying "the solution is sparse; here is the pattern of X". It is saying "here are the entries of X that I computed; the other entries outside of Xset have not been computed, even if they are present".

The user guide states:

The solution {\tt X}
(a dense column vector) is modified on output, but is defined only in the rows
defined by the sparse vector {\tt Xset}.

and also

The entries in
{\tt Bset} are a subset of {\tt Xset}

In general, for most linear systems, X is a full vector. The goal of the Bset / Xset method is that a subset of X can be computed quickly, without computing all of X. With the Xset output, I tell you which components of X I have computed.

Xset is determined via a traversal of the elimination tree, starting from nodes in Bset. Xset always includes Bset, so if you want X(3) in this case, you need to set Bset(3).

In otherwords, if you want a subset of X, put the list of entries in X you want to compute into the set Bset. Then Xset will be guaranteed to include all those entries.

Xset is larger than Bset because to compute all entries requested (all X(i) for i in Bset), I have to compute some entries in X outside of Bset. The entries I compute in X are all X(i) for all i in Xset.

I can try to clarify the User Guide to make this more clear.

Hi. Thank you very much for clarifying. So then there's no obvious way to solve JtJ x = b for a very sparse b, right? Any suggestions for quickly solving many bt inv(JtJ) b problems for many different sparse b? I can get a 2x speed improvement by computing norm2( inv(L b) ) (details about P and D omitted), but it's still fairly slow if I can't assume a sparse b.

Thanks.

If the graph of JtJ consists of a single connected component, then x = JtJ \ b is dense, whatever the sparsity pattern of b. Do you know if that is the case for your JtJ matrix? If so, then there's nothing you can do; x will be dense. The only thing possible is to not solve for all of x (if that's useful), which is what the goal of the Xset / Bset process is.

I'll leave this issue open, by the way, until I revise the user guide.

Hi. In my case x is definitely dense. But I was hoping I could still get a much faster solve because

  • b is very sparse
  • I don't need inv(JtJ) b but rather bt inv(JtJ) b

Thank you for updating the docs. This will be helpful for people in the future.

If x = JtJ \ b, then you need y = bt * x, where bt is a sparse row vector, correct? So if Bset is the sparsity of bt, don't you only need x(i) for all i in Bset?

If so, then cholmod_solve2 will compute that. It will compute more of x, as it needs to (and returns that set in Xset), but it will compute all x(i) for all i in Bset.

Yes... That is absolutely correct, and should work ok. Thanks for the suggestion! I think once I discovered this "bug", I lost faith that I understood what cholmod was doing, and stopped investigating. Thanks you very much!

Awesome!

This use case is exactly what I put the Xset / Bset method in place for.

I think this has to be the best "bug" I've seen .. the fix is "This is not a bug, but a specific feature you want to exploit to greatly speed up your code."

Hi. I worked on this a bit just now, and have another related question if that's ok. Before using Bset, I could make my computation much faster by solving L x = b instead of JtJ x = b: xt inv(JtJ) x = xt inv(Lt L) x = xt inv(L) inv(Lt) x = norm2( inv(Lt) x ). So instead of solving JtJ x = b, which would need to solve two different L x = b problems, I could solve one L x = b problem, and compute the norm of the result. In my use cases this made my things 2x faster. I'm omitting uninteresting details about P and D here.

So now I'm trying to use Bset with cholmod_solve2(CHOLMOD_L, ...) now, and it doesn't work at all: the solution to L x = b I'm getting is all zero. I might be doing something wrong, but before debugging deeper, I'd like to ask you: can this work at all? Would you expect it to?

Thanks.

So to answer my own question, this is possible, but

  • cholmod_solve2() would need to return Yset to indicate the sparsity pattern of L \ b. Currently Yset is computed inside cholmod_solve2(), but not returned
  • I would ideally want a new cholmod_solve2() system (let's call it sys=CHOLMOD_LP) to apply the permutation, solve L \ b, return Yset, and not necessarily apply the reverse permutation (if I'm computing norm2(L \ b), the reverse permutation isn't needed.
  • xt inv(JtJ) x = norm2(inv(Lt) x) only if we have an LLt factorization, not an LDLt factorization. I can force this by setting common.final_ll = 1;. Is there any down-side to doing that?

I can give you patches to implement some of this, if you let me know what the preferred approach is. Is there a better forum for this discussion? Should I move this back to the "github discussions"? Thanks.