Sketching operator scaling in preconditioner generation

Here's our function for preconditioner generation for tall matrices:

RandLAPACK/RandLAPACK/comps/rl_preconditioners.hh, lines 133 to 154 in 59f6284

    RandBLAS::RNGState<RNG> rpc_data_svd_saso(
        Layout layout,
        int64_t m, // number of rows in A
        int64_t n, // number of columns in A
        int64_t d, // number of rows in sketch of A
        int64_t k, // number of nonzeros in each column of the sketching operator
        T *A, // buffer of size at least m*n.
        int64_t lda, // leading dimension for mat(A).
        T *V_sk, // buffer of size at least d*n.
        T *sigma_sk, // buffer of size at least n.
        RandBLAS::RNGState<RNG> state
    ) {
        RandBLAS::SparseDist D{
            .n_rows = d,
            .n_cols = m,
            .vec_nnz = k
        };
        RandBLAS::SparseSkOp<T> S(D, state);
        auto next_state = RandBLAS::fill_sparse(S);
        rpc_data_svd(layout, m, n, A, lda, S, V_sk, sigma_sk);
        return next_state;
    }

The sparse sketching operator it constructs has nonzero entries in $\{+1,-1\}$, and when we use that operator we apply it with coefficient 1.0. Each column of $S$ has $k$ nonzeros of magnitude 1, so $\mathbb{E}[S^T S] = kI$ rather than $I$. That means the preconditioner we construct only makes the columns of the preconditioned data matrix nearly orthogonal to one another (each with norm close to $1/\sqrt{k}$), rather than nearly orthonormal. This is problematic for regularized problems, where the preconditioner needs to be updated with consideration to the value of the regularization parameter. To fix this we need to either scale the nonzeros in the sketching operator to be $\pm 1/\sqrt{k}$ or apply $S$ with $\alpha = 1/\sqrt{k}$.
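The effect of the scaling can be checked on a toy operator. The sketch below does not use RandBLAS. `make_sign_sketch` and `col_sqnorm` are hypothetical helpers that build a dense stand-in for a sparse sign operator with exactly `k` nonzeros per column, which is an assumption about how the operator is laid out. Under that assumption every column has squared norm `k` when the nonzeros are $\pm 1$, and the factor that gives unit-norm columns is $1/\sqrt{k}$.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for a sparse sign sketching operator (not the
// RandBLAS API): a dense d-by-m column-major buffer with exactly k nonzeros
// per column, each equal to +scale or -scale. Requires k <= d.
std::vector<double> make_sign_sketch(int64_t d, int64_t m, int64_t k,
                                     double scale, uint32_t seed) {
    std::vector<double> S(d * m, 0.0);
    std::mt19937 gen(seed);
    std::bernoulli_distribution coin(0.5);
    std::vector<int64_t> rows(d);
    for (int64_t j = 0; j < m; ++j) {
        // Choose k distinct row indices for column j by shuffling 0..d-1.
        std::iota(rows.begin(), rows.end(), 0);
        std::shuffle(rows.begin(), rows.end(), gen);
        for (int64_t t = 0; t < k; ++t)
            S[rows[t] + j * d] = coin(gen) ? scale : -scale;
    }
    return S;
}

// Squared Euclidean norm of column j, i.e. the (j, j) entry of S^T S.
double col_sqnorm(const std::vector<double>& S, int64_t d, int64_t j) {
    double s = 0.0;
    for (int64_t i = 0; i < d; ++i)
        s += S[i + j * d] * S[i + j * d];
    return s;
}
```

Calling `make_sign_sketch(d, m, k, 1.0, seed)` gives columns whose squared norm is `k`, while passing `1.0 / std::sqrt(double(k))` as `scale` gives columns of unit norm. Applying the unscaled operator with $\alpha = 1/\sqrt{k}$ is equivalent.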