W positive def matrix can be represented by cholesky decomposition 'for large enough k' ? claim FM can represent any interaction matrix.
Factorisation machines represent symmetric matrices with no diagonal specified.
In particular, we should be free to make the diagonals as large as we wish.
a symmetric matrix is positive definite if the diagonal is greater than the sum of all the off diagonal elements in the row. but the diagonal of our interaction matrix is undefined, so we can just define the diagonals to be greater than the sum of all the off diagonal row elements. https://en.wikipedia.org/wiki/Diagonally_dominant_matrix
So there exists a positive definite matrix with the same off diagonal elements. The question is whether this is a minimum norm solution
'a Hermitian matrix M M is positive semidefinite if and only if it is the Gram matrix of some vectors' https://en.wikipedia.org/wiki/Gram_matrix
So any
Question can we find
We assume we can rewrite as
Note that
So our design matrix is low rank...?!
also we don't therefore care about corresponding quadratic terms.
regularisation is
diagonals are
diagonal dominance requires $|V_{i}|^2 > \sum j |V{i}V_{j}|^2$. This 'sounds' impossible?
Given
? can we recover data generating process is model identifiable?
data available
amount of data is relatively small
coefficients (1 + (n_u + n_i) + (n_u n_i))
is symmetric matrix meaningful?
We split
then only coefficients we care about are interaction between single user (i) and single item (j) which is given by dot product of
instead can also think about converting dummy categories into continuous data.
Can either use gradient descent or ALS: optimising
- V^u
- V^i
- w
we have
this is not so many coefficients.
https://ojs.aaai.org/index.php/AAAI/article/view/4340
argues pos def ass is faulty
can say W is symmetric if consider non field aware matrix
The runtime complexity of calculating the prediction function 1 is
polynomial in the number of parameters, i.e.