yzhao062/MetaOD

Different optimization criteria for matrix factorization in the code and in the paper

VConchello opened this issue · 4 comments

The paper states that MetaOD minimizes the sum of sDCG as the optimization criterion for factorising a matrix into latent factors (Section 3.4.1), but the code (core.py:91,156,166) uses the function ndcg_score from sklearn.metrics, which differs from sDCG in some aspects.
Then, for the gradient descent, it uses the gradient of sDCG to find an optimum.
Is there any rationale for these changes?
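For illustration, here is a minimal sketch of what those lines evaluate with sklearn; the performance values below are made up, not taken from MetaOD:

```python
import numpy as np
from sklearn.metrics import ndcg_score

# Hypothetical example: true vs. predicted performances of four candidate
# outlier detectors on one dataset (values are made up for illustration).
true_perf = np.asarray([[0.9, 0.2, 0.6, 0.1]])
pred_perf = np.asarray([[0.8, 0.3, 0.5, 0.2]])

# What the code evaluates: NDCG = DCG / IDCG, a ranking-quality score
# normalised to [0, 1] by the DCG of the ideal (true) ordering.
print(ndcg_score(true_perf, pred_perf))
```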

The reason is numerical stability. You can replace it with DCG and the results should be almost the same (though numerical stability may suffer).
[image: NDCG formula, NDCG = DCG / IDCG]

So, to my understanding, IDCG is more of a scaling factor and does not change the resulting ranking.
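To make the scaling argument concrete, a small sketch with made-up numbers: IDCG depends only on the true relevances, so for a fixed dataset it is the same constant for every candidate scoring, and ranking candidates by DCG or by NDCG gives the same order.

```python
import numpy as np
from sklearn.metrics import dcg_score, ndcg_score

# Hypothetical relevances for one dataset; IDCG depends only on these,
# so it is a fixed constant here.
y_true = np.asarray([[3.0, 1.0, 2.0, 0.0]])

# Two candidate score vectors (e.g. from two factorization iterates).
scores_a = np.asarray([[0.7, 0.1, 0.5, 0.0]])
scores_b = np.asarray([[0.2, 0.9, 0.4, 0.1]])

# NDCG = DCG / IDCG: both candidates are divided by the same constant,
# so the better candidate under DCG is also the better one under NDCG.
for s in (scores_a, scores_b):
    print(dcg_score(y_true, s), ndcg_score(y_true, s))
```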

Thank you for your answer. The scaling part is clear, I think, but looking at the formula:
[image: the paper's sDCG formula]
First, this formula uses the sigmoid, and the numerator is also quite different.
I'm not sure whether these changes have a significant impact when evaluating this metric.

That is because DCG is not differentiable: specifically, in log2(i+1), the rank position i is computed via an indicator function and has no derivative. We use a sigmoid to approximate it.

[image: derivation replacing the rank indicator with a sigmoid approximation]
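For intuition, here is a generic smoothed-DCG sketch in the spirit of that approximation (not necessarily the exact formula used in MetaOD): the integer rank inside the log is replaced by a soft rank built from sigmoids over pairwise score gaps, which makes the whole objective differentiable in the scores.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def smooth_dcg(relevance, scores):
    """Differentiable DCG surrogate: replace each item's integer rank
    with a soft rank, 1 + a sum of sigmoids over pairwise score gaps.
    `relevance` and `scores` are 1-D arrays over the candidate models."""
    gaps = scores[None, :] - scores[:, None]            # gaps[j, k] = s_k - s_j
    # soft_rank[j] ~ 1 + number of items scored above item j;
    # subtract sigmoid(0) = 0.5 to remove the k == j self-comparison.
    soft_rank = 1.0 + sigmoid(gaps).sum(axis=1) - 0.5
    return np.sum((2.0 ** relevance - 1.0) / np.log2(soft_rank + 1.0))

# Made-up numbers: scoring the high-relevance item highest yields a
# larger smoothed DCG than scoring it low.
rel = np.array([3.0, 1.0, 2.0])
print(smooth_dcg(rel, np.array([2.0, 0.1, 1.0])))  # good ordering -> larger
print(smooth_dcg(rel, np.array([0.1, 2.0, 1.0])))  # bad ordering -> smaller
```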

Okay, now I understand that the approximation is used both for the function to be minimised and for its gradient, not just to compute the gradient.
Thank you for the answer.