RetrievalNormalizedDCG doesn't change with different top_k values

Question

igor17400 opened this issue 6 months ago · 2 comments

🐛 Bug

I'm trying to use the RetrievalNormalizedDCG interface; however, the computed value doesn't seem to change when I pass different values for the top_k parameter.

To Reproduce

Run the following code in a Python interpreter:

from torch import tensor
from torchmetrics.retrieval import RetrievalNormalizedDCG
indexes = tensor([0, 0, 0, 1, 1, 1, 1])
preds = tensor([0.2, 0.3, 0.5, 0.1, 0.3, 0.5, 0.2])
target = tensor([False, False, True, False, True, False, True])
ndcg = RetrievalNormalizedDCG()
ndcg(preds, target, indexes=indexes)

Output: tensor(0.8467)

Now recreate the ndcg instance with top_k = 5:

ndcg = RetrievalNormalizedDCG(top_k=5)
ndcg(preds, target, indexes=indexes)

Output: tensor(0.8467)

Next, recreate the ndcg instance with top_k = 10:

ndcg = RetrievalNormalizedDCG(top_k=10)
ndcg(preds, target, indexes=indexes)

Output: tensor(0.8467)

Expected behavior

The expected behavior would be different scores for different values of top_k.

Environment

TorchMetrics version (and how you installed TM, e.g. conda, pip, build from source):

Name: torchmetrics
Version: 1.2.1
Summary: PyTorch native Metrics
Home-page: https://github.com/Lightning-AI/torchmetrics
Author: Lightning-AI et al.
Author-email: name@pytorchlightning.ai
License: Apache-2.0
Requires: lightning-utilities, numpy, packaging, torch
Required-by: lightning, pytorch-lightning

Python & PyTorch Version (e.g., 1.0):

Python 3.10.13
PyTorch 2.2.0.dev20231211

Any other relevant information such as OS (e.g., Linux):

ProductName:            macOS
ProductVersion:         14.2.1

Answer 1 · 2024-03-14T05:54:30.000Z

Hi! Thanks for your contribution! Great first issue!

Answer 2 · 2024-03-14T06:03:03.000Z

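Just noticed it was because of the number of samples 😅 In my original example, query 0 has only 3 documents and query 1 has only 4, so top_k=5 and top_k=10 both cover the entire ranking and give the same score as the default.

The following example, with more samples per query, shows the expected behavior: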

preds = tensor([0.2, 0.3, 0.5, 0.1, 0.3, 0.5, 0.2, 0.1, 0.2, 0.5, 0.1, 0.3, 0.5, 0.1])
indexes = tensor([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1])
target = tensor([False, False, True, False, True, False, True, False, True, True, True, False, True, True])

Default `top_k`

ndcg = RetrievalNormalizedDCG()
print(f'top_k=default: {ndcg(preds, target, indexes=indexes)}')

output:
top_k=default: 0.903104305267334

`top_k = 5`

ndcg = RetrievalNormalizedDCG(top_k=5)
print(f'top_k=5: {ndcg(preds, target, indexes=indexes)}')

output:
top_k=5: 0.7864415645599365

`top_k = 10`

ndcg = RetrievalNormalizedDCG(top_k=10)
print(f'top_k=10: {ndcg(preds, target, indexes=indexes)}')

output:
top_k=10: 0.903104305267334
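
For anyone who wants to check the original numbers without torchmetrics, here is a minimal pure-Python sketch of NDCG@k averaged over queries. The helpers `ndcg_at_k` and `mean_ndcg` are hypothetical names made up for this illustration, not part of the torchmetrics API. The sketch assumes the standard log2 discount, does no special handling of tied scores (the original example has none within a query), and scores a query with no positive target as 0.

```python
import math
from collections import defaultdict


def ndcg_at_k(preds, target, top_k=None):
    """NDCG for one query; top_k=None means use every document."""
    # A cutoff larger than the number of documents changes nothing.
    k = len(preds) if top_k is None else min(top_k, len(preds))
    # Relevance labels in the order the scores rank them (highest first).
    ranked = [t for _, t in sorted(zip(preds, target), key=lambda pt: pt[0], reverse=True)]
    # Best possible ordering: all relevant documents first.
    ideal = sorted(target, reverse=True)
    dcg = sum(float(r) / math.log2(i + 2) for i, r in enumerate(ranked[:k]))
    idcg = sum(float(r) / math.log2(i + 2) for i, r in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0


def mean_ndcg(preds, target, indexes, top_k=None):
    """Group documents by query index, then average the per-query NDCG."""
    groups = defaultdict(list)
    for p, t, q in zip(preds, target, indexes):
        groups[q].append((p, t))
    scores = [
        ndcg_at_k([p for p, _ in g], [t for _, t in g], top_k)
        for g in groups.values()
    ]
    return sum(scores) / len(scores)


# Data from the original report: query 0 has 3 documents, query 1 has 4.
indexes = [0, 0, 0, 1, 1, 1, 1]
preds = [0.2, 0.3, 0.5, 0.1, 0.3, 0.5, 0.2]
target = [False, False, True, False, True, False, True]

for k in (None, 10, 5, 2):
    print(f"top_k={k}: {mean_ndcg(preds, target, indexes, top_k=k):.4f}")
```

In this sketch the default, top_k=10, and top_k=5 all give 0.8467, matching the output in the report, because no query has more than 4 documents. A cutoff below the per-query size, such as top_k=2, is the first one that can change the result.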
