When you create your own training / test sets, there are a few different ways to score your predictions:
If you write your train / test sets out to files, you can get a score from the command line:

python evaluation.py ./y_true.txt ./y_pred.txt

This will print out your score.
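The exact layout of those two files is defined by evaluation.py, so check that script before relying on this; purely as an illustrative sketch, assuming one user per line in the same user_id,song_id,song_id,... shape as the submission format described below:

# Speculative sketch only: the real on-disk format is whatever
# evaluation.py parses. Assumed layout: user_id,song_id,song_id,... per line.
def write_user_songs(path, users_to_songs):
    with open(path, 'w') as f:
        for user_id, song_ids in users_to_songs.items():
            f.write(','.join([user_id] + [str(s) for s in song_ids]) + '\n')

write_user_songs('./y_true.txt', {'ba986b9277b96cd6de07dd07be8362b67b764dd4': [12985, 304842]})
write_user_songs('./y_pred.txt', {'ba986b9277b96cd6de07dd07be8362b67b764dd4': [12985, 304842]})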
Alternatively, you can supply two Python dictionaries directly to the evaluate function:
from evaluation import evaluate

# Ground truth: user ID -> the song IDs the user actually listened to.
y_true = {
    'ba986b9277b96cd6de07dd07be8362b67b764dd4': [12985, 304842, ...]
}

# Predictions: user ID -> your ranked list of predicted song IDs.
y_pred = {
    'ba986b9277b96cd6de07dd07be8362b67b764dd4': [12985, 304842, ...]
}

# Note: order is VERY important -- pass ground truth first, and each
# predicted list must be ranked best guess first.
evaluate(y_true, y_pred)
If you need further flexibility, look in evaluation.py to see how we use ml_metrics.mapk(actual, predicted, k=500).
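For intuition, here is a minimal self-contained sketch of what MAP@k computes, mirroring the apk / mapk functions in ml_metrics; the two toy users below are made up purely for illustration:

# Average precision at k for a single user: each correct, previously
# unseen prediction contributes precision-at-that-rank to the score.
def apk(actual, predicted, k=500):
    predicted = predicted[:k]
    score, num_hits = 0.0, 0.0
    for i, p in enumerate(predicted):
        if p in actual and p not in predicted[:i]:
            num_hits += 1.0
            score += num_hits / (i + 1.0)
    return score / min(len(actual), k) if actual else 0.0

# Mean average precision at k: both arguments are lists of lists,
# one inner list per user, with predictions ranked best-first.
def mapk(actual, predicted, k=500):
    return sum(apk(a, p, k) for a, p in zip(actual, predicted)) / len(actual)

# Toy data: the rank of each correct prediction drives the score.
y_true = [[12985, 304842], [217471]]
y_pred = [[12985, 999, 304842], [999, 217471]]
print(mapk(y_true, y_pred))  # -> 0.666... on this toy data

Because the metric averages per-user precision over ranks, putting your most confident songs first matters as much as which songs you pick.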
Make a prediction for all users in data/test_users.csv and upload a CSV of it.
You'll submit a CSV with no header via the normal upload. The first entry in each row must be the user's ID, and the rest are song IDs:
ba986b9277b96cd6de07dd07be8362b67b764dd4,12985,304842,...
42a9daa28f605e4f269711946cdbe0498a172706,217471,177172,...
...
You are required to produce predictions for every single user in data/test_users.csv.
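As a sketch of producing such a file, assuming data/test_users.csv holds one user ID per line with no header (verify against the actual file) and with predictions standing in for your model's ranked output:

import csv

# Hypothetical ranked predictions per user -- replace with your model's output.
predictions = {
    'ba986b9277b96cd6de07dd07be8362b67b764dd4': [12985, 304842],
    '42a9daa28f605e4f269711946cdbe0498a172706': [217471, 177172],
}

# Assumes one user ID per line with no header; adjust if the file differs.
with open('data/test_users.csv') as f:
    user_ids = [row[0] for row in csv.reader(f) if row]

with open('submission.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for user_id in user_ids:
        # Every user in data/test_users.csv must get a row.
        writer.writerow([user_id] + predictions[user_id])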