maciejkula/spotlight

[FEATURE REQUEST] batch prediction

amirj opened this issue · 3 comments

amirj commented

I have a dataset containing 5,115,123 training samples and 1,278,781 test samples (default train/test split ratio = 0.2).
Training each model takes a few minutes on GPU (100% utilization), but when I run the following command:
mrr_baseline = mrr_score(model_baseline, dataset_test).mean()
it takes hours, and GPU utilization drops to only 20%.
Why? Do you have any suggestions for making prediction faster?

I'm having the same issue with certain models (e.g. the bilinear neural network model).

amirj commented

The problem is that model.predict(user_id) is called for each user in a loop, which is time-consuming (see the source code).
@maciejkula Is there any way to get predictions for all items and all users faster?
I mean, instead of iterating in a loop and getting predictions for each user, do it for all users in one step.
The current API doesn't allow this kind of prediction: if user_ids is an array, it must be matched element-wise with item_ids.
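For illustration, the per-user loop described above amounts to something like the following (a sketch with a stand-in model, not Spotlight's actual classes; `DummyModel` and its shapes are assumptions):

```python
import numpy as np

class DummyModel:
    """Stand-in for a factorization model with a per-user predict()."""

    def __init__(self, num_users, num_items, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.user_embeddings = rng.standard_normal((num_users, dim))
        self.item_embeddings = rng.standard_normal((num_items, dim))

    def predict(self, user_id):
        # One small matrix-vector product per call: far too little work
        # to keep a GPU busy, so Python loop overhead dominates.
        return self.item_embeddings @ self.user_embeddings[user_id]

model = DummyModel(num_users=100, num_items=50, dim=8)

# Slow pattern: one predict() call (one tiny kernel launch) per user.
all_scores = np.stack([model.predict(u) for u in range(100)])
```

Each call does only a `dim × num_items` amount of work, which is why GPU utilization stays low over millions of users.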

maciejkula commented

This issue is more pronounced for the bilinear models because they do very little computation per user. For this and many other reasons, I strongly recommend using the sequence models.

@amirj the easiest solution for you is to write your own batched predict implementation.
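A batched predict for a bilinear model can be written as a single matrix product (a sketch assuming the model's user/item embeddings and biases have been extracted as NumPy arrays; the variable names here are hypothetical, not Spotlight attributes):

```python
import numpy as np

def batched_predict(user_embeddings, item_embeddings,
                    user_biases, item_biases):
    """Score every (user, item) pair in one shot.

    Returns an array of shape (num_users, num_items), replacing
    num_users separate predict() calls with one matrix product
    plus broadcast bias additions.
    """
    return (user_embeddings @ item_embeddings.T
            + user_biases[:, None]
            + item_biases[None, :])

rng = np.random.default_rng(0)
U = rng.standard_normal((1000, 32))   # user embeddings
V = rng.standard_normal((200, 32))    # item embeddings
bu = rng.standard_normal(1000)        # user biases
bi = rng.standard_normal(200)         # item biases

scores = batched_predict(U, V, bu, bi)
```

For very large user/item counts the full score matrix may not fit in memory, so in practice you would apply this over chunks of users and compute the MRR per chunk.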