Predict for new data.

Question

vi3k6i5 opened this issue 7 years ago · 5 comments

Say I trained the model with

fm.run(train_x, train_y, val_x, val_y)

How do I run prediction on another dataset?

pred_y = fm.run(test_x)

The run method expects y_test as input, which doesn't make sense at all:
run(self, x_train, y_train, x_test, y_test, x_validation_set=None, y_validation_set=None, meta=None)

Answer 1 · 2017-09-19T16:23:12.000Z

Regarding y_test as input:
libfm uses the test values to output some statistics about its predictions. They are not used when training the model. If I'm not mistaken, you could set them to a dummy value and just collect the predictions (disregard the prediction statistics, since those will be wrong). For more info, check the libfm manual.
Regarding running against a new dataset: at the moment you can't. It's a limitation of libfm itself, so you have to train again. See issue #7.
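
The dummy-value workaround could be sketched roughly as below. This is only an illustration: make_dummy_labels and predict_with_dummy_labels are hypothetical helper names, and the only thing assumed about fm is that it exposes run(x_train, y_train, x_test, y_test) as in the signature quoted in the question. How the predictions are exposed on the returned object depends on the wrapper, so the result is passed through unchanged.

```python
def make_dummy_labels(n_rows, fill=0.0):
    """Build placeholder targets for rows whose true labels are unknown."""
    return [fill] * n_rows


def predict_with_dummy_labels(fm, train_x, train_y, test_x, n_test_rows):
    """Train and score in one run() call, passing placeholder y_test values.

    Assumption: `fm` has run(x_train, y_train, x_test, y_test), matching the
    signature quoted in the question. Any error statistics in the result are
    meaningless here, because the test labels are fake.
    """
    dummy_y = make_dummy_labels(n_test_rows)
    return fm.run(train_x, train_y, test_x, dummy_y)
```

Since the model cannot be reused, every new dataset means another full call like this one, training included. Convert the placeholder list to whatever array type your data uses if needed.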

Hope it helps.

Answer 2 · 2017-09-19T17:40:13.000Z

Thanks :) Maybe add this to the README as well; it might help others.

Answer 3 · 2017-09-19T18:04:42.000Z

I'm so busy atm that I can't even breathe!
Feel free to PR that change ;)

Answer 4 · 2017-09-19T18:24:14.000Z

Done. #21

Will try to make changes for #7 too. Any tips on how I should approach that problem?

I was thinking of saving the trained model to a file and keeping a reference to that file inside the model object, then adding a method named predict that predicts with the saved model.

Doesn't seem clean, but it's a quick hack. Let me know.
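
A minimal sketch of that idea, under stated assumptions: SavedModelPredictor is a hypothetical class name, and predict_fn is a hypothetical callable taking (model_path, x) and returning predictions. How that callable would actually invoke libfm is deliberately left open.

```python
import os


class SavedModelPredictor:
    """Keep a reference to a saved model file and predict from it on demand.

    `predict_fn` is a hypothetical callable (model_path, x) -> predictions;
    it stands in for whatever code ends up calling libfm.
    """

    def __init__(self, model_path, predict_fn):
        if not os.path.exists(model_path):
            raise FileNotFoundError(model_path)
        self.model_path = model_path
        self._predict_fn = predict_fn

    def predict(self, x):
        # Reuse the stored model file instead of retraining.
        return self._predict_fn(self.model_path, x)
```

Passing the prediction step in as a callable keeps the file bookkeeping separate from the libfm invocation, so either part can change independently.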

Answer 5 · 2017-09-20T10:42:55.000Z

The better approach would be something supported by the original libfm repo. But I think that's not going to happen... They even removed save/load model support for the MCMC method.

I guess we could try your approach. Just remember to clean up temporary files properly; they can be quite big when dealing with large datasets.
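
One way to make the cleanup hard to forget is a context manager around a scratch directory. This is a sketch using only the Python standard library; scratch_dir and the file name model.out are hypothetical names chosen for illustration.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def scratch_dir(prefix="fm_"):
    """Yield a temporary directory and delete it, with its contents, on exit."""
    path = tempfile.mkdtemp(prefix=prefix)
    try:
        yield path
    finally:
        # Runs even if training or prediction raises, so large files never linger.
        shutil.rmtree(path, ignore_errors=True)


with scratch_dir() as workdir:
    model_path = os.path.join(workdir, "model.out")  # hypothetical file name
    # ... write data files, train, save the model, predict ...
```

If the saved model has to outlive a single call, the directory would need to be removed explicitly when the model object is discarded instead.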

Kudos for tackling this!