Predict for new data.
vi3k6i5 opened this issue · 5 comments
Say I trained the model with
fm.run(train_x, train_y, val_x, val_y)
How do I run prediction on another dataset?
pred_y = fm.run(test_x)
The run method expects y_test as input, which doesn't make sense at all.
run(self, x_train, y_train, x_test, y_test, x_validation_set=None, y_validation_set=None, meta=None)
- Regarding y_test as input: libfm uses the test values to output some statistics about its predictions. They are not used when training the model. If I'm not mistaken, you could set them to a dummy value and just collect the predictions (disregard the prediction statistics, since those will be wrong). For more info, check the libfm manual.
- Regarding running against a new dataset: at the moment you can't. It's a limitation of libfm itself. You have to train again. See issue #7.
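The dummy-target workaround from the first point could look roughly like the sketch below. It assumes fm is the wrapper object from the question, that x_test exposes a shape attribute (as NumPy arrays and SciPy sparse matrices do), and that the object returned by run has a predictions attribute. None of that is confirmed in this thread, and the helper name predict_with_dummy_targets is made up for illustration.

```python
def predict_with_dummy_targets(fm, x_train, y_train, x_test):
    """Train on (x_train, y_train) and return predictions for x_test.

    Hypothetical helper: `fm` is assumed to expose the `run` signature
    quoted in the question, and its return value is assumed to carry a
    `predictions` attribute.
    """
    # One placeholder target per test row. The values are only used for
    # the reported test statistics, never for training.
    n_rows = x_test.shape[0]
    dummy_y = [0.0] * n_rows

    model = fm.run(x_train, y_train, x_test, dummy_y)

    # Any error metric reported for the test set is meaningless here,
    # because it was computed against the dummy targets.
    return model.predictions
```

Note that this still retrains the model on every call, so it works around the y_test requirement but not the "train again" limitation.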
Hope it helps,
Thanks :) Could you add this to the README? It might help others.
I'm so busy atm that I can't even breathe!
Feel free to PR that change ;)
Done. #21
I will try to make changes for #7 as well. Any tips on how I should approach that problem?
I was thinking of saving the trained model to a file and keeping a reference to that file inside the model object, then adding a method named predict that predicts using the saved model.
Doesn't seem clean, but it's a quick hack. Let me know.
The better approach would be something supported from the original libfm repo. But I think that's not going to happen... They even removed support for the save/load model on the MCMC method.
I guess we could try your approach. Just remember to clean up temporary files properly; they can be quite big when dealing with large datasets.
Kudos for tackling this!
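For the temporary-file cleanup, one possible shape is a small context manager that owns the directory holding the saved model. This is only a sketch of the file handling: the class name TempModelStore and the file name model.out are invented for illustration, and the step that would actually write a libfm model is replaced by a placeholder write.

```python
import os
import shutil
import tempfile


class TempModelStore:
    """Hypothetical helper: owns a temp directory for a saved model file."""

    def __init__(self):
        # A dedicated directory makes cleanup a single rmtree call.
        self.directory = tempfile.mkdtemp(prefix="fm_model_")
        self.model_path = os.path.join(self.directory, "model.out")

    def cleanup(self):
        # Remove the directory and its contents; safe to call more than once.
        shutil.rmtree(self.directory, ignore_errors=True)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.cleanup()
        return False  # never swallow exceptions from the with-body


with TempModelStore() as store:
    # Placeholder for the training step that would write the model file.
    with open(store.model_path, "w") as handle:
        handle.write("placeholder model contents\n")
    existed_during = os.path.exists(store.model_path)

# After the with-block the directory is gone, even if the body had raised.
removed_after = not os.path.exists(store.directory)
```

A long-lived model object could instead hold a TempModelStore and call cleanup() explicitly when it is no longer needed, at the cost of relying on callers to remember to do so.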