jfloff/pywFM

Predict for new data.

vi3k6i5 opened this issue · 5 comments

Say I trained the model with

fm.run(train_x, train_y, val_x, val_y)

How do I run prediction on another dataset?

pred_y = fm.run(test_x)

The run method expects y_test as input, which doesn't make sense when all I want is predictions:
run(self, x_train, y_train, x_test, y_test, x_validation_set=None, y_validation_set=None, meta=None)

  1. Regarding y_test as input:
    libfm uses the test targets only to report statistics about its predictions. They are not used when training the model. If I'm not mistaken, you could set them to a dummy value and just collect the predictions (disregard the prediction statistics, since those will be wrong). For more info, check the libfm manual.

  2. Regarding running against a new dataset: at the moment you can't. It's a limitation of libfm itself. You have to train again. See issue #7.
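
A minimal sketch of the dummy-target workaround from point 1. It assumes that the object returned by run exposes a predictions attribute (as shown in the pywFM README) and that x_test has a shape attribute, as NumPy arrays and SciPy sparse matrices do. predict_with_dummy_targets is a hypothetical helper, not part of pywFM, and the model is still retrained on every call.

```python
def predict_with_dummy_targets(fm, x_train, y_train, x_test, dummy_value=0.0):
    """Train with fm.run and return predictions for x_test.

    Hypothetical helper: the real targets for x_test are unknown, so a
    placeholder value is passed as y_test. Any evaluation statistics
    libfm reports for the test set are meaningless in this case.
    """
    n_rows = x_test.shape[0]           # one dummy target per test row
    dummy_y = [dummy_value] * n_rows   # placeholder targets, never used for training
    model = fm.run(x_train, y_train, x_test, dummy_y)
    return list(model.predictions)
```

Usage would look like pred_y = predict_with_dummy_targets(fm, train_x, train_y, test_x). For a classification task, the dummy value may need to be one of the valid class labels.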

Hope it helps,

Thanks :) Could you add this to the README? It might help others.

I'm so busy atm that I can't even breathe!
Feel free to PR that change ;)

Done. #21

I will try to make changes for #7 as well. Any tips on how I should approach that problem?

I was thinking of saving the trained model to a file and keeping a reference to that file inside the model object, then adding a method named predict that predicts using that model object.

Doesn't seem clean, but it's a quick hack. Let me know.

The better approach would be something supported by the original libfm repo. But I don't think that's going to happen... They even removed support for saving/loading the model with the MCMC method.

I guess we could try your approach. Just remember to clean up the temporary files properly; they can be quite big when dealing with large datasets.
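
As a rough sketch of the cleanup part only: the holder below owns a temporary model file and deletes it when done. SavedModelHandle is a hypothetical name, nothing here calls libfm or pywFM, and how the model would actually be written to and read from the path is left open.

```python
import os
import tempfile


class SavedModelHandle:
    """Hypothetical holder for a model file written during training."""

    def __init__(self, suffix=".model"):
        # Reserve a unique path; the writer (e.g. libfm) would reopen it itself.
        fd, self.path = tempfile.mkstemp(suffix=suffix)
        os.close(fd)

    def cleanup(self):
        # Remove the file if it still exists; safe to call more than once.
        if self.path is not None and os.path.exists(self.path):
            os.remove(self.path)
        self.path = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.cleanup()
        return False  # do not swallow exceptions
```

Used as a context manager (with SavedModelHandle() as handle: ...), the file is removed even if training or prediction raises. A predict method on the model object could keep such a handle and call cleanup explicitly once the model is no longer needed.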

Kudos for tackling this!