How to predict in sequential models?
kshitijyad opened this issue · 3 comments
I am trying to make predictions with the implicit sequence model.
data = data[['userID', 'docid', 'timestamp']]  # pandas DataFrame with these columns
from sklearn import preprocessing
le_usr = preprocessing.LabelEncoder() # user encoder
le_itm = preprocessing.LabelEncoder() # item encoder
# shift item_ids with +1 (but not user_ids):
item_ids = (le_itm.fit_transform(data['docid'])+1).astype('int32')
user_ids = (le_usr.fit_transform(data['userID'])).astype('int32')
from spotlight.interactions import Interactions
implicit_interactions = Interactions(user_ids, item_ids, timestamps=data.timestamp)
from spotlight.cross_validation import user_based_train_test_split, random_train_test_split
train, test = user_based_train_test_split(implicit_interactions, 0.3)
Now I use the following code to train:
from spotlight.sequence.implicit import ImplicitSequenceModel
sequential_interaction = train.to_sequence()
implicit_sequence_model = ImplicitSequenceModel(use_cuda=False, n_iter=1, loss='bpr', representation='pooling')
implicit_sequence_model.fit(sequential_interaction, verbose=True)
But after this, let's say I want to predict for the user whose encoded user ID (label-encoded at the top) is 1000. I don't know how to use the predict function.
As I understand it, the predict function takes that user's sequence as input, but how do I find the sequence for that user?
sequential_interaction.sequences doesn't tell me which row belongs to which user.
Can you please provide an example of using predict on this data? Any example using the implicit sequence model would be helpful.
Thanks
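One way to get the sequence for a single user, sketched with plain NumPy: filter the encoded interactions by user, order them by timestamp, and keep the most recent items. The helper name `recent_items_for_user` and the toy arrays are made up for illustration; only the idea (time-ordered item ids for one user) comes from the thread.

```python
import numpy as np

def recent_items_for_user(user_ids, item_ids, timestamps, user, max_len=10):
    """Return the user's most recent `max_len` item ids, oldest first."""
    user_ids = np.asarray(user_ids)
    item_ids = np.asarray(item_ids)
    timestamps = np.asarray(timestamps)

    mask = user_ids == user
    # Stable sort keeps the original order for identical timestamps.
    order = np.argsort(timestamps[mask], kind="stable")
    return item_ids[mask][order][-max_len:]

# Toy data standing in for the encoded interactions (item ids start at 1).
user_ids = np.array([0, 1, 0, 1, 0], dtype=np.int32)
item_ids = np.array([3, 5, 1, 2, 4], dtype=np.int32)
timestamps = np.array([30, 10, 10, 20, 20])

sequence = recent_items_for_user(user_ids, item_ids, timestamps, user=0)
print(sequence)  # [1 4 3]
```

With the data above this would presumably be called as `recent_items_for_user(train.user_ids, train.item_ids, train.timestamps, user=1000)`, and the result passed to `implicit_sequence_model.predict(...)`. Keep in mind that `user_based_train_test_split` puts each user's whole history in either train or test, so user 1000 may only exist in `test`. As far as I can tell, the object returned by `to_sequence()` also carries a `user_ids` array aligned with the rows of `sequences`, which would answer the "which row belongs to which user" part directly, but I have not checked that against every Spotlight version.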
predictions = model.predict(ids)
item_ids= (-predictions).argsort()[:10] # last 10 items
print(item_ids)
print(predictions[item_ids])
In the above, how would we get the top-k predictions per user? That is, given the last item in a sequence, I want the top k predictions, not just the last x predictions.
Thanks!
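For what it's worth, `(-predictions).argsort()[:10]` sorts the scores in descending order, so its first 10 entries are already the 10 highest-scoring item indices. A slightly more careful sketch is below. It assumes that `predict` returns one score per item index and that index 0 is the padding item (because the item ids were shifted by +1); `top_k_items` and the `exclude` argument are hypothetical names, not part of Spotlight.

```python
import numpy as np

def top_k_items(scores, k=10, exclude=()):
    """Return the indices of the k highest scores, best first."""
    scores = np.asarray(scores, dtype=np.float64).copy()
    scores[0] = -np.inf  # assumption: index 0 is the padding item
    # Optionally mask items the user has already seen.
    scores[np.asarray(list(exclude), dtype=np.int64)] = -np.inf
    k = min(k, scores.size)
    # argpartition selects the k largest without a full sort;
    # only those k candidates are then sorted by score.
    candidates = np.argpartition(-scores, k - 1)[:k]
    return candidates[np.argsort(-scores[candidates])]

# Toy score vector standing in for the output of predict().
predictions = np.array([9.0, 0.1, 2.5, -1.0, 2.0, 0.7])
top = top_k_items(predictions, k=3, exclude=[2])
print(top)  # [4 5 1]
```

To get this per user, call `predict` once with each user's sequence and apply the helper to each result. Since the item ids were shifted by +1 before training, mapping back to the original doc ids would be `le_itm.inverse_transform(top - 1)`.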
The predict function only needs an array of item indices, ordered by time, as input to recommend the "next best items" given the input sequence.
It doesn't need any reference to the user_id or session_id, because it encodes the input sequence in the embedding space and then matches that representation against the item representations using the dot product.
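To make the "encode the sequence, then take a dot product with the items" idea concrete, here is a toy NumPy illustration. It is not Spotlight's actual network: the random embeddings, the plain mean pooling, and the function name `score_next_items` are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
num_items, dim = 6, 4  # item 0 is reserved for padding
item_embeddings = rng.normal(size=(num_items, dim))
item_embeddings[0] = 0.0  # the padding row carries no information

def score_next_items(sequence, item_embeddings):
    """Mean-pool the sequence's item embeddings, then dot with every item."""
    sequence = np.asarray(sequence)
    sequence = sequence[sequence != 0]  # drop padding entries
    sequence_repr = item_embeddings[sequence].mean(axis=0)
    # One score per item index; a higher score means a better "next item".
    return item_embeddings @ sequence_repr

scores = score_next_items([0, 0, 3, 1, 4], item_embeddings)
print(scores.shape)  # (6,)
```

Nothing in the computation refers to a user id: two users with the same item sequence get the same scores, which is why only the sequence is needed as input.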