BGG-fixed-ratings
BGG (BoardGameGeek) is the largest board game (and not only) encyclopedia: if you want information about a tabletop game, that is the right place to look.
Thousands of users all over the world comment on and review games so that others can evaluate a game before buying it... but hundreds of comments carry no rating, and this makes the overall rating of a game less accurate.
In this notebook we try to assign a rating to all those unrated comments and thus give the hottest (top 50) games a more accurate overall rating.
BoardGameGeek: ratings and comments
Each game page on https://boardgamegeek.com/ shows a detailed list of information, such as:
- Number of players,
- Playing time,
- Designer,
- …
Users can also add their own reviews, sharing their thoughts about a game together with a rating, or plain comments that carry no rating but are still a way to let other users know what they think about the game.
Our idea is to assign a hypothetical score to these comments in order to better understand users' preferences.
Assign a score to comments
The plan is to use all the rated reviews available on the platform (text plus score) to fine-tune a pre-trained model, and then use that model to assign a score to the unrated comments.
In this way, learning on top of the rated reviews, we can assign a score to plain comments that carry no rating.
Code key points
Data acquisition
Let's start by collecting the hottest games, so that we have a list of titles to fetch reviews and comments for:
hot_array = get_hot_data()  # fetch the current BGG "hotness" list
hot_array[:2]  # peek at the first two entries
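get_hot_data() is a small utility from this repo; purely as an illustration, a minimal sketch of such a helper, assuming the public BGG XML API2 hot-items endpoint, could look like this:

import requests
import xml.etree.ElementTree as ET

def get_hot_data(limit=50):
    """Return [(game_id, name), ...] for the current BGG hotness list."""
    resp = requests.get("https://boardgamegeek.com/xmlapi2/hot", params={"type": "boardgame"}, timeout=30)
    resp.raise_for_status()
    root = ET.fromstring(resp.content)
    # each <item> carries the game id and a <name value="..."/> child
    return [(item.get("id"), item.find("name").get("value")) for item in root.iter("item")][:limit]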
and then let's use this list to fetch the corresponding reviews and comments with another utility function:
comments_df = get_comments(hot_array, verbose=10)  # verbose=10: print a progress line every 10 games
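get_comments() is also a repo utility. A hedged sketch of how it could be implemented, again against the BGG XML API2 (the thing endpoint caps pagesize at 100; the column names below are assumptions, not the repo's exact schema):

import time
import requests
import pandas as pd
import xml.etree.ElementTree as ET

def get_comments(hot_array, verbose=10, max_pages=50):
    """Return a DataFrame with one row per comment: game, rating, value."""
    rows = []
    for i, (game_id, name) in enumerate(hot_array):
        if verbose and i % verbose == 0:
            print(f"{i}/{len(hot_array)} - {name}")
        for page in range(1, max_pages + 1):
            resp = requests.get(
                "https://boardgamegeek.com/xmlapi2/thing",
                params={"id": game_id, "comments": 1, "page": page, "pagesize": 100},
                timeout=30,
            )
            resp.raise_for_status()
            batch = [
                # rating is the literal string "N/A" when the comment is unrated
                {"game": name, "rating": c.get("rating"), "value": c.get("value")}
                for c in ET.fromstring(resp.content).iter("comment")
            ]
            if not batch:
                break  # no more comment pages for this game
            rows.extend(batch)
            time.sleep(1)  # be gentle with the API
    return pd.DataFrame(rows)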
Data cleaning
Remove URLs from ratings/comments
import re  # standard-library regular expressions
comments_df['value'] = [re.sub(r"http\S+", "", v) for v in comments_df.value.values]
Remove comments under a specific length
comments_df = remove_short_comments(comments_df, MIN_COMMENT_LEN)
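Here remove_short_comments is another repo utility; a plausible sketch, assuming it simply filters rows on character length:

def remove_short_comments(df, min_len):
    """Drop comments shorter than min_len characters."""
    return df[df["value"].str.len() >= min_len].reset_index(drop=True)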
Dataset creation
Let's split rated and not-rated comments:
# get rated comments only
rated_comments = comments_df.query('rating != "N/A"')
# get non rated comments only
not_rated_comments = comments_df.query('rating == "N/A"').reset_index(drop=True)
Classifier training
We decided to use a scikit-learn wrapper in order to have access to GridSearchCV, which trains with cross-validation; this way we can be confident that the performance we measure is not an artifact of a particular training/validation split.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

def build_classifier():
    return build_model(hub_layer=None, pre_trained_model_name=MODEL_NAME, model_type='classifier', verbose=0)

estimator = KerasClassifier(build_fn=build_classifier, epochs=100, batch_size=1024, verbose=2, validation_split=VAL_FRACTION)

# binary target: 1 if the rating is at least GOOD_REVIEW_THRESHOLD, else 0
x_train_clf = np.array(list(rated_comments.value))
y_train_clf = np.array(list((rated_comments.rating.astype(float) >= GOOD_REVIEW_THRESHOLD).astype(int)))

clf = GridSearchCV(
    estimator,
    cv=3,
    param_grid={}
)
clf.fit(x_train_clf, y_train_clf, callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=5, min_delta=0.001)])
The resulting model returned the following training charts:
and based on these we can be fairly confident it is a good model.
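build_model() is defined in the repo's utilities and is not shown in full here; purely as an illustration, a minimal builder with the same signature could look like the sketch below, assuming a TF Hub text-embedding layer such as nnlm-en-dim50 (the actual architecture may well differ):

import tensorflow as tf
import tensorflow_hub as hub

def build_model(hub_layer=None, pre_trained_model_name=None, model_type='classifier', verbose=0):
    if hub_layer is None:
        # e.g. pre_trained_model_name = "https://tfhub.dev/google/nnlm-en-dim50/2"
        hub_layer = hub.KerasLayer(pre_trained_model_name, input_shape=[], dtype=tf.string, trainable=True)
    # sigmoid head for good/bad classification, linear head for the raw score
    head_activation = 'sigmoid' if model_type == 'classifier' else None
    model = tf.keras.Sequential([
        hub_layer,
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(1, activation=head_activation),
    ])
    if model_type == 'classifier':
        model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    else:
        model.compile(optimizer='adam', loss='mse', metrics=['mean_squared_error'])
    if verbose:
        model.summary()
    return model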
Regressor training
Let's now train a regressor instead, using a very similar approach:
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor

def build_regressor():
    return build_model(hub_layer, pre_trained_model_name=MODEL_NAME, model_type='regressor', verbose=0)

estimator = KerasRegressor(build_fn=build_regressor, epochs=100, batch_size=512, verbose=0, validation_split=VAL_FRACTION)

# target: the raw rating as a float
x_train_reg = np.array(list(rated_comments.value))
y_train_reg = np.array(list(rated_comments.rating.astype(float)))

# note the new name `reg`, so the classifier's grid search `clf` is not overwritten
reg = GridSearchCV(
    estimator,
    cv=3,
    param_grid={}
)
reg.fit(x_train_reg, y_train_reg, callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_mean_squared_error', patience=5, min_delta=0.001)])
which returns the following training chart (here the loss and the metric, mean squared error, coincide):
Model comparison
# shuffle the unrated comments and take a handful to inspect
not_rated_comments = not_rated_comments.sample(frac=1)
inputs = list(not_rated_comments.value.astype(str))[:10]

# the best estimators refit by the two grid searches
# (assumed here; the notebook defines `classifier` and `regressor` elsewhere)
classifier = clf.best_estimator_
regressor = reg.best_estimator_

clf_results = classifier.predict(inputs, verbose=0)
reg_results = regressor.predict(inputs, verbose=0)
for i in range(len(inputs)):
    print(f"""\"{inputs[i]}\"
    reg score: {reg_results[i]:.2f}
    clf score: {clf_results[i][0]}
    """)
Looking at some comments and at the scores assigned by the two models, we can see that the regressor is a bit more accurate and its scores are more reasonable. For this reason we decided to continue the study with the regressor.
Ratings weighting
Let's use the regressor to assign a rating to all the unrated comments; we are then ready to combine the original ratings with the newly scored comments.
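As a sketch of this step (the game column, the fixed_rating name and the plain per-game mean below are illustrative assumptions, not necessarily the notebook's exact weighting):

# score every unrated comment with the regressor
not_rated_comments['rating'] = regressor.predict(np.array(list(not_rated_comments.value.astype(str))), verbose=0)

# pool original and predicted ratings, then average per game
all_comments = pd.concat([rated_comments, not_rated_comments])
all_comments['rating'] = all_comments['rating'].astype(float)
fixed = all_comments.groupby('game', as_index=False)['rating'].mean().rename(columns={'rating': 'fixed_rating'})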
We can now look at the new resulting top 5 with something like:
# TOP N FIXED RANK
display_topn(by='fixed_rating', n=TOP_N, ascending=False)
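where display_topn is a small helper; a hypothetical version, built on the fixed DataFrame from the sketch above (display is available by default in a notebook):

def display_topn(by='fixed_rating', n=5, ascending=False):
    """Show the n games with the highest (or lowest) value of `by`."""
    display(fixed.sort_values(by=by, ascending=ascending).head(n))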
TODO:
- exclude non-english comments/reviews
- exclude very short comments/reviews
- added regressor
- compare regressor vs classifier
- clip regressor results
- LSTM in build_model
- find a better (faster) way to get all comments for hottest games
- test tensorflow/decision-forests