heitor57/poi-rss

Yelp Dataset 2021 version incompatibility


In the latest version of the Yelp Open Dataset, the metropolitan areas were changed, so the cities_checkin_data dictionary doesn't get populated when running datasetgen.py. May I request access to the same version of the dataset that was used during development? Thanks!
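One way to confirm the mismatch is to list which cities actually appear in the downloaded business file and compare them with the ones datasetgen.py expects. The sketch below is only an illustration: count_cities is a hypothetical helper that is not part of poi-rss, and it assumes the business file is JSON Lines with a "city" field per record and that datasetgen.py selects businesses by city name.

```python
import json
from collections import Counter


def count_cities(lines):
    """Count businesses per city in an iterable of JSON Lines records.

    Hypothetical helper, not part of poi-rss. Assumes each non-empty
    line is a JSON object with a "city" field.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        counts[record.get("city", "")] += 1
    return counts


# Made-up sample records standing in for the real business file.
sample = [
    '{"business_id": "a", "city": "Phoenix"}',
    '{"business_id": "b", "city": "Phoenix"}',
    '{"business_id": "c", "city": "Austin"}',
]
city_counts = count_cities(sample)
print(city_counts.most_common())
```

Passing an open file handle instead of the sample list works the same way, since the function only iterates over lines. A city that datasetgen.py expects but that has a count of zero would explain an empty cities_checkin_data entry.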

I'd love to do this, but from what I've seen of Yelp's terms of use, I'm not allowed to distribute their data. I want to contact Yelp soon for more information on sharing the dataset I used. If I get any good news, I'll share it here.

@kayeval I finally got permission to share the dataset I used. You can download it from Google Drive; let me know if you have any problems.

@heitor57 Apologies for the late reply, and thank you so much. I was able to access the dataset with no problems and to train a new model using it.

On another note, what steps should I take to add a new model to the existing recommenders, so that I can run the same evaluation metrics on it?

If you want to add a base recommender, following the steps below should work without problems. If it's a post-processing model, the process is very similar. I recommend consulting the recommenders that are already implemented to clear up any other doubts.

Steps:
(1) Create the configurations in the RecRunner class to manage the inputs;
(2) In the static method get_base_parameters of the RecRunner class, expose the model's default parameters as a dictionary;
(3) Create a handler that executes the recommendation model and register it in the BASE_RECOMMENDERS attribute of RecRunner;
(4) In the CITIES_BEST_PARAMETERS dictionary in constants.py, set the parameters the method should use for each dataset;
(5) In RECS_PRETTY, set the name under which the recommender should appear in visualizations.
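
To make the five steps concrete, here is a minimal, self-contained sketch of how the pieces could fit together. It is not the actual poi-rss code: RecRunnerSketch, the "mymodel" recommender, its parameters k and alpha, the "mycity" key, and the nesting of CITIES_BEST_PARAMETERS are all assumptions made for illustration. Only the names RecRunner, get_base_parameters, BASE_RECOMMENDERS, CITIES_BEST_PARAMETERS and RECS_PRETTY come from the steps, so check the implemented recommenders for the real structure.

```python
# Hypothetical stand-ins for the dictionaries named in steps 4 and 5.
# Their real layout in poi-rss may differ.
CITIES_BEST_PARAMETERS = {
    "mycity": {"mymodel": {"k": 20}},  # step 4: per-dataset parameters
}
RECS_PRETTY = {"mymodel": "MyModel"}   # step 5: display name


class RecRunnerSketch:
    """Toy illustration of the registration pattern, not the real RecRunner."""

    def __init__(self, base_rec, city):
        # Step 1: configuration that manages the inputs.
        self.base_rec = base_rec
        self.city = city
        # Step 3: map each recommender name to the handler that runs it.
        self.BASE_RECOMMENDERS = {"mymodel": self.mymodel}
        # Defaults first, then per-city overrides (an assumed merge order).
        self.base_rec_parameters = {
            **self.get_base_parameters()[base_rec],
            **CITIES_BEST_PARAMETERS.get(city, {}).get(base_rec, {}),
        }

    @staticmethod
    def get_base_parameters():
        # Step 2: expose each model's default parameters as a dictionary.
        return {"mymodel": {"k": 10, "alpha": 0.5}}

    def mymodel(self):
        # A real handler would train the model and produce recommendations;
        # this one only reports the parameters it would use.
        p = self.base_rec_parameters
        return f"{RECS_PRETTY['mymodel']} k={p['k']} alpha={p['alpha']}"

    def run(self):
        # Dispatch to the handler registered for the chosen recommender.
        return self.BASE_RECOMMENDERS[self.base_rec]()


result = RecRunnerSketch("mymodel", "mycity").run()
print(result)
```

Keeping handlers in a name-to-callable dictionary means the evaluation code only needs the recommender's name, which is presumably why a new model registered this way can reuse the existing metrics.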