Yelp Dataset 2021 version incompatibility

Question

Opened this issue 3 years ago · 4 comments

In the latest version of the Yelp Open Dataset, the metropolitan areas were changed, so the cities_checkin_data dictionary is not populated when running datasetgen.py. May I request access to the same version of the dataset that was used during development? Thanks!

Answer 1 · 2021-09-09T20:25:57.000Z

I'd love to do this, but from what I've seen of Yelp's terms of use, I'm not allowed to distribute their data. I want to contact Yelp soon for more information on sharing the dataset I used. If I get any good news, I'll share it here.

Answer 2 · 2021-09-14T02:08:33.000Z

@kayeval I finally got permission to share the dataset I used. You can download it from Google Drive. Let me know if you have any problems.

Answer 3 · 2021-10-20T03:24:40.000Z

@heitor57 Apologies for the late reply, and thank you so much. I was able to access the dataset with no problems and to train a new model with it.

On another note, what steps should I take to add a new model to the existing recommenders, so that I can run the same evaluation metrics on it?

Answer 4 · 2021-10-20T04:20:26.000Z

If you want to add a base recommender, following the steps below should be enough. For a post-processing model, the process is very similar. If anything else is unclear, I recommend looking at the recommenders that are already implemented.

Steps:
(1) Add the configuration to the RecRunner class so that it can manage the model's inputs;
(2) In the static method get_base_parameters of the RecRunner class, expose the model's default parameters as a dictionary;
(3) Create a handler that executes the recommendation model and register it in the BASE_RECOMMENDERS attribute of RecRunner;
(4) In the CITIES_BEST_PARAMETERS dictionary in constants.py, set the parameters the method should use for each dataset;
(5) In RECS_PRETTY, set the name under which the recommender appears in visualizations.
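
For readers who want to see how the five steps fit together, here is a rough, self-contained sketch. It is not the repository's actual code: only the names RecRunner, get_base_parameters, BASE_RECOMMENDERS, CITIES_BEST_PARAMETERS and RECS_PRETTY come from the steps above. The shapes of the dictionaries, the constructor arguments, the `mynewrec` key, its parameters, the `run_base_recommender` method and the city name are all assumptions made for illustration, so check the implemented recommenders for the real layout.

```python
from typing import Callable, Dict, Optional

# Step 5 (assumed shape): recommender key -> name shown in visualizations.
RECS_PRETTY: Dict[str, str] = {"mynewrec": "My New Rec"}

# Step 4 (assumed shape): recommender key -> dataset/city -> tuned parameters.
# In the project this dictionary lives in constants.py.
CITIES_BEST_PARAMETERS: Dict[str, Dict[str, dict]] = {
    "mynewrec": {"madison": {"learning_rate": 0.05}},
}


class RecRunner:
    """Toy stand-in for the project's RecRunner (hypothetical layout)."""

    def __init__(self, base_rec: str, city: str,
                 base_rec_parameters: Optional[dict] = None):
        # Step 1: configuration that manages the model's inputs.
        self.base_rec = base_rec
        self.city = city

        # Step 3: register a handler for each base recommender.
        self.BASE_RECOMMENDERS: Dict[str, Callable[[], str]] = {
            "mynewrec": self.mynewrec,
        }

        # Defaults first, then per-city tuned values, then explicit overrides.
        defaults = self.get_base_parameters()[base_rec]
        tuned = CITIES_BEST_PARAMETERS.get(base_rec, {}).get(city, {})
        self.base_rec_parameters = {
            **defaults, **tuned, **(base_rec_parameters or {})
        }

    @staticmethod
    def get_base_parameters() -> Dict[str, dict]:
        # Step 2: each model exposes its default parameters as a dictionary.
        return {"mynewrec": {"n_factors": 10, "learning_rate": 0.01}}

    def mynewrec(self) -> str:
        # Placeholder handler: a real one would train the model and
        # produce recommendations to be evaluated.
        name = RECS_PRETTY[self.base_rec]
        return f"{name} on {self.city} with {self.base_rec_parameters}"

    def run_base_recommender(self) -> str:
        # Dispatch to whichever handler was registered for base_rec.
        return self.BASE_RECOMMENDERS[self.base_rec]()


runner = RecRunner("mynewrec", "madison")
print(runner.run_base_recommender())
```

The point of the dictionary-based dispatch is that the evaluation code never needs to know about the new model: once the key is registered in all four places, the same metrics can run against it as against the existing recommenders.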