code-terminator/invariant_rationalization

A general response regarding the beer review dataset

zhangyangbill opened this issue · 1 comment

In our ICML paper, we relied on two datasets to demonstrate the effectiveness of our method: an IMDB dataset, and a collection of reviews from the BeerAdvocate website. Both datasets have been used multiple times by the community, and they provided straightforward raw material to demonstrate our approach.

Unfortunately, while preparing our code release, we learned that the original beer review dataset was removed by the dataset’s original author at the request of the data owner, BeerAdvocate (please refer to this link: http://snap.stanford.edu/data/web-BeerAdvocate.html). This was an unexpected circumstance, which we noticed during our code release preparation and internal review. Therefore, out of respect for the property rights and wishes of the original data owner, we do not include the dataset in our GitHub repo. We have made it clear in the README that we remain happy to help generate data partition indices for the dataset for any user who has rights to the data.

Dear Zhang,

Fine with not providing this dataset then, but could you please publicly share the code and procedure for reproducing your results on this dataset, assuming the dataset is obtained from the original authors? You can simply provide the code and ask people to supply the links to the dataset in your code so that they can run it.

Also, I disagree that your IMDB experiment is of the same importance: it is just a synthetic dataset experiment, and the beer review results are the main results of your work.