This repository contains the dataset and the source code for the EMNLP 2019 paper "A Neural Citation Count Prediction Model based on Peer Review Text"[1].
In recent years, the number of scientific publications has been growing at a dramatic rate. Given the huge volume of scholarly papers, a long-standing research challenge is how to effectively evaluate the impact of scientific literature. A typical way to measure the impact of a scholarly paper is the number of citations it receives after publication, which reflects its influence in the research community. Peer review is a widely adopted paper evaluation mechanism, in which three or more reviewers are assigned to decide whether to accept or reject a paper. During the review process, the reviewers assess the paper's quality in terms of several important factors, including originality, correctness, substance and readability. Intuitively, peer review data should be useful for predicting the future impact of a paper, since the review text contains assessment comments from domain experts. To address the need for predicting citation counts based on peer reviews, we present this dataset.
We present the statistics of the linked dataset in the following table:
By using the datasets, you must agree to be bound by the terms of the following license.
- Then email [lisiqing@ruc.edu.cn], cc Wayne Xin Zhao at [batmanfly@gmail.com] and your supervisor, and include a copy of the license in the email. Once your request is approved, we will send you the datasets by email.
License agreement
This dataset is made freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use the data given that you agree:
1. That the dataset comes “AS IS”, without express or implied warranty. Although every effort has been made to ensure accuracy, we do not accept any responsibility for errors or omissions.
2. That you include a reference to the dataset in any work that makes use of the dataset. For research papers, cite our preferred publication as listed in our References; for other media, cite our preferred publication as listed on our website or link to the dataset website.
3. That you do not distribute this dataset or modified versions. It is permissible to distribute derivative works insofar as they are abstract representations of this dataset (such as models trained on it or additional annotations that do not directly include any of our data) and do not allow the dataset, or something similar in character, to be recovered.
4. That you may not use the dataset or any derivative work for commercial purposes as, for example, licensing or selling the data, or using the data with a purpose to procure a commercial gain.
5. That all rights not expressly granted to you are reserved by us (Wayne Xin Zhao, School of Information, Renmin University of China).
If you use our dataset or find it useful in your research, please cite our paper.
@inproceedings{lisiqing2019,
title={A Neural Citation Count Prediction Model based on Peer Review Text},
author={Siqing Li and Wayne Xin Zhao and Eddy Jing Yin and Ji-Rong Wen},
booktitle={EMNLP},
year={2019}
}
- The following people contributed to this work: Siqing Li, Wayne Xin Zhao, Eddy Jing Yin and Ji-Rong Wen.
- If you have any questions or suggestions about this dataset, please let us know. Our goal is to make the dataset reliable and useful for the community.
- To contact us, email [lisiqing@ruc.edu.cn] and cc Wayne Xin Zhao at [batmanfly@gmail.com].