seq2seq-keyphrase
Note: This is a development branch. Please check out this [repo] for the release version of seq2seq-keyphrase.
Introduction
This is an implementation of Deep Keyphrase Generation [PDF] [arXiv].
Data
The KP20k dataset is released in JSON format. Please download here. Each data point contains the title, abstract and keywords of a paper.
| Split | Number of data points |
|---|---|
| Training | 530,803 |
| Validation | 20,000 |
| Test | 20,000 |
The raw dataset (without filtering noisy data) is also provided. Please download here.
A trained model and other datasets will be released soon.
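As a rough illustration of working with the JSON data, the sketch below parses records into title, abstract and keyword lists. It is not part of this repository. It assumes the file stores one JSON object per line and that the fields are named `title`, `abstract` and `keyword`, with keywords joined by semicolons. These names and the layout are assumptions, so check them against the downloaded file. The helper `parse_records` is hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator


def parse_records(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one paper per non-empty line of JSON text (assumed layout)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumed field name "keyword"; it may hold a ";"-joined string or a list.
        raw_keywords = record.get("keyword", "")
        if isinstance(raw_keywords, str):
            keywords = [k.strip() for k in raw_keywords.split(";") if k.strip()]
        else:
            keywords = [str(k).strip() for k in raw_keywords]
        yield {
            "title": record.get("title", ""),
            "abstract": record.get("abstract", ""),
            "keywords": keywords,
        }


# A made-up record, used only to show the expected shape of the input.
sample = [
    '{"title": "A toy title", "abstract": "A toy abstract.", '
    '"keyword": "keyphrase generation;seq2seq"}',
]
papers = list(parse_records(sample))
print(papers[0]["keywords"])
```

To read a downloaded file instead of the sample, pass an open file handle, for example `parse_records(open(path, encoding="utf-8"))`, where `path` is wherever you saved the data.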
Cite
If you use the code or datasets, please cite the following paper:
Rui Meng, Sanqiang Zhao, Shuguang Han, Daqing He, Peter Brusilovsky and Yu Chi. Deep Keyphrase Generation. 55th Annual Meeting of the Association for Computational Linguistics. [PDF] [arXiv]