Primary language: Python. License: MIT.

seq2seq-keyphrase

Note: This is a development branch. Please check out this [repo] for the release version of seq2seq-keyphrase.

Introduction

This is an implementation of Deep Keyphrase Generation [PDF] [arXiv].

Data

The KP20k dataset is released in JSON format. Please download here. Each data point contains the title, abstract, and keywords of a paper.

Split        Data points
Training     530,803
Validation   20,000
Test         20,000
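
As a rough illustration of how such records might be read, here is a minimal sketch. It rests on assumptions not stated above: that each file stores one JSON object per line, that the fields are named `title`, `abstract`, and `keyword`, and that keywords are joined with semicolons. The helper names `iter_records` and `split_keywords` are hypothetical and are not part of this repository. Check the downloaded files and adjust the names accordingly.

```python
import json
from typing import Dict, Iterable, Iterator, List


def iter_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one record per non-empty line.

    Assumption: one JSON object per line (JSON Lines layout).
    """
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def split_keywords(raw: str, sep: str = ";") -> List[str]:
    """Split a keyword string into a list of trimmed, non-empty phrases.

    Assumption: keywords are stored as one string joined by `sep`.
    """
    return [phrase.strip() for phrase in raw.split(sep) if phrase.strip()]


# In-memory stand-in for a data file; the field names are assumed.
sample_lines = [
    '{"title": "A title", "abstract": "An abstract.", "keyword": "graph; ranking ;"}',
    "",
]

for record in iter_records(sample_lines):
    keyphrases = split_keywords(record["keyword"])
    print(record["title"], keyphrases)
```

An open file handle can be passed to `iter_records` in place of the list, since iterating over a text file yields its lines one at a time, so a whole split never needs to be held in memory.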

The raw dataset (without filtering noisy data) is also provided. Please download here.

A trained model and other datasets will be released soon.

Cite

If you use the code or datasets, please cite the following paper:

Rui Meng, Sanqiang Zhao, Shuguang Han, Daqing He, Peter Brusilovsky, and Yu Chi. Deep Keyphrase Generation. 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017). [PDF] [arXiv]