SEASS-pytorch: A Python repository from cotitan

PyTorch implementation of the paper "Selective Encoding for Abstractive Sentence Summarization" (not finished yet)

author: Kirk
mail: cotitan@outlook.com

Requirements

pytorch==0.4.0
numpy>=1.12.1
python>=3.5

Data

Training and evaluation data for Gigaword is available at https://drive.google.com/open?id=0B6N7tANPyVeBNmlSX19Ld2xDU1E

Training and evaluation data for CNN/DM is available at https://s3.amazonaws.com/opennmt-models/Summary/cnndm.tar.gz

Notice

A separate thread preprocesses batches of data, and it does not terminate when the main process terminates. Press Ctrl+C a second time to stop the thread.
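One common way to avoid the second Ctrl+C is to run the preprocessing thread as a daemon thread, which the interpreter does not wait for on exit. The following is a minimal sketch under that assumption, not the repository's actual loader; `prefetch_batches`, `batch_iter`, and `max_prefetch` are hypothetical names introduced for illustration.

```python
import queue
import threading


def prefetch_batches(batch_iter, max_prefetch=4):
    """Yield batches that a background daemon thread prepares ahead of time.

    `batch_iter` is any iterable of batches (a hypothetical stand-in for
    the real preprocessing code).
    """
    q = queue.Queue(maxsize=max_prefetch)
    sentinel = object()  # unique marker for "no more batches"

    def worker():
        for batch in batch_iter:
            q.put(batch)  # blocks while the queue is full
        q.put(sentinel)

    # daemon=True: the interpreter does not wait for this thread at exit,
    # so a single Ctrl+C in the main thread is enough to stop the program.
    threading.Thread(target=worker, daemon=True).start()

    while True:
        batch = q.get()
        if batch is sentinel:
            break
        yield batch
```

A training loop could then iterate with `for batch in prefetch_batches(my_batches): ...`, where `my_batches` is whatever iterable produces the preprocessed batches.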

Directories:

.
├── Beam.py
├── Model.py
├── mytest.py
├── train.py
├── utils.py
├── sumdata/
│   ├── DUC2003/
│   ├── DUC2004/
│   ├── Giga/
│   ├── train/
│   └── vocab.json # built automatically if it does not exist
├── readme.md
├── log/
└── ckpts/

Make sure your project contains the folders above.
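If any of the folders are missing, they can be created by hand or with a small helper such as the sketch below. The folder names are taken from the tree above; the helper `ensure_dirs` itself is hypothetical and not part of the repository.

```python
import os

# Folder names copied from the directory tree above.
REQUIRED_DIRS = [
    "sumdata/DUC2003",
    "sumdata/DUC2004",
    "sumdata/Giga",
    "sumdata/train",
    "log",
    "ckpts",
]


def ensure_dirs(root="."):
    """Create any missing project folders under `root`.

    Returns the list of relative paths that had to be created.
    """
    created = []
    for rel in REQUIRED_DIRS:
        path = os.path.join(root, rel)
        if not os.path.isdir(path):
            os.makedirs(path)  # also creates the parent `sumdata/` if needed
            created.append(rel)
    return created


if __name__ == "__main__":
    for rel in ensure_dirs("."):
        print("created", rel)
```

Running it a second time creates nothing, so it is safe to call before every training run.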

How-to

Run python train.py to train; one epoch takes about 3.5 hours.
Run python mytest.py to generate summaries.

TODO

learning rate decay, which is essential
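As a rough illustration of what this item could look like, the sketch below computes an exponentially decayed learning rate per epoch. The decay factor, the start epoch, and the function name `decayed_lr` are assumptions for illustration only; the repository does not specify a schedule.

```python
def decayed_lr(base_lr, epoch, decay=0.5, start_epoch=0):
    """Return the learning rate for `epoch` under exponential decay.

    The rate stays at `base_lr` until `start_epoch`, then is multiplied
    by `decay` once per epoch. All default values are illustrative.
    """
    steps = max(0, epoch - start_epoch)
    return base_lr * (decay ** steps)


if __name__ == "__main__":
    for epoch in range(5):
        print(epoch, decayed_lr(1.0, epoch, decay=0.5, start_epoch=2))
```

In a PyTorch training loop, the returned value could be applied at the start of each epoch by assigning it to `group["lr"]` for every `group` in `optimizer.param_groups`.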
