
Learning Framework for Neural Abstractive Text Summarization

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

LeafNATS - A Learning Framework for Neural Abstractive Text Summarization


This playground is a PyTorch implementation of a learning framework for building different models for neural abstractive text summarization and beyond. It extends the NATS toolkit, a toolkit for Neural Abstractive Text Summarization. The goal of this framework is to make it convenient to try out new ideas in abstractive text summarization and other language generation tasks.

Requirements

  • glob
  • argparse
  • shutil
  • spacy
  • pytorch 1.0

Usage

Scripts

Dataset

We tested different models in LeafNATS on the following datasets. Here, we provide a link to the CNN/Daily Mail dataset and the data processing code for the Newsroom and Bytecup2018 datasets. The preprocessed data will be available upon request.

In the dataset, <s> and </s> are used to separate sentences, and <sec> is used to separate summaries from articles. We did not use the JSON format because it takes more space and is more difficult to transfer between servers.
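The tag layout described above can be parsed with a few lines of Python. This is a minimal sketch, not the LeafNATS loader itself: the helper names `split_sentences` and `parse_record` are hypothetical, and it assumes one record per line with the summary before <sec> and the article after it, which the description does not state explicitly.

```python
import re

SEC_TAG = "<sec>"  # separates the summary from the article


def split_sentences(text):
    """Split a tagged string into sentences, dropping the <s> and </s> markers."""
    pieces = re.split(r"</?s>", text)
    return [piece.strip() for piece in pieces if piece.strip()]


def parse_record(line):
    """Parse one line into (summary_sentences, article_sentences).

    Assumption: the summary comes before <sec> and the article after it.
    """
    summary, article = line.rstrip("\n").split(SEC_TAG, 1)
    return split_sentences(summary), split_sentences(article)


# Made-up example line, for illustration only.
example = "<s> a cat sat . </s><sec><s> a cat sat on a mat . </s> <s> it purred . </s>"
summary, article = parse_record(example)
print(summary)
print(article)
```

If the actual files place the article first, swapping the two names returned by `parse_record` is enough.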

Examples

LeafNATS is currently under development. A simple way to run models that have already been implemented is:

  • Check: Check models we have implemented in this directory.

  • Import: In run.py, import the example you want to try, for example: from nats.pointer_generator_network.main import *

  • Training: python run.py

  • Validate: python run.py --task validate

  • Test: python run.py --task beam

  • Rouge: python run.py --task rouge
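The commands above can be chained from a small driver script. The sketch below is only an illustration: `build_command` and `run_pipeline` are hypothetical helpers, and "train" is a label used inside this script for the plain `python run.py` call, not a value passed to run.py. By default it only prints the commands; pass `dry_run=False` from the repository root to execute them.

```python
import subprocess
import sys

# "validate", "beam" and "rouge" are the --task values listed above.
# "train" is a local label for running run.py without a --task flag.
TASKS = ("train", "validate", "beam", "rouge")


def build_command(task, script="run.py"):
    """Return the argument list for one stage of the workflow."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    cmd = [sys.executable, script]
    if task != "train":
        cmd += ["--task", task]
    return cmd


def run_pipeline(tasks=TASKS, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for task in tasks:
        cmd = build_command(task)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a stage fails.
            subprocess.run(cmd, check=True)


run_pipeline(dry_run=True)
```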

Features

  • Engine: Training frameworks
  • Playground: Models, pipelines, loss functions, and data redirection
  • Modules: Building blocks, beam search, word-copy for decoding
  • Data: Data preprocessing and batcher

Pretrained Models and Results

Here is the pretrained model for our live system: https://drive.google.com/open?id=1A7ODPpermwIHeRrnqvalT5zpr4BCTBi9

Indices of different models


Experimental Results

Experimental results can be found in the paper.

Citation

@article{shi2018neural,
  title={Neural Abstractive Text Summarization with Sequence-to-Sequence Models},
  author={Shi, Tian and Keneshloo, Yaser and Ramakrishnan, Naren and Reddy, Chandan K},
  journal={arXiv preprint arXiv:1812.02303},
  year={2018}
}
@inproceedings{shi2019leafnats,
  title={LeafNATS: An Open-Source Toolkit and Live Demo System for Neural Abstractive Text Summarization},
  author={Shi, Tian and Wang, Ping and Reddy, Chandan K},
  booktitle={Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)},
  pages={66--71},
  year={2019}
}