Rutgers CS Spring 2019 NLP course project
Final Presentation: Google Slides link
- First presentation: Can we summarize Reddit posts?
- Autotldr is a bot that uses SMMRY to automatically summarize long Reddit submissions.
- Text Compactor tool
- TL;DR: The abstractive summarization challenge. Good dataset to use! An ongoing challenge.
- What is the state of text summarization research?
- Datasets for text document summarization?
- A Quick Introduction to Text Summarization in Machine Learning. Describes the types of techniques.
- How to Clean Text for Machine Learning with Python.
- Attention in Long Short-Term Memory Recurrent Neural Networks
- A Brief Overview of Attention Mechanism. It has good equations.
- Attention? Attention! It has good equations and introduces a family of attention mechanisms.
- DeepInf: Social Influence Prediction with Deep Learning. A very good paper for understanding the attention mechanism.
- Graph Attention Networks
- Keras Attention Mechanism
- Neural Machine Translation by Jointly Learning to Align and Translate
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
- Youtube video: C5W3L08 Attention Model, short but very useful attention mechanism tutorial.
- How to Develop an Encoder-Decoder Model with Attention for Sequence-to-Sequence Prediction in Keras
- attention mechanism, a blog.
- Another Keras attention implementation, with blog.
- Attention Mechanisms in Recurrent Neural Networks (RNNs) - IGGG, a one-hour video.
- What is a Transformer?
- The Illustrated Transformer
- Transformer — Attention is all you need
- Paper-with-code: Attention Is All You Need
- Codes for Transformer, BERT, etc.
- Attention Is All You Need — Transformer
- The Annotated Transformer
- Encoder-Decoder Models for Text Summarization in Keras, code.
- Text Summarization Using Keras Models
- tensor2tensor
- fairseq
- A ten-minute introduction to sequence-to-sequence learning in Keras
- Keras example code: English to French
- ml-notebooks
- Keras BERT
- keras-seq2seq-with-attention. Note: TensorFlow 1.13 and later currently have problems with this code.
- Regarding the hidden state (carry) and cell state (memory): the hidden state is the overall state of what has been seen so far, while the cell state is a selective memory of the past. The hidden state (h) carries information about what an RNN cell has seen over time and supplies it to the current timestep, so that the loss function depends not only on the data at this time instant but also on the data seen historically. link.
- Understand the Difference Between Return Sequences and Return States for LSTMs in Keras
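A minimal Keras sketch (layer sizes and input shapes are made up) contrasting `return_sequences` and `return_state`, to make the hidden state h vs. cell state c distinction above concrete:

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model

timesteps, features, units = 5, 3, 4              # hypothetical sizes
inputs = Input(shape=(timesteps, features))

# return_sequences=True: hidden state h at every timestep (what attention attends over).
# return_state=True: additionally return the final h and the final cell state c.
all_h, last_h, last_c = LSTM(units, return_sequences=True, return_state=True)(inputs)
model = Model(inputs, [all_h, last_h, last_c])

x = np.random.rand(1, timesteps, features).astype("float32")
seq, h, c = model.predict(x)
print(seq.shape)  # (1, 5, 4): h at each timestep
print(h.shape)    # (1, 4):    final hidden state (equals seq[:, -1, :])
print(c.shape)    # (1, 4):    final cell state, the "selective memory"
```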
- Without attention (in translation task):
- Words that only appear once or twice in the training data get mis-translated. (Not enough data)
- Words whose position differs between input and output sentences get mis-translated, e.g. a word that appears at the start of the English sentence but at the end of the Spanish one.
- The dataset contains many sentences with different translations. These will always incur errors in our model.
- Attention is a mechanism designed to help fix this temporal limitation.
- The attention mechanism increases the computational burden of the model, but results in a more targeted and better-performing model.
- In addition, the attention model is also able to show how attention is paid to the input sequence when predicting the output sequence.
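A minimal NumPy sketch of dot-product attention over encoder hidden states (shapes and names are illustrative, not the exact formulation used in any of the papers above); the attention weights are what can be visualized to show how attention is paid to the input sequence:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

T, d = 6, 8                               # hypothetical: 6 source timesteps, size-8 states
encoder_states = np.random.rand(T, d)     # h_1 ... h_T from the encoder
decoder_state = np.random.rand(d)         # current decoder hidden state s_t

scores = encoder_states @ decoder_state   # alignment score for each source position
weights = softmax(scores)                 # attention weights, sum to 1
context = weights @ encoder_states        # weighted sum: context vector fed to the decoder

print(weights.round(3), context.shape)    # weights show "where" the model attends
```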
- ROUGE: the TL;DR challenge uses F1 scores for ROUGE-1, ROUGE-2, and ROUGE-L (LCS) as its quantitative evaluation.
- Usually, a qualitative evaluation is performed through crowdsourcing: human annotators rate each candidate summary according to five linguistic qualities, as suggested by the DUC guidelines.
- Re-evaluating Automatic Metrics for Image Captioning: this paper has a good explanation of BLEU, METEOR, ROUGE, and CIDEr.
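A rough Python sketch of ROUGE-1 F1 on unigrams (no stemming or stopword handling), only to make the metric concrete; the challenge's official scorer should be used for real evaluation:

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())      # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat on the mat", "the cat is on the mat"))  # ~0.83
```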
- A very useful collection of state-of-the-art related work (since 2015) and their implementations.
- Abstractive Summarization of Reddit Posts with Multi-level Memory Networks. Dataset provided but no implementation.
- Get To The Point: Summarization with Pointer-Generator Networks
- Generating News Headlines with Recurrent Neural Networks
- Sequence to Sequence Learning with Neural Networks. Proposed Seq2Seq.
- A Neural Attention Model for Abstractive Sentence Summarization.
- An Improved Phrase-based Approach to Annotating and Summarizing Student Course Responses
- Attention mechanism helps.
- Comparison between character-level and word-level models.
- Could we use the model's hidden vector as an embedding of the text, and then use it for other tasks such as subreddit classification?
- LSTM methods require lots of training data.
- We should compare against a baseline model that is not so statistically intensive, such as Latent Dirichlet Allocation, as well as against a generic method like BERT that is not tuned to the particular text we are working with.
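A minimal scikit-learn sketch of the suggested Latent Dirichlet Allocation baseline; the corpus and parameters are placeholders:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

posts = [                                          # placeholder corpus
    "long post about neural networks and machine learning",
    "another post about cooking recipes and baking bread",
]
counts = CountVectorizer(stop_words="english").fit_transform(posts)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
print(lda.transform(counts))  # per-post topic distribution, a cheap text representation
```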