Model code for Sumz, a Chrome extension that summarizes Amazon reviews with a sequence-to-sequence model, built with TensorFlow 1.1 and trained on the Amazon Fine Food Reviews dataset.
The seq2seq_model_building.ipynb
notebook walks through building and training a sequence-to-sequence model with TensorFlow (version 1.1).
The model is currently the predictive backend for the Sumz Chrome extension, which takes the Amazon reviews on the current web page and displays a short summary of each one. The model is trained on the Amazon Fine Food Reviews dataset from Kaggle, which consists of 568K review-summary pairs.
This builds on the Text Summarization project by David Currie (this Medium post goes into excellent detail as well).
Seq2seq model (source: WildML)
Sequence-to-sequence models use two different RNNs, connected through the final hidden state of the first one. This is also called the encoder-decoder architecture (related to autoencoders). These seq2seq models are extremely powerful and versatile; they have achieved strong performance on a range of tasks, including:
Task | Input | Output |
---|---|---|
Language translation | Text in language 1 | Text in language 2 |
News headlines | Text of news article | Short headline |
Question/Answering | Questions about content | Answers to questions |
Chatbots | Incoming chat to bot | Reply from chatbot |
Smart email replies | Email content | Reply to email |
Image captioning | Image | Caption describing image |
Speech to text | Raw audio | Text of audio |
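The encoder-decoder wiring behind all of these tasks can be sketched in a few lines of pure NumPy. This is a toy illustration with hypothetical sizes, not the actual TensorFlow model from the notebook: an encoder RNN compresses the input sequence into its final state, and a decoder RNN is initialized from that state to produce the output sequence.

```python
import numpy as np

def rnn_step(x, h, Wx, Wh, b):
    # One vanilla RNN step: mix the current input with the previous hidden state.
    return np.tanh(x @ Wx + h @ Wh + b)

rng = np.random.default_rng(0)
emb, hid = 4, 8                          # toy embedding / hidden sizes
Wx_e, Wh_e = rng.normal(size=(emb, hid)), rng.normal(size=(hid, hid))
Wx_d, Wh_d = rng.normal(size=(emb, hid)), rng.normal(size=(hid, hid))
b = np.zeros(hid)

# Encoder: read the whole input sequence, keep only the final state.
source = rng.normal(size=(5, emb))       # 5 input tokens (already embedded)
h = np.zeros(hid)
for x in source:
    h = rnn_step(x, h, Wx_e, Wh_e, b)
encoder_state = h                        # the "thought vector" handed to the decoder

# Decoder: a second RNN whose initial state is the encoder's final state.
target = rng.normal(size=(3, emb))       # 3 output tokens (teacher forcing)
h = encoder_state
outputs = []
for x in target:
    h = rnn_step(x, h, Wx_d, Wh_d, b)
    outputs.append(h)                    # each state would be projected to vocab logits

print(len(outputs), outputs[0].shape)    # 3 (8,)
```

The key point is that the two RNNs share no weights; the only bridge between them is `encoder_state`, which is why the input and output sequences can have different lengths.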
For more information, here are some great resources:
We're using it here to 'translate' one sequence of words (the full text of an Amazon review) into another sequence of words (a short summary of the review).
The two notebooks (data_preprocessing.ipynb
and seq2seq_model_building.ipynb
) walk through the following steps in building the end-to-end system:
- Preprocessing the data: Exploring the Amazon reviews dataset, converting reviews strings into integer vectors, then building a word embeddings matrix from the vocabulary
- Building the model: Building the sequence-to-sequence model layer by layer using Tensorflow
- Training / testing the model: Feeding the preprocessed data into the model, and generating our own summaries to check out the model's inference performance
- Exporting the model for inference serving: Converting the model into a serialized protobuf format to serve in a production environment (the Chrome extension)
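The preprocessing step above can be illustrated with a minimal pure-Python sketch (the names `build_vocab` and `text_to_ints` are hypothetical, not the notebook's actual helpers): build a vocabulary from the review text, reserving the special tokens a seq2seq model needs, then map each review to a vector of integer ids.

```python
from collections import Counter

# Special tokens the seq2seq model needs: padding, unknown words,
# the decoder's start-of-sequence marker, and end-of-sequence.
SPECIALS = ["<PAD>", "<UNK>", "<GO>", "<EOS>"]

def build_vocab(texts, min_count=1):
    # Count word frequencies across all reviews and assign each word an id,
    # with the special tokens occupying the first ids.
    counts = Counter(w for t in texts for w in t.lower().split())
    words = SPECIALS + [w for w, c in counts.items() if c >= min_count]
    return {w: i for i, w in enumerate(words)}

def text_to_ints(text, vocab):
    # Map each word to its id, falling back to <UNK>, and append <EOS>.
    unk = vocab["<UNK>"]
    return [vocab.get(w, unk) for w in text.lower().split()] + [vocab["<EOS>"]]

reviews = ["great coffee great price", "stale coffee"]
vocab = build_vocab(reviews)
print(text_to_ints("great coffee", vocab))   # → [4, 5, 3]
print(text_to_ints("yum", vocab))            # unknown word → [1, 3]
```

These integer vectors are what get padded into batches and looked up in the embedding matrix; raising `min_count` is the usual way to prune rare words into `<UNK>` and keep the vocabulary small.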
- Much credit to the Text-Summarization-with-Amazon-Reviews repository by Currie32
- The site WildML, for extraordinarily helpful introductions to many of the key concepts in the model
Copyright (c) 2017 Ashwin Kumar <ash.nkumar@gmail.com>
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.