We can take some text, analyse it, and generate a concise summary at the click of a button.
There are even some parameters to play with, so you can tweak it to your tastes but, honestly,
the defaults work just fine.
This is a simple app that wraps three different text summarisation algorithms:
- CodePlex.OpenTextSummarizer
- Open Text Summarizer
- Text Rank
The app is written in:
- C#
- Blazor
- .NET Core
There is a database to authenticate users and track usage. Database creation scripts are provided for:
- Microsoft SQL Server
- SQLite
There are also various websites which do similar summarisations:
- https://www.splitbrain.org/services/ots
- https://deepai.org/machine-learning-model/summarization
- https://deepai.org/machine-learning-model/text-tagging
git clone https://github.com/TrevorDArcyEvans/themane.git
cd themane
dotnet restore
dotnet build
cd Themane.Web
dotnet run
Navigate to http://localhost:5000/
Extractive summarisation works well; abstractive summarisation does not.
Text summarisation is the process of generating an abstract or summary of an article. There are currently two main types of summarisation:
- extractive is where the most relevant/important sentences are taken from the article and used directly in the summary.
- abstractive is where the summary is written in much the same way that a human would write it. This requires understanding of both the subject matter and language.
Extractive summarisation is well known, well understood and works reasonably well. There are several implementations available and most previous research has been on this technique.
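To make the extractive idea concrete, here is a minimal sketch in C#. It is not how the three wrapped libraries work internally; it is a deliberately simple word-frequency scorer, and the `ExtractiveSketch` class, the `Summarise` method and the tiny stop-word list are all hypothetical names and values chosen for illustration. Each sentence is scored by how often its words occur in the whole text, and the highest-scoring sentences are returned in their original order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of this repository.
public static class ExtractiveSketch
{
    // A tiny, assumed stop-word list; real summarisers use much larger ones.
    private static readonly HashSet<string> StopWords = new HashSet<string>
    {
        "a", "an", "and", "the", "is", "are", "of", "to", "in", "on", "it", "that", "this"
    };

    public static List<string> Summarise(string text, int sentenceCount)
    {
        // Naive sentence split: break on whitespace that follows '.', '!' or '?'.
        var sentences = Regex.Split(text.Trim(), @"(?<=[.!?])\s+")
            .Where(s => s.Length > 0)
            .ToList();

        // Count how often each non-stop word occurs in the whole text.
        var frequencies = Words(text)
            .GroupBy(w => w)
            .ToDictionary(g => g.Key, g => g.Count());

        // Score each sentence by the summed frequency of its words,
        // keep the top scorers, then restore the original sentence order.
        return sentences
            .Select((sentence, index) => new
            {
                Sentence = sentence,
                Index = index,
                Score = Words(sentence).Sum(w => frequencies.TryGetValue(w, out var n) ? n : 0)
            })
            .OrderByDescending(x => x.Score)
            .ThenBy(x => x.Index)
            .Take(sentenceCount)
            .OrderBy(x => x.Index)
            .Select(x => x.Sentence)
            .ToList();
    }

    // Lower-cases the text and yields its words, skipping stop words.
    private static IEnumerable<string> Words(string text) =>
        Regex.Matches(text.ToLowerInvariant(), @"[a-z0-9']+")
            .Cast<Match>()
            .Select(m => m.Value)
            .Where(w => !StopWords.Contains(w));
}
```

Calling `ExtractiveSketch.Summarise(article, 3)` would return the three highest-scoring sentences of `article`, copied verbatim. Because the sentences are taken directly from the source, the summary never contains wording that was not in the article, which is the defining property of the extractive approach.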
Abstractive summarisation is an emerging technique that uses artificial intelligence (AI) and machine learning (ML) methods. There has been a lot of recent activity, probably fuelled by the current interest in AI+ML. Whilst the technique shows a lot of promise, it still faces a number of obstacles:
- requires a deep understanding of AI
- requires very large training datasets
- training the AI algorithms is resource intensive
- the algorithms are difficult to train
- currently only works (if at all) with short articles
A very small selection of articles:
- https://www.cambridge.org/core/journals/natural-language-engineering/article/natural-language-generation-the-commercial-state-of-the-art-in-2020/BA2417D73AF29F8073FF5B611CDEB97F
- https://techxplore.com/news/2020-11-ai-tool-lengthy-papers-sentence.html
- https://www.salesforce.com/products/einstein/ai-research/tl-dr-reinforced-model-abstractive-summarization
- https://www.theverge.com/2017/5/14/15637588/salesforce-algorithm-automatically-summarizes-text-machine-learning-ai
- https://rare-technologies.com/text-summarization-in-python-extractive-vs-abstractive-techniques-revisited/
- https://www.technologyreview.com/s/607828/an-algorithm-summarizes-lengthy-text-surprisingly-well
- https://ai.googleblog.com/2016/08/text-summarization-with-tensorflow.html
- https://heartbeat.fritz.ai/extractive-text-summarization-using-neural-networks-5845804c7701
- https://machinelearningmastery.com/gentle-introduction-text-summarization/
- https://github.com/mathsyouth/awesome-text-summarization
A Google search will no doubt yield other articles.