Deepthi Girish Singapura
Varsha Chandan Bellara
Srushti Singatagere Basavaraj
Shiva Shankar Bidadi Nanjundaswamy
Title: Extractive Text Summarization with Lexical Chain, Modified TextRank, TextRank and TConspectus for news articles
Raw datasets to be downloaded from: http://mlg.ucd.ie/datasets/bbc.html
Article Tokenization
Case folding of tokens
Stop word removal
Lemmatization and removal of non-alphanumeric characters
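The preprocessing steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the project's actual code: the function name preprocess, the tiny STOP_WORDS set and the LEMMAS lookup table are all assumptions made for the example. In particular, the lookup table is only a stand-in for a real lemmatizer.

```python
import re
from typing import List

# Tiny illustrative stop-word list (assumption; a real run would use a fuller list).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were",
              "of", "in", "on", "and", "to", "for"}

# Toy lemma table standing in for a real lemmatizer (assumption).
LEMMAS = {"shares": "share", "rose": "rise",
          "profits": "profit", "reported": "report"}


def preprocess(article: str) -> List[str]:
    """Apply the four preprocessing steps to one article."""
    # 1. Tokenization: split into word tokens and single punctuation tokens.
    tokens = re.findall(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]", article)
    # 2. Case folding.
    tokens = [t.lower() for t in tokens]
    # 3. Stop word removal.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # 4a. Lemmatization (dictionary lookup as a placeholder).
    tokens = [LEMMAS.get(t, t) for t in tokens]
    # 4b. Remove non-alphanumeric characters and drop tokens left empty.
    tokens = [re.sub(r"[^a-z0-9]", "", t) for t in tokens]
    return [t for t in tokens if t]
```

For example, preprocess("The shares rose in 2004, and profits were reported.") yields the tokens share, rise, 2004, profit, report.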
LexRank:
python3 main.py
PageRank:
python3 main.py
Lexical Chain:
python3 main.py
Hybrid:
python3 summarizer.py
Sumy:
python3 extractSummary.py
Evaluate Summaries:
python3 documentComparator.py
Generated summaries based on compression ratio
Expressed documents as vectors
Compared all the algorithm-generated summaries using cosine similarity values
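The vector comparison above can be illustrated with a short sketch. It assumes each document is represented as a raw term-frequency vector over its preprocessed tokens. The weighting actually used by documentComparator.py is not stated here, so both the representation and the function name cosine_similarity are assumptions.

```python
import math
from collections import Counter
from typing import Iterable


def cosine_similarity(tokens_a: Iterable[str], tokens_b: Iterable[str]) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    vec_a, vec_b = Counter(tokens_a), Counter(tokens_b)
    # Dot product over the terms that both documents share.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as sharing nothing
    return dot / (norm_a * norm_b)
```

A score of 1.0 means the two token lists have identical term distributions and 0.0 means they share no terms. Multiplying by 100 gives a percentage on the same scale as the figures below.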
LexRank: 52.57 %
PageRank: 51.18 %
Lexical Chain: 74.36 %
Hybrid Algorithm: 61.45 %

LexRank: 63.12 %
PageRank: 61.91 %
Lexical Chain: 79.95 %
Hybrid Algorithm: 70.86 %

LexRank: 72.31 %
PageRank: 71.11 %
Lexical Chain: 84.21 %
Hybrid Algorithm: 77.91 %

LexRank: 78.59 %
PageRank: 78.13 %
Lexical Chain: 86.64 %
Hybrid Algorithm: 83.87 %