## In this project, the following steps are planned:
- News archives are collected from the website with a crawler built on Scrapy.
- The data is stored in CSV format so that it can be manipulated with Pandas.
- Different NLP algorithms from the Gensim library for Python are tried to measure the similarity between news items.
- For now, only the LSI (Latent Semantic Indexing) algorithm is implemented to compute the similarity between news items.