Webpage similarity assessment
This program analyses the text of every webpage on the Geoinsyssoft website and measures how similar the pages are to one another through a frequency analysis of the words they have in common. A similarity matrix is generated at the end. It uses Beautiful Soup to find the webpages and scikit-learn to do the frequency analysis.
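To make the idea of a word-frequency similarity matrix concrete, here is a minimal sketch using only the Python standard library. It is not the program's own code: it stands in for the scikit-learn step with plain word counts, and it assumes cosine similarity as the measure, which the description above does not specify. The function names and the sample page texts are hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word occurrences in one page's text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count Counters (assumed measure)."""
    # Only words present in both pages contribute to the dot product.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty page is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(pages):
    """Return a square list-of-lists of pairwise page similarities."""
    counts = [word_counts(page) for page in pages]
    return [[cosine_similarity(a, b) for b in counts] for a in counts]


# Hypothetical page texts standing in for text extracted from the website.
pages = [
    "big data training and hadoop courses",
    "hadoop courses and big data consulting",
    "contact us by email",
]
matrix = similarity_matrix(pages)

for row in matrix:
    print(" ".join(f"{value:.2f}" for value in row))
```

Each row and column corresponds to one page, so the diagonal compares a page with itself and the matrix is symmetric. Pages that share no words score 0.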