
This was my thesis project for my Bachelor's degree. It aimed to classify websites according to their topic, sorting them into a fixed set of categories (News, Sports...). It was a combination of the following:

  • An XML parser
  • A scraper + spider to crawl all the websites and extract their bare text
  • Several AI models, together with their scripts
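
To give a rough idea of the "bare text" step, here is a minimal sketch of how the visible text of a page could be pulled out of its HTML. It is not the code used in the thesis: the names `BareTextExtractor` and `extract_bare_text` are hypothetical, and it assumes that only the Python standard library is used and that the contents of `<script>` and `<style>` tags should be discarded.

```python
from html.parser import HTMLParser


class BareTextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping script/style contents."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a <script> or <style> element
        self._chunks = []     # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside the skipped elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return " ".join(self._chunks)


def extract_bare_text(html):
    """Return the visible text of an HTML string as one space-joined line."""
    parser = BareTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = (
        "<html><head><title>Match report</title>"
        "<style>p{color:red}</style></head>"
        "<body><p>The team won <b>3-1</b>.</p>"
        "<script>var x = 1;</script></body></html>"
    )
    print(extract_bare_text(sample))
```

Text obtained this way could then be passed to whichever model is being trained or evaluated. A real crawler would also need to fetch the pages and follow links, which this sketch leaves out.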