Recommending Web Articles

DataSet

JSON file with 3 columns (body, title, category) Each article falls into one of three categories:

In order to extract features from the text we can use (Bag of Words, N_grams or TF-IDF). And to cluster our categories into subcategories, we can either use the title, or the article body. So, I examined combinations of the above.
I could reach a classifier accuracy of 85% using TF-IDF on the body, and using the XGBoost Classifier.
For the Subcategory Clusters, I got different results, but saved those created using BOW on the titles, as I believe titles are more intuitive and representative in clustering, and I think BOW works best with titles than the TF-IDF method.

Built a Supervised Learning Model (Classifier) and achieved an accuracy of 85%, using the full article body.

Clustered each of the 3 category to SubCategories