Predicting-Tags-For-Stackoverflow

Suggest tags for a question posted on Stack Overflow based on the content of the question. The performance metric is the micro F1 score, and the models are trained with One-vs-Rest Logistic Regression and Linear SVM.
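To make the metric concrete, here is a minimal sketch of micro-averaged and macro-averaged F1 for multi-label tag prediction. It is not the notebook's code: the function names and the toy tag sets are illustrative assumptions. Micro F1 pools true positives, false positives, and false negatives over all tags before computing F1, so frequent tags dominate; macro F1 averages the per-tag F1 scores, so every tag counts equally.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """y_true and y_pred are lists of tag sets, one set per question."""
    tags = sorted(set().union(*y_true, *y_pred))
    counts = {t: [0, 0, 0] for t in tags}  # per tag: [tp, fp, fn]
    for true, pred in zip(y_true, y_pred):
        for t in pred & true:
            counts[t][0] += 1  # predicted and correct
        for t in pred - true:
            counts[t][1] += 1  # predicted but wrong
        for t in true - pred:
            counts[t][2] += 1  # missed
    # Micro: pool the counts over all tags, then compute one F1.
    micro = f1(*(sum(c[i] for c in counts.values()) for i in range(3)))
    # Macro: compute F1 per tag, then take the unweighted mean.
    macro = sum(f1(*c) for c in counts.values()) / len(counts) if counts else 0.0
    return micro, macro


# Toy data (hypothetical, not from the dataset).
y_true = [{"python", "pandas"}, {"java"}, {"python"}]
y_pred = [{"python"}, {"java", "python"}, {"python", "pandas"}]
micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro F1 = {micro:.4f}, macro F1 = {macro:.4f}")
```

On this toy data the rare tag `pandas` is never predicted correctly, which pulls the macro score down more than the micro score.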




Procedure:

1] We model with fewer data points (0.5M data points) and give more weight to the title.
2] We limit the tags to 500 only.
3] These two steps reduce the time needed to train the model.
4] Training on the whole dataset would require high computational resources.
5] With 500 tags we cover 90.956% of the questions.
6] Applying OneVsRest Logistic Regression on BOW features gives a macro F1 score of 0.3338.
7] TF-IDF performs better than BOW on this dataset.
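
Steps 1, 2 and 5 above can be sketched in plain Python. This is an illustration under stated assumptions, not the notebook's code: it assumes the 500 tags kept are the most frequent ones, that a question counts as covered when at least one of its tags is kept, and that the title is weighted by repeating it before the body. The helper names, the `title_weight=3` default, and the toy questions are all hypothetical.

```python
from collections import Counter


def top_k_tags(tag_lists, k):
    """Return the k most frequent tags (assumed selection rule)."""
    counts = Counter(tag for tags in tag_lists for tag in set(tags))
    return [tag for tag, _ in counts.most_common(k)]


def coverage(tag_lists, kept_tags):
    """Fraction of questions that keep at least one tag (assumed definition)."""
    kept = set(kept_tags)
    covered = sum(1 for tags in tag_lists if kept & set(tags))
    return covered / len(tag_lists)


def weighted_text(title, body, title_weight=3):
    """Give the title more weight by repeating it (hypothetical scheme)."""
    return " ".join([title] * title_weight + [body])


# Toy data (hypothetical); the real run keeps 500 tags.
questions = [
    ["python", "pandas"],
    ["java"],
    ["python", "java"],
    ["python"],
    ["rust"],
]
kept = top_k_tags(questions, k=2)
print(kept)
print(coverage(questions, kept))
print(weighted_text("Merge two frames", "How do I merge them?"))
```

The weighted text can then be fed to a BOW or TF-IDF vectorizer, so that title terms receive higher counts than body terms.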