This project analyzes the performance of Naïve Bayes and Logistic Regression on the 20newsgroups dataset. In this dataset, each document is a posting that was made to one of 20 different newsgroups.
I trained both classifiers on 10%, 30%, 50% and 100% of the training data and measured each one's accuracy on the test set. At the end, I plotted these test-set accuracies on a graph (the values in the plotting code are hardcoded).
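The experiment described above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. The function names, the TF-IDF features, the random subsampling with a fixed seed, and max_iter=1000 are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB


def accuracy_by_train_fraction(train_texts, train_labels, test_texts, test_labels,
                               fractions=(0.1, 0.3, 0.5, 1.0), seed=0):
    """Return {classifier name: [test accuracy for each training fraction]}.

    Hypothetical helper: the feature extraction and sampling scheme are
    assumptions, not necessarily what the notebook does.
    """
    train_labels = np.asarray(train_labels)
    rng = np.random.default_rng(seed)
    # Shuffle once so that each smaller subset is contained in the larger ones.
    order = rng.permutation(len(train_texts))
    results = {"Naive Bayes": [], "Logistic Regression": []}

    for fraction in fractions:
        n = max(1, int(round(fraction * len(train_texts))))
        idx = order[:n]
        subset_texts = [train_texts[i] for i in idx]
        subset_labels = train_labels[idx]

        # Fit the vocabulary on the training subset only, then reuse it for the test set.
        vectorizer = TfidfVectorizer()
        X_train = vectorizer.fit_transform(subset_texts)
        X_test = vectorizer.transform(test_texts)

        models = {
            "Naive Bayes": MultinomialNB(),
            "Logistic Regression": LogisticRegression(max_iter=1000),
        }
        for name, model in models.items():
            model.fit(X_train, subset_labels)
            predictions = model.predict(X_test)
            results[name].append(accuracy_score(test_labels, predictions))
    return results


def run_on_20newsgroups():
    """Run the comparison on the real dataset (downloads it on first use)."""
    train = fetch_20newsgroups(subset="train")
    test = fetch_20newsgroups(subset="test")
    return accuracy_by_train_fraction(train.data, train.target, test.data, test.target)
```

Calling run_on_20newsgroups() returns one accuracy per fraction for each classifier. Those lists could be passed to matplotlib directly, so the plotted values would not need to be hardcoded.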
This project is written in a Jupyter Notebook and uses the numpy, matplotlib.pyplot, seaborn and scikit-learn modules. To run the code you need Jupyter Notebook. If it is not installed, one way to get it is to install Anaconda, which includes it. Then type "jupyter notebook" in the terminal and open the notebook file.
The same code is also included as a .py file.