This project classifies text documents using several machine learning algorithms. The goal is to predict the category (label) of a document from its content. Three algorithms are implemented and evaluated for this task: K-Nearest Neighbors (KNN), Multinomial Naive Bayes, and Random Forest.
The project uses the scikit-learn library in Python for machine learning and natural language processing (NLP) tasks. scikit-learn's pipeline functionality chains data preprocessing, feature extraction, and model training into a single end-to-end workflow.
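As a rough illustration of the pipeline approach, here is a minimal sketch that wraps each of the three classifiers in a scikit-learn `Pipeline`. The toy corpus, the label names, the `build_pipeline` helper, the choice of `TfidfVectorizer` for feature extraction, and all hyperparameters are assumptions made for this example. They are not taken from the project's actual code or data.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus; a real project would load its own dataset.
texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "the coach praised the players after the game",
    "the new laptop has a faster processor",
    "the software update fixes a security bug",
    "the phone ships with a better camera sensor",
]
labels = ["sports", "sports", "sports", "tech", "tech", "tech"]

# The three classifiers named in the overview; hyperparameters are illustrative.
classifiers = {
    "knn": KNeighborsClassifier(n_neighbors=3),
    "naive_bayes": MultinomialNB(),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}


def build_pipeline(classifier):
    """Chain feature extraction (TF-IDF, an assumption) with a classifier."""
    return Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", classifier),
    ])


new_docs = ["the players celebrated the goal", "the processor update is faster"]

predictions = {}
for name, clf in classifiers.items():
    pipeline = build_pipeline(clf)
    pipeline.fit(texts, labels)  # fits the vectorizer and the model together
    predictions[name] = pipeline.predict(new_docs).tolist()
    print(name, predictions[name])
```

Because the vectorizer lives inside the pipeline, it is fitted only on the training texts and then reused unchanged at prediction time, which keeps raw strings as the only input the caller has to handle.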