/MachineLearning-CW

CM2604 Machine Learning Coursework

Primary language: Jupyter Notebook

--------------------Task(s) – content---------------------------------------

You are expected to solve a simple binary classification problem: labelling emails as spam or non-spam based on the words they contain. The dataset comes from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Spambase). The problem must be solved with two machine learning models, one based on K-Nearest Neighbors (KNN) and one based on Decision Trees.
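
A minimal sketch of reading the dataset, assuming the `spambase.data` file from the linked page has already been downloaded locally. The file is comma-separated, with 57 numeric attributes followed by a 0/1 class label (1 = spam). The function name `load_spambase` and the file path are illustrative and not part of the brief.

```python
import csv

N_FEATURES = 57  # Spambase: 57 numeric attributes, then one 0/1 class label


def load_spambase(path):
    """Read a local copy of spambase.data into (features, labels) lists."""
    features, labels = [], []
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if not row:  # skip blank lines
                continue
            if len(row) != N_FEATURES + 1:
                raise ValueError(
                    f"expected {N_FEATURES + 1} columns, got {len(row)}"
                )
            features.append([float(value) for value in row[:-1]])
            labels.append(int(row[-1]))  # 1 = spam, 0 = non-spam
    return features, labels
```

With the file in the working directory, `X, y = load_spambase("spambase.data")` would return one feature row and one label per email.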

The metadata for the corpus (class distribution, attributes, attribute statistics, and so on) is available at the link above. Prepare the dataset with preprocessing suited to each proposed model, and implement the models with appropriate libraries, frameworks, and tools. Compare the implemented models using suitable evaluation metrics, and report experimental results for the experimental settings of both models.
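
One possible shape for the preparation, implementation, and comparison steps is sketched below with scikit-learn. It is an illustration under stated assumptions, not the coursework solution: the helper name `compare_models`, the 80/20 stratified split, `n_neighbors=5`, and `max_depth=10` are all arbitrary choices, and the demo at the bottom runs on synthetic data with Spambase's shape rather than on the real corpus.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, seed=42):
    """Train KNN and a decision tree on one split; return test metrics per model."""
    # Stratify so both splits keep the same spam / non-spam ratio.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    models = {
        # KNN is distance-based, so features are standardised first.
        "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        # Trees split on thresholds and do not need scaling.
        "decision_tree": DecisionTreeClassifier(max_depth=10, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predicted = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, predicted),
            "precision": precision_score(y_test, predicted),
            "recall": recall_score(y_test, predicted),
            "f1": f1_score(y_test, predicted),
        }
    return results


if __name__ == "__main__":
    # Synthetic stand-in with Spambase's shape (57 features); NOT the real data.
    X, y = make_classification(
        n_samples=1000, n_features=57, n_informative=10, random_state=0
    )
    for name, scores in compare_models(X, y).items():
        print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Precision and recall are reported alongside accuracy because the two classes are not evenly balanced and because wrongly flagging a legitimate email as spam is usually the costlier error. The hyperparameters here are placeholders; a cross-validated search such as scikit-learn's `GridSearchCV` is one way to choose them.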