Titanic-And-Spam-Challenge

I have applied supervised learning algorithms such as regression, decision trees, KNN, SVM, naive Bayes, and random forests to solve business problems.

Primary language: Jupyter Notebook

Titanic And Spam Project

I have applied supervised learning algorithms such as regression, decision trees, KNN, naive Bayes (including Bernoulli naive Bayes), and XGBoost to solve business problems.

Defining the Question

Defining the Data Analytic Question

Randomly partition each dataset into two parts, i.e., an 80-20 train-test split.

For dataset 1, the provided test set has no labels, so we split the train set further into train and test data and then perform K-nearest neighbours classification on that split.
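The split-then-classify idea above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code (which may well rely on a library such as scikit-learn). The helper names `split_80_20`, `knn_predict`, and `accuracy`, the choice of `k=5`, and the synthetic two-cluster data are all assumptions made for the example; none of the values are Titanic data.

```python
import math
import random
from collections import Counter


def split_80_20(rows, seed=0):
    """Shuffle a copy of the rows and return (train, test) in an 80-20 ratio."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, features, k=5):
    """Predict a label by majority vote among the k nearest training rows."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of held-out rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


# Synthetic stand-in for a labelled training file: (features, label) pairs.
# The numbers are made up for illustration only.
rng = random.Random(42)
labelled = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(50)]
labelled += [((rng.gauss(5, 1), rng.gauss(5, 1)), 1) for _ in range(50)]

# Because the real test file has no labels, hold out 20% of the labelled rows.
train_rows, test_rows = split_80_20(labelled)
holdout_accuracy = accuracy(train_rows, test_rows, k=5)
print(len(train_rows), len(test_rows))
print(f"hold-out accuracy: {holdout_accuracy:.2f}")
```

Holding out part of the labelled data is what makes it possible to score the classifier at all; the unlabelled test file can only be used for final predictions.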

Recording the Experimental Design

Project 1: Predicting Survival in the Titanic Dataset

Loading the Dataset

Exploratory Data Analysis

Visualization

Data Cleaning

Modelling: K-Nearest Neighbours Classifier (KNN)

Optimization Techniques for KNN

Challenging the Model: Using XGBoost

Conclusion

Project 2: Predicting Whether an Email Is Spam or Not Spam

Loading the Dataset

Exploratory Data Analysis

Visualization

Modelling: Naive Bayes Classifier; GaussianNB

Optimization Techniques for the Gaussian Naive Bayes Classifier

Modelling: Naive Bayes Classifier; BernoulliNB

Recommendations

Challenging the Solution: XGBoost

Conclusion