Titanic-dataset

Uses the Titanic dataset to predict which passengers survived. Workflow of the project (work still in progress):

Loading Libraries
a. NumPy
b. Pandas
c. Matplotlib and seaborn
d. sklearn (scikit-learn) for data preprocessing, algorithms, and accuracy scoring

Exploratory Data Analysis
-Explored the data: the shape (number of rows and columns) of the training and testing sets, and which columns contain missing values.

-Applied dummy encoding to the categorical data.

-Certain algorithms only work well when features are on a comparable scale, so I standardized the data using StandardScaler.
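The exploration and preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the small DataFrame is made up (only the column names follow the usual Titanic layout), and filling missing values with the median and mode is an assumption, since the steps above only mention finding the missing values.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Tiny made-up sample with Titanic-style column names (illustrative only,
# not the real dataset).
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 1, 2, 3],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0, 27.0, None],
    "Fare": [7.25, 71.28, 7.92, 53.10, 11.13, 8.05],
    "Embarked": ["S", "C", "S", "S", None, "Q"],
})

# Exploratory checks: shape and missing values per column.
print(df.shape)
missing = df.isnull().sum()
print(missing)

# Assumption: fill numeric gaps with the median and categorical gaps
# with the most frequent value. The original notebook may do this differently.
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# Dummy encoding of the categorical columns; drop_first avoids a
# redundant column per category.
encoded = pd.get_dummies(df, columns=["Sex", "Embarked"], drop_first=True)

# Standardize the continuous columns to zero mean and unit variance.
num_cols = ["Age", "Fare"]
scaler = StandardScaler()
encoded[num_cols] = scaler.fit_transform(encoded[num_cols])
print(encoded.head())
```

With real data, the scaler would normally be fitted on the training split only and then applied to the test split, so that no information from the test set leaks into preprocessing.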

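Training and Testing of Data
-Imported KNN, GaussianNB, DecisionTree and other classifiers, along with train_test_split from sklearn's model_selection module. Holding out a test set means the models are scored on data they were not trained on, which helps detect overfitting.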
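A rough sketch of this training and testing step is shown below. It uses synthetic data from make_classification as a stand-in for the preprocessed Titanic features, and the split ratio, random seeds, and default hyperparameters are all assumptions rather than the notebook's settings, so the printed accuracies will not match the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the encoded Titanic features and survival labels.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)

# Hold out 20% of the rows so each model is scored on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

models = {
    "KNN": KNeighborsClassifier(),
    "GaussianNB": GaussianNB(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
}

# Train each classifier and record its accuracy on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test_s))
    print(f"{name}: {scores[name]:.2f}")
```

Comparing the test accuracy with the accuracy on the training split is a simple way to spot a model that has memorized the training data.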

Optional: Data Visualization
-Tried to make the notebook more interactive.

Work in progress! The accuracy is 0.77 so far, and I will be improving it.

To get a better understanding of the workflow of a Machine Learning project, have a read:

digs1998/Titanic-Survivors-predictions