In this project, I go through the process of finding the factors that affect survival prediction in the Titanic disaster. I use the well-known Titanic dataset, which is available on Kaggle [https://www.kaggle.com/c/titanic/data].
- Which features contribute to a higher survival rate?
- How do Age, Sex, Embarked and Pclass correlate with survival?
- Which algorithm achieves the highest accuracy?
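The first two questions can be explored with grouped survival rates and a correlation table. The snippet below is only a sketch of that idea: the `toy` frame is a made-up sample that borrows the Kaggle column names (`Survived`, `Sex`, `Pclass`, `Age`, `Embarked`), not real passenger data, and the 0/1 encoding of `Sex` is an assumption made for illustration.

```python
import pandas as pd

# Made-up rows using the Kaggle column names; NOT real passenger records.
toy = pd.DataFrame({
    "Survived": [1, 0, 1, 0, 1, 0, 1, 0],
    "Sex": ["female", "male", "female", "male",
            "female", "male", "male", "female"],
    "Pclass": [1, 3, 2, 3, 1, 2, 3, 3],
    "Age": [29.0, 35.0, 4.0, 40.0, 58.0, 21.0, 30.0, 18.0],
    "Embarked": ["C", "S", "S", "S", "C", "Q", "S", "S"],
})

# The mean of a 0/1 column is the survival rate within each group.
rate_by_sex = toy.groupby("Sex")["Survived"].mean()
rate_by_class = toy.groupby("Pclass")["Survived"].mean()
rate_by_port = toy.groupby("Embarked")["Survived"].mean()

# Pearson correlation needs numbers, so Sex is mapped to 0/1 first
# (this encoding is an illustrative assumption).
encoded = toy.assign(Sex=toy["Sex"].map({"male": 0, "female": 1}))
corr_with_survival = encoded[["Survived", "Sex", "Pclass", "Age"]].corr()["Survived"]

print(rate_by_sex, rate_by_class, rate_by_port, corr_with_survival, sep="\n\n")
```

On the real dataset the same calls would run on the frame loaded from Kaggle's training file instead of `toy`.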
- sklearn
- numpy
- pandas
- matplotlib
- seaborn
To avoid operating system warnings, the project also imports the 'os' library. It is part of the Python standard library, so it does not need to be installed with pip.
- import os (importing)
- Import the Libraries
- Load the data
- Data Wrangling to find insights in the data
- Perform Data Pre-processing to handle missing values and categorical values
- Algorithm implementation
- Algorithm Optimization
- Hyperparameter tuning to increase the accuracy of the algorithm
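The steps above can be sketched as a single scikit-learn pipeline. This is a minimal illustration under stated assumptions, not the project's actual code: the feature lists, the median and most-frequent imputation strategies, logistic regression as the classifier, the `C` grid, and the helper names `build_search` and `fit_on` are all hypothetical choices made here.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Assumed feature split, using the Kaggle column names.
NUMERIC = ["Age", "Pclass"]
CATEGORICAL = ["Sex", "Embarked"]


def build_search(cv=5):
    """Return a grid search over a preprocessing + classifier pipeline."""
    # Pre-processing: fill missing numbers with the median,
    # fill missing categories with the mode, then one-hot encode.
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", SimpleImputer(strategy="median"), NUMERIC),
        ("cat", categorical, CATEGORICAL),
    ])
    # Algorithm: logistic regression is a stand-in, not the source's choice.
    model = Pipeline([
        ("preprocess", preprocess),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    # Hyperparameter tuning: try a few regularization strengths.
    param_grid = {"clf__C": [0.1, 1.0, 10.0]}
    return GridSearchCV(model, param_grid, cv=cv, scoring="accuracy")


def fit_on(df, cv=5):
    """Fit the grid search on a frame that has the columns listed above."""
    search = build_search(cv=cv)
    search.fit(df[NUMERIC + CATEGORICAL], df["Survived"])
    return search
```

To try it on the Kaggle data, one would load the training file with `pd.read_csv` and pass the resulting frame to `fit_on`; `search.best_params_` and `search.best_score_` then report the selected setting and its cross-validated accuracy.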
I have written a blog post that explains how the features relate to the survival rate and how they correlate with one another. You can find it on Medium. [https://medium.com/@silicon.smile1/factors-affect-the-survival-prediction-in-the-titanic-disaster-a0601ef6cce8]