Titanic-Analysis

My attempt at the Kaggle Titanic competition.

About the competition

The sinking of the Titanic is one of the most infamous shipwrecks in history.

On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.

While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others.

In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e. name, age, gender, socio-economic class, etc.).

More info at https://www.kaggle.com/c/titanic

Description

The notebook contains a detailed analysis of the Titanic dataset in Python and achieves 78.229% accuracy on the Kaggle leaderboard, which gave me a rank of 3651 out of 17781.
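To make that description concrete, here is a minimal Python sketch (assuming pandas and scikit-learn) of the kind of workflow the notebook follows: load the training data, apply some simple preprocessing, and compare a couple of the models listed below with cross-validation. The feature choices and hyperparameters shown are illustrative assumptions, not necessarily the notebook's exact choices.

```python
# A minimal sketch of the workflow in titanic.ipynb: load train.csv,
# do simple preprocessing, and compare models with cross-validation.
# Features and hyperparameters here are assumptions, not the notebook's
# exact settings.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

train = pd.read_csv("train.csv")

# Keep a few well-known columns, one-hot encode Sex, and fill missing
# numeric values with column medians.
features = ["Pclass", "Sex", "Age", "Fare"]
X = pd.get_dummies(train[features], columns=["Sex"])
X = X.fillna(X.median())
y = train["Survived"]

for name, model in [
    ("logistic regression", LogisticRegression(max_iter=1000)),
    ("random forest", RandomForestClassifier(n_estimators=200, random_state=42)),
]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```

The notebook additionally tries CatBoost, LightGBM, XGBoost, an SVM, and a decision tree, as the prediction files listed below show.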

File-Description

titanic.ipynb - The main notebook for the competition, written in Python. It contains exploratory data analysis, data preprocessing, and the models applied to the dataset, along with their accuracy according to Kaggle.

train.csv - The training dataset (detailed description in the notebook).

test.csv - The test dataset (detailed description in the notebook).

catboostsub.csv - The CatBoost model prediction file.

decisiontreesub.csv - The decision tree model prediction file.

lgbsub.csv - The LightGBM model prediction file.

logisticsub.csv - The logistic regression model prediction file.

randomforestsub.csv - The random forest model prediction file.

svmsub.csv - The support vector machine model prediction file.

xgbsub.csv - The XGBoost model prediction file. (A sketch of how these submission files are generated follows this list.)
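Each of the prediction files above is a two-column CSV in Kaggle's submission format (PassengerId, Survived). As a hedged illustration, the sketch below shows how such a file could be generated with a random forest; the preprocessing and model settings are assumptions and may not match the notebook.

```python
# Hedged sketch of how one of the *sub.csv files (e.g. randomforestsub.csv)
# could be produced: fit on train.csv, predict on test.csv, and write the
# two-column (PassengerId, Survived) file Kaggle expects.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

features = ["Pclass", "Sex", "Age", "Fare"]

def prepare(df):
    """One-hot encode Sex and fill missing numeric values with medians."""
    X = pd.get_dummies(df[features], columns=["Sex"])
    return X.fillna(X.median())

X_train = prepare(train)
# Align columns in case a dummy category is missing from one split.
X_test = prepare(test).reindex(columns=X_train.columns, fill_value=0)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, train["Survived"])

submission = pd.DataFrame({
    "PassengerId": test["PassengerId"],
    "Survived": model.predict(X_test),
})
submission.to_csv("randomforestsub.csv", index=False)
```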

If you have ideas to improve my attempt, feel free to modify it and help me out.