An analysis and survival prediction for Kaggle's popular beginner Titanic dataset, completed in R. I did some feature engineering and used a random forest model to predict survival for the test set. This was my first Kaggle submission and my first real data science project, so I wasn't focused on scoring well and likely overfit the model.
Achieves a public test score of 76.55% (top 73%).
Rmd output: https://xorana.github.io/titanic_survival_analysis/
Source: https://www.kaggle.com/c/titanic
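
For readers who want a feel for the feature engineering and random forest steps described above, here is a minimal sketch in R. The `add_features` helper, the `Title` and `FamilySize` features, the formula, and the `ntree` and seed values are all assumptions for illustration and may not match the actual analysis; the Rmd output linked above has the real code. The model fit runs only if the `randomForest` package is installed and Kaggle's `train.csv` is in the working directory.

```r
# Hypothetical helper: derive two commonly used Titanic features.
# Title and FamilySize are illustrative assumptions, not necessarily
# the features used in the actual analysis.
add_features <- function(df) {
  # Title is the text between the comma and the first period in Name,
  # e.g. "Braund, Mr. Owen Harris" -> "Mr"
  df$Title <- trimws(sub("^[^,]*,\\s*([^.]*)\\..*$", "\\1", df$Name))
  # Family size = siblings/spouses + parents/children + the passenger
  df$FamilySize <- df$SibSp + df$Parch + 1
  df
}

# A few rows in the shape of Kaggle's train.csv, just to show the helper
toy <- data.frame(
  Survived = c(0, 1, 1),
  Pclass = c(3, 1, 3),
  Name = c("Braund, Mr. Owen Harris",
           "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
           "Heikkinen, Miss. Laina"),
  Sex = c("male", "female", "female"),
  SibSp = c(1, 1, 0),
  Parch = c(0, 0, 0),
  stringsAsFactors = FALSE
)
toy <- add_features(toy)
print(toy[, c("Name", "Title", "FamilySize")])

# Fit a random forest only when the package and the data file are available
if (requireNamespace("randomForest", quietly = TRUE) && file.exists("train.csv")) {
  train <- add_features(read.csv("train.csv", stringsAsFactors = FALSE))
  train$Survived <- factor(train$Survived)  # factor response -> classification
  train$Sex <- factor(train$Sex)
  train$Title <- factor(train$Title)
  set.seed(42)  # arbitrary seed, for reproducibility only
  fit <- randomForest::randomForest(
    Survived ~ Pclass + Sex + SibSp + Parch + FamilySize + Title,
    data = train, ntree = 500
  )
  print(fit)  # printed summary includes the out-of-bag error estimate
}
```

The out-of-bag error printed by the fit is a quick check against overfitting: if it is much worse than the training accuracy, the model is probably memorizing the training set.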