The Titanic challenge on Kaggle is a competition in which the task is to predict whether a given passenger survived, based on a set of variables describing that passenger, such as age, sex, or passenger class.
My solution, written as a Jupyter notebook, goes through the basic steps of a data science pipeline:
- Exploratory data analysis with visualizations
- Data cleaning
- Feature engineering
- Modeling
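
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the notebook's actual code: it assumes pandas and scikit-learn, uses the column names from Kaggle's `train.csv` (`Survived`, `Pclass`, `Sex`, `Age`, `SibSp`, `Parch`), and runs on a handful of made-up placeholder rows so that it is self-contained. The median imputation, the `IsFemale` and `FamilySize` features, and the logistic regression model are illustrative choices; plotting is left out to keep the sketch short.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy placeholder rows using the column names of Kaggle's train.csv.
# These are NOT real passengers; in practice you would load the data with
# df = pd.read_csv("train.csv")
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 0, 1, 0, 1],
    "Pclass":   [3, 1, 3, 3, 2, 1, 3, 2],
    "Sex":      ["male", "female", "female", "male",
                 "male", "female", "male", "female"],
    "Age":      [22.0, 38.0, None, 35.0, None, 54.0, 2.0, 27.0],
    "SibSp":    [1, 1, 0, 0, 0, 0, 3, 0],
    "Parch":    [0, 0, 0, 0, 0, 0, 1, 2],
})

# 1. Exploratory data analysis: survival rate per sex and
#    number of missing values per column (computed before cleaning).
survival_by_sex = df.groupby("Sex")["Survived"].mean()
missing_counts = df.isna().sum()

# 2. Data cleaning: fill missing ages with the median age (assumed strategy).
df["Age"] = df["Age"].fillna(df["Age"].median())

# 3. Feature engineering: encode sex as 0/1 and derive a family-size feature
#    (hypothetical features chosen for illustration).
df["IsFemale"] = (df["Sex"] == "female").astype(int)
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# 4. Modeling: fit a simple classifier on the engineered features.
features = ["Pclass", "Age", "IsFemale", "FamilySize"]
model = LogisticRegression(max_iter=1000)
model.fit(df[features], df["Survived"])
predictions = model.predict(df[features])
```

On the real dataset, the model would be evaluated on held-out data (for example with cross-validation) rather than on the rows it was trained on.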