Titanic data model

A pipeline that predicts whether Titanic passengers survived. For a deeper look at the problem space and data set, see Kaggle.

This repo is an opportunity both to build a great model and to explore what a great data pipeline can look like.

Quick start

# Set up Anaconda virtual environment
conda env create -f environment.yml --force

# Activate Anaconda virtual environment
source activate example

# Run code
cd bin/
python main.py

Repo structure

  • bin/main.py: Code entry point
  • conf/conf.py: Configuration file for project
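To illustrate how an entry point and a configuration file can fit together, here is a minimal sketch. It is not the code in bin/main.py or conf/conf.py. The setting names (train_path, target_column, random_seed), the helper functions, and the toy rows are all assumptions made for illustration. Only the Survived label column comes from the Kaggle data set.

```python
"""Hypothetical sketch of a config-driven pipeline entry point."""
from types import SimpleNamespace

# Stand-in for conf/conf.py. Every setting name here is an assumption.
conf = SimpleNamespace(
    train_path="data/train.csv",  # assumed location of the training file
    target_column="Survived",     # label column in the Kaggle Titanic data
    random_seed=0,
)


def extract(conf):
    """Return raw rows. A real pipeline would read conf.train_path instead."""
    # Toy rows for illustration only, not real passenger records.
    return [
        {"Sex": "female", "Age": 29.0, "Survived": 1},
        {"Sex": "male", "Age": None, "Survived": 0},
    ]


def transform(rows, conf):
    """Turn raw rows into a single feature plus the label."""
    return [
        {"is_female": int(row["Sex"] == "female"), "label": row[conf.target_column]}
        for row in rows
    ]


def predict(features):
    """Trivial baseline: predict survival for female passengers only."""
    return [f["is_female"] for f in features]


def main(conf):
    """Run extract -> transform -> predict and return accuracy on the rows."""
    features = transform(extract(conf), conf)
    predictions = predict(features)
    labels = [f["label"] for f in features]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


if __name__ == "__main__":
    print(main(conf))
```

Keeping settings in one object means each step receives its paths and column names as arguments rather than hard-coding them, which makes the steps easier to test in isolation.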

Python Environment

Python code in this repo uses packages that are not part of the standard library. To make sure you have all of them, install Anaconda, then create and activate the environment described in environment.yml (see the instructions under "Use environment from file" and "Change environments (activate/deactivate)").
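As a quick sanity check after activating the environment, a short script can report which packages are not importable. This is a sketch. The package names in REQUIRED are assumptions, since the contents of environment.yml are not listed here, so replace them with the packages your environment.yml actually declares.

```python
"""Report which packages cannot be imported in the active environment."""
from importlib.util import find_spec

# Hypothetical package list; substitute the names from environment.yml.
REQUIRED = ["numpy", "pandas"]


def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```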

Contact

Feel free to contact me at 13herger <at> gmail <dot> com