Build a model that predicts the probability that a driver will initiate an auto insurance claim in the next year.
Download the data from the Kaggle competition and save it in /data.
- Understanding of the business problem to be solved
- Explanation for decisions made in formulating your solution
- Code quality
- Packaging of solution
- Evaluation of solution vs. baseline
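As an illustration of the last point, here is a minimal sketch of comparing a model's predicted claim probabilities against a naive baseline. It assumes ROC AUC as the metric, which this README does not specify; the labels, the scores, and the constant-rate baseline are made-up illustrative values, not results from this project.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation, with tied scores
    sharing their average rank."""
    n = len(y_true)
    order = sorted(range(n), key=lambda i: y_score[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(y_true)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Illustrative data only: 1 = driver filed a claim, 0 = no claim.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]

# Hypothetical naive baseline: predict the overall claim rate for every driver.
claim_rate = sum(y_true) / len(y_true)
baseline_scores = [claim_rate] * len(y_true)

# Hypothetical model output: one predicted claim probability per driver.
model_scores = [0.1, 0.3, 0.8, 0.2, 0.4, 0.05, 0.6, 0.9]

baseline_auc = roc_auc(y_true, baseline_scores)
model_auc = roc_auc(y_true, model_scores)
print(f"baseline AUC={baseline_auc:.3f}  model AUC={model_auc:.3f}")
```

A constant prediction carries no ranking information, so it scores an AUC of 0.5. Any improvement the model shows over that figure is what the comparison is meant to surface.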
Requires Python 3.9 (any patch version) installed in $PythonPath.
Then use pipenv to install the dependencies:
$PythonPath\python.exe -m pip install pipenv
set PIPENV_VENV_IN_PROJECT="enabled"
pipenv install -d
pipenv shell
cd src
To use a conda environment and make it discoverable by Jupyter (for the EDA notebook, or if you prefer conda over pipenv), run the following:
conda env update -f env.yml
conda activate ami
python -m ipykernel install --user --name ami --display-name "AMI (python 3.9)"
Run the unit tests locally:
python -m pytest tests/unit
Run Pipeline.ipynb to execute the model building and scoring notebooks.
Note: for scoring to work, you must first run the Baseline model, which takes over 2 hours.
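For a run this long, it can help to execute the notebook non-interactively. The sketch below assumes that `jupyter nbconvert` is available in the active environment and that Pipeline.ipynb is in the current directory; neither is confirmed by this README. The helper names `build_nbconvert_command` and `run_notebook` are hypothetical and not part of this repository.

```python
import subprocess
from typing import List


def build_nbconvert_command(notebook: str) -> List[str]:
    """Build a `jupyter nbconvert` command that executes a notebook in place.

    The per-cell timeout is disabled (-1) so long-running cells are not
    interrupted partway through.
    """
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        "--ExecutePreprocessor.timeout=-1",
        notebook,
    ]


def run_notebook(notebook: str) -> None:
    """Execute the notebook; raises CalledProcessError if any cell fails."""
    subprocess.run(build_nbconvert_command(notebook), check=True)


# Assumed notebook location; adjust the path to match the repository layout.
command = build_nbconvert_command("Pipeline.ipynb")
print(" ".join(command))
# To actually start the run: run_notebook("Pipeline.ipynb")
```

Because `--inplace` overwrites the notebook with its executed outputs, you may prefer to run it on a copy if the original needs to stay unchanged.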