Dental-Dataset-Challenge

EDA, visualization and Modeling of Dental Claims Data

Primary Language: Jupyter Notebook

About

Apart from the final results, we also want to see your work environment (raw data handling, code, etc.), so be ready to share your screen and walk us through your work and results over Zoom. I encourage you to ask questions during the interview, whether business or technical!

In the attached dataset, you’ll find dental claims on a Member level for a benefit year. A data_summary tab is available to provide general guidance on the features.

The desired outcome of this data challenge is to:

  • Perform exploratory data analysis
  • Find patterns and correlations where present
  • Train a machine learning model to predict the claim amount for the Group Benefit Member ‘sum_paid_amt_after_cob’
  • Explain the model with feature importance and feature impact
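The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the approach used in the notebooks: the real dataset is not reproduced here, so the block generates synthetic data, and every column name except `sum_paid_amt_after_cob` (`member_age`, `num_claims`, `noise_feature`) is a hypothetical stand-in. The random forest and permutation importance are assumed choices; any regressor and explanation method could be swapped in.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

TARGET = "sum_paid_amt_after_cob"  # target column named in the challenge


def make_synthetic_claims(n_rows=500, seed=0):
    """Build a toy member-level table; all feature names are hypothetical."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "member_age": rng.integers(18, 80, n_rows),
        "num_claims": rng.poisson(3, n_rows),
        "noise_feature": rng.normal(size=n_rows),
    })
    # Assumed relationship, for illustration only.
    df[TARGET] = (40.0 * df["num_claims"]
                  + 1.5 * df["member_age"]
                  + rng.normal(0, 10, n_rows))
    return df


def run_pipeline(df, target=TARGET, seed=0):
    """Correlations, a fitted regressor's test R^2, and feature importance."""
    # EDA: correlation of each numeric feature with the target.
    numeric = df.select_dtypes("number")
    correlations = (numeric.corr()[target]
                    .drop(target)
                    .sort_values(ascending=False))

    # Modeling: hold out a test set, then fit a random forest.
    X = df.drop(columns=[target])
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed)
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    r2 = model.score(X_test, y_test)

    # Explanation: mean drop in test R^2 when each feature is shuffled.
    perm = permutation_importance(
        model, X_test, y_test, n_repeats=5, random_state=seed)
    importance = (pd.Series(perm.importances_mean, index=X.columns)
                  .sort_values(ascending=False))
    return correlations, r2, importance


if __name__ == "__main__":
    correlations, r2, importance = run_pipeline(make_synthetic_claims())
    print(correlations)
    print(f"Test R^2: {r2:.3f}")
    print(importance)
```

Permutation importance is computed on the held-out split so that it reflects what the model relies on for unseen members. It ranks features by importance but does not show the direction of their effect; for feature impact, a complementary tool such as partial dependence plots would be needed.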

You can use any tool or program; however, please keep in mind that we generally work in Python and present results and visualizations in Tableau (you can get a trial version if you choose to use Tableau). This is just a suggestion; any other tool (such as Excel, PowerPoint, or R) would also be acceptable.

Work

EDA Notebook: Binder

Modelling Notebook: Binder