DATA CHALLENGE

Our goal is to understand how you approach and think about problems, and how you work with data. Although the deliverable includes a machine learning model, the evaluation goes much deeper than that: we care about how you get to that final state, your logic, and your code.

This repository contains two years' worth of Lending Club loan files, stored as gzipped CSVs in the data/ directory. The files are quarterly and contain data on loans that Lending Club has issued (date, amount, term, interest rate), metadata about the customer who took out each loan (such as employment, annual income, and FICO score), and the loan status. A data dictionary is stored in the docs/ directory.
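
As a starting point, here is a minimal sketch of reading the quarterly files using only the Python standard library. It assumes the files match `*.csv.gz`, are UTF-8 encoded, and have their column headers on the first line; none of that is guaranteed by this README, so check a file by hand first. The function name `load_loans` and the added `source_file` key are illustrative, not part of the repository.

```python
import csv
import gzip
from pathlib import Path


def load_loans(data_dir="data"):
    """Read every gzipped CSV in data_dir into a list of row dicts.

    Assumes a *.csv.gz naming pattern and a header on the first line.
    """
    rows = []
    # Sort so the quarterly files are read in a stable order.
    for path in sorted(Path(data_dir).glob("*.csv.gz")):
        # "rt" opens the gzip stream in text mode; newline="" is what csv expects.
        with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Illustrative extra key: remember which quarter a row came from.
                row["source_file"] = path.name
                rows.append(row)
    return rows
```

Calling `load_loans("data")` returns one dict per loan, with every value still a string, so types such as dates and interest rates need to be parsed afterwards.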

Model Usage: Your goal is to inform investors about the best loans to invest in. Concretely: I go to Lending Club ready to invest $100. Their site lists loans that have not yet been funded, and I want to know which of them are the best to invest in. Keep that goal in mind as you build your feature set and final solution.
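
One way to read the usage scenario above: because the loans being scored have not been funded yet, a model can only rely on information that exists when a loan is listed. The sketch below separates a loan record into listing-time features and an outcome. All column names in it are hypothetical placeholders, not taken from this repository; confirm the real names, and which fields are only known after funding, against the data dictionary in docs/.

```python
# Hypothetical column names: confirm the real ones in the data dictionary under docs/.
OUTCOME_COLUMN = "loan_status"
# Example fields assumed to be known only after a loan has been funded.
POST_FUNDING_COLUMNS = {"loan_status", "total_pymnt", "last_pymnt_d"}


def listing_time_view(row):
    """Split one loan record into (features visible at listing time, outcome)."""
    features = {
        name: value
        for name, value in row.items()
        if name not in POST_FUNDING_COLUMNS
    }
    # The outcome is kept separately so it can serve as a training label.
    return features, row.get(OUTCOME_COLUMN)
```

Keeping the outcome out of the feature dict is one simple guard against training on information an investor would not have at decision time; other approaches are equally valid.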

Have fun!