Create Analysis Notebook 3
Create example code and instructions to segment single-family products using the filters below:
- single-family
- first-lien
- owner-occupied
- conventional
- home purchase
This code should be provided as a function that accepts:
- extensions to the WHERE clause (for example: geography, action type, lender)
- table name
- database name
- schema name
- host name
The WHERE clause extensions should accept either a single value or a list-like input.
This function should have an option that allows the user to write the query results to a pipe-delimited file with a .txt extension.
In the instructions inside the Jupyter notebook, discuss:
- what these filters mean and how they affect the mortgage product
- why a homogeneous product is important to analysis
- the presence of action type in the HMDA data and how that affects analysis
Produce the following outputs:
- flat file with a pipe delimiter and a .txt extension
- Pandas dataframe (shown inline)
- SQL script (located in the SQL folder)
- analysis of a subset of HMDA data showing comparisons of product types in two different states over time. The comparison should use 2004-2017 data that was written to a file and reloaded. This analysis should account for action taken type and use Pandas to generate an aggregate measure of the data.
- one or more example visualizations of the data, for example average originated loan amounts for several MSAs from 2004-2017.
The goal of this example is to demonstrate how to get a dataset of a homogeneous mortgage product, save the dataset to disk, load the data into Pandas, produce aggregate metrics, and graph them in a meaningful way.
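A rough sketch of the save, reload, and aggregate steps is shown below. The rows are made up and stand in for real query results; the column names (`as_of_year`, `state_code`, `action_taken`, `loan_amount_000s`) and the file name are assumptions, as is the use of `action_taken == 1` to select originated loans, which should be confirmed against the data dictionary.

```python
import os
import tempfile

import pandas as pd

# Made-up rows standing in for query results; column names are hypothetical.
loans = pd.DataFrame({
    "as_of_year": [2004, 2004, 2004, 2005, 2005, 2005],
    "state_code": ["06", "06", "48", "06", "48", "48"],
    "action_taken": [1, 3, 1, 1, 1, 1],
    "loan_amount_000s": [300, 250, 120, 320, 130, 150],
})

# Write a pipe-delimited flat file with a .txt extension.
path = os.path.join(tempfile.mkdtemp(), "hmda_subset.txt")
loans.to_csv(path, sep="|", index=False)

# Reload it; read state codes as strings so leading zeros survive.
reloaded = pd.read_csv(path, sep="|", dtype={"state_code": str})

# Account for action type: keep only rows assumed to be originations.
originated = reloaded[reloaded["action_taken"] == 1]

# Aggregate measure: average loan amount per year, one column per state.
avg_amount = (
    originated.groupby(["as_of_year", "state_code"])["loan_amount_000s"]
    .mean()
    .unstack("state_code")
)
print(avg_amount)
```

With matplotlib installed, `avg_amount.plot()` would draw one line per state over time, which is one possible form for the requested visualization.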
This can probably be combined with the issue for creating a function library.