This repository explores COVID-19 data obtained from the Centers for Disease Control and Prevention (CDC). The CDC collected de-identified patient data, including COVID-19 severity indicators, outcomes, clinical data, and laboratory test results. The dataset was cleaned and the following tasks were completed:
- Data quality report of the original dataset.
- Data quality plan for the cleaned dataset.
- Exploration of relationships between feature pairs.
- Transformation of existing features into new features that better capture the problem domain and the target outcome.
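
As a rough illustration of the tasks listed above, the sketch below computes a small data quality report, counts value pairs for two features, and derives one new feature. It uses only the Python standard library. The column names (`age_group`, `hosp_yn`, `death_yn`), the sample rows, the set of values treated as missing, and the derived `elderly_hospitalised` feature are all assumptions made for this example; they are not taken from the repository's code or the CDC dataset.

```python
from collections import Counter

# Hypothetical rows: column names and values are illustrative only.
rows = [
    {"age_group": "0-9", "hosp_yn": "No", "death_yn": "No"},
    {"age_group": "80+", "hosp_yn": "Yes", "death_yn": "Yes"},
    {"age_group": "80+", "hosp_yn": "Missing", "death_yn": "No"},
    {"age_group": None, "hosp_yn": "Yes", "death_yn": "No"},
]

# Assumption: these placeholder values all count as missing.
MISSING = {None, "", "Missing", "Unknown"}


def quality_report(rows):
    """Per-column count, missing percentage, cardinality and mode."""
    report = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        values = [r.get(col) for r in rows]
        present = [v for v in values if v not in MISSING]
        mode = Counter(present).most_common(1)[0][0] if present else None
        report[col] = {
            "count": len(values),
            "missing_pct": 100.0 * (len(values) - len(present)) / len(values),
            "cardinality": len(set(present)),
            "mode": mode,
        }
    return report


def pair_counts(rows, a, b):
    """Count co-occurring values of features a and b, skipping missing ones."""
    return Counter(
        (r[a], r[b])
        for r in rows
        if r[a] not in MISSING and r[b] not in MISSING
    )


def add_elderly_hospitalised(row):
    """Derive a hypothetical binary feature from two existing features."""
    new_row = dict(row)
    new_row["elderly_hospitalised"] = (
        row.get("age_group") == "80+" and row.get("hosp_yn") == "Yes"
    )
    return new_row


report = quality_report(rows)
pairs = pair_counts(rows, "hosp_yn", "death_yn")
transformed = [add_elderly_hospitalised(r) for r in rows]
```

For a dataset of realistic size, the same steps would more commonly be written with a dataframe library; the plain-Python version is used here only to keep the example self-contained.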