Lecture Enterprise Data Science 2020

Project Outline: Applied datascience on COVID-19 data

├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io

The goal of this lecture is to transport the best practices of data science from the industry while developing a COVID-19 analysis prototype.

The final result will be a dynamic dashboard - which can be updated by one click - of COVID-19 data with filtered and calculated data sets like the current Doubling Rate of confirmed cases

Techniques used are REST Services, Python Pandas, scikit-learn, Facebook Prophet, Plotly, Dash

For this, we will follow an industry-standard process (CRISP-DM) by focusing on the iterative nature of agile development

Business understanding (what is our goal)
Data Understanding (where do we get data and cleaning of data)
Data Preparation (data transformation and visualization)
Modeling (Statistics, Machine Learning, and SIR Simulations on COVID Data)
Deployment (how to deliver results, dynamic dashboards)

Pramod07Ch/COVID_19_Data_Science

Lecture Enterprise Data Science 2020

Project Outline: Applied datascience on COVID-19 data

Result of Analysis prototype visualization

SIR dynamic model