Materials (code & data) for the ODSC East 2020 Virtual Conference workshop: 'Building Data Narratives: An End-to-End Machine Learning Practicum'
StoredDataStories
Why data narratives?
The machine learning campaign is nearing its conclusion. Data has been collected and curated. Features have been engineered. Multiple models have been built and evaluated.
Now it's time to communicate the details, conclusions, and actionable insights that follow from the campaign.
There are multiple audiences: fellow data scientists, who are as interested in the mechanics of the machine learning campaign as they are in the outcome; data engineers, who might be tasked with putting the models into production; and managers and customers, whose main interest is the outcome and how the results might be used.
The purpose of a data narrative is communication. It can take several forms: a notebook that mixes code, text, results, figures, and explanations into one seamless document; an HTML page; or a document suitable for publication.
During this workshop attendees will start with tabular dataframes, build classification and regression models, document the model building process, and prepare presentations (slides) and documents (PDF) that describe the machine learning campaign. We'll demonstrate how one body of code can be used to prepare notebooks, slides, and documents. There will be an emphasis on tools and techniques that produce well-crafted tables, figures, and plots.
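The idea of one body of code feeding several output formats can be illustrated with a minimal sketch. This is not the workshop's tooling; it only shows a single list of results being rendered as both a Markdown table (for a notebook or document) and an HTML table (for a page or slide). The `RESULTS` records, the column names, and the helpers `to_markdown` and `to_html` are hypothetical, and the scores are placeholders rather than real model output.

```python
from html import escape

# Hypothetical model results; names and scores are placeholders, not workshop output.
RESULTS = [
    {"model": "model_a", "metric": "accuracy", "score": 0.5},
    {"model": "model_b", "metric": "accuracy", "score": 0.75},
]
COLUMNS = ["model", "metric", "score"]


def to_markdown(rows, columns):
    """Render the rows as a pipe-delimited Markdown table."""
    lines = [
        "| " + " | ".join(columns) + " |",
        "| " + " | ".join("---" for _ in columns) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[c]) for c in columns) + " |")
    return "\n".join(lines)


def to_html(rows, columns):
    """Render the same rows as an HTML table, escaping every cell."""
    head = "".join(f"<th>{escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(row[c]))}</td>" for c in columns) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# Both outputs come from the same RESULTS list, so they cannot drift apart.
markdown_table = to_markdown(RESULTS, COLUMNS)
html_table = to_html(RESULTS, COLUMNS)
print(markdown_table)
print(html_table)
```

Keeping the results in one structure and the renderers separate means a change to the analysis shows up in every output format on the next run.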
Attendees are encouraged to bring data around which they would like to build narratives. The scripts used in the workshop read data in tabular form, with the following format:
| Identifier | Endpoint | VarName-01 | VarName-02 | VarName-03 | ... |
| --- | --- | --- | --- | --- | --- |
| ID-01 | edpt-01 | Var-01-01 | Var-01-02 | Var-01-03 | ... |
| ID-02 | edpt-02 | Var-02-01 | Var-02-02 | Var-02-03 | ... |
| ID-03 | edpt-03 | Var-03-01 | Var-03-02 | Var-03-03 | ... |
| ... | ... | ... | ... | ... | ... |
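A table in this layout can be split into identifiers, endpoints, and variables with a few lines of code. The sketch below is not the workshop's own script. It assumes a comma-separated file with a header row, a numeric endpoint (as in a regression problem), and numeric variables; the `SAMPLE` values and the `load_table` helper are hypothetical. For classification, the endpoint column could be kept as a string label instead of being converted to `float`.

```python
import csv
import io

# Hypothetical sample in the Identifier / Endpoint / variables layout (values are made up).
SAMPLE = """Identifier,Endpoint,VarName-01,VarName-02,VarName-03
ID-01,0.5,1.0,2.0,3.0
ID-02,1.5,4.0,5.0,6.0
ID-03,2.5,7.0,8.0,9.0
"""


def load_table(handle):
    """Split a table into identifiers, endpoints, variable names, and variable rows."""
    reader = csv.reader(handle)
    header = next(reader)
    var_names = header[2:]  # everything after Identifier and Endpoint
    identifiers, endpoints, rows = [], [], []
    for record in reader:
        if not record:  # skip blank lines
            continue
        identifiers.append(record[0])
        endpoints.append(float(record[1]))  # assumption: numeric endpoint
        rows.append([float(value) for value in record[2:]])
    return identifiers, endpoints, var_names, rows


# io.StringIO stands in for an open file; a real file handle works the same way.
identifiers, endpoints, var_names, rows = load_table(io.StringIO(SAMPLE))
print(identifiers)
print(endpoints)
print(var_names)
print(rows[0])
```

To read from disk, pass `open(path, newline="")` in place of the `io.StringIO` object, where `path` points to your own file.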
Topics (change frequently, as I circle the target)
Literate programming
mixing code, text, results, figures & explanations into one seamless document / presentation
Documents
Presentations
Minimum Valuable Templates (MVTs)
Interactivity
Customizing / branding templates
Building dashboards
Automated reports ...
... using RMarkdown
... using Jupyter Notebooks