License: GNU General Public License v3.0 (GPL-3.0)

Best Practice in Reproducible Data Science

Slide deck as PDF.

Links to Resources

  • FAIR Principles: a set of community-developed guidelines to ensure that data, or any other digital object, is Findable, Accessible, Interoperable and Reusable. The principles place particular emphasis on enabling machines to find and use digital objects automatically, while also supporting their reuse by individuals. Standards for description, interoperability, citation and so on are at the core of these principles.
  • Software Sustainability Institute - excellent resources and blog posts centred around open science and software sustainability.
  • Carpentries - free training resources in version control, command-line scripting and other topics relating to reproducible research.
  • The Turing Way - a handbook for reproducible data science.

Exercise Sheet and Dataset

The Quiz code contains several threats to reproducibility. Quiz (PDF) - Quiz (Source)

Urban Observatory PER_EMOTE_2204

Tutorial

Reproducible data science techniques in actuarial work, written with Philip Darke, is an end-to-end tutorial that goes from raw data to a reproducible R Markdown report using ProjectTemplate.

Questions, comments and suggestions

I would love to hear from you if you have any questions or comments. Please do not hesitate to contact me via email at matthew.forshaw@ncl.ac.uk or on Twitter.