What is here

This GitHub repo is intended to demonstrate the value of:

  • Jupyter notebooks,
  • collaborative, cloud-based version control repositories such as GitHub,
  • browser-based virtual environments such as Binder, and
  • Digital Object Identifiers (DOIs) created through services such as GitHub and Zenodo

as useful tools when making code, data and other research materials easy for others to understand, inspect, cite and develop.

Why I created it

This repo and the associated notebooks, materials and more were created to support a presentation at the UKCOTS conference held in Manchester, UK on 13-14 June, 2024.

That presentation was titled "Reproducible statistics: jupyter notebooks, github and DOI" and the abstract was:

In school, we are all told to show our work. But as researchers and academics, how often do we really show our work? Well, in the modern age, we can not only show our work but also make it VERY easy for others to understand it, inspect it, cite it, and further develop it! This workshop demonstrates how jupyter notebooks, github (or other version control repositories) and DOI can be used to make your statistics and data work fully reproducible and citeable!

However, I expect that this repo and its materials will also be useful more generally as small, quick-to-load demonstrations of the component parts. For example, it is quite useful to show people who work in teaching how easy it can be to make virtual environments, while those who work collaboratively may want an easy way to see how version control logs show edit history.

What you can do with it

You can clone this Git repository to create a local copy on your computer. Then you can launch Jupyter notebooks to inspect the data, inspect and/or run the code, and follow the entire sequence of steps I took. You can also edit the data and/or code as you like, with your changes kept locally, and you can try to share any changes you make back to the original. If you are invited as a contributor to this repo, you can decide which changes to keep or discard. As an interested member of the public, you can suggest changes, and one or more of the invited contributors will decide whether to keep or discard them.
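The clone-then-launch workflow described above can be sketched in Python as follows. This is only an illustration: the repo address is a placeholder to be replaced with the real one, the folder name `repro-demo` and the helper `clone_and_open` are hypothetical, and it assumes that both `git` and the `jupyter notebook` command are installed locally. By default it only prints the commands rather than running them.

```python
import shlex
import subprocess

# Placeholder (assumption): substitute the real address of this repo.
REPO_URL = "https://github.com/<user>/<repo>.git"


def clone_and_open(repo_url, dest="repro-demo", dry_run=True):
    """Return the commands that clone a repo and start Jupyter inside it.

    With dry_run=True (the default) the commands are only printed, so
    nothing is downloaded or launched.
    """
    steps = [
        # Step 1: create a local copy of the repository in `dest`.
        (["git", "clone", repo_url, dest], None),
        # Step 2: start the notebook server from inside that copy.
        (["jupyter", "notebook"], dest),
    ]
    for command, workdir in steps:
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, cwd=workdir, check=True)
    return [command for command, _ in steps]


commands = clone_and_open(REPO_URL)
```

Calling `clone_and_open(REPO_URL, dry_run=False)` with a real address would actually run both steps. Any edits you then make stay in your local copy until you choose to share them.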

You can cite this repo via the DOI I created.

You can share the repo or the DOI with others so that they can see how it works too.

Since I built a virtual environment for this repo in Binder, all you have to do is push the Binder button! It should launch a virtual environment in a new tab in your browser. This will allow you to open the Jupyter notebook (the file that ends in .ipynb) and run the code cells inside it without needing to install Python or Jupyter Notebook! How exciting is that?!?
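If you want a similar launch button for your own GitHub repo, the markdown behind a Binder badge follows a simple pattern. The sketch below builds one under stated assumptions: `example-user` and `example-repo` are made-up names, the helper `binder_badge` is hypothetical, and the URL layout is the one mybinder.org uses for GitHub repos at the time of writing.

```python
# Badge image served by mybinder.org.
BADGE_IMAGE = "https://mybinder.org/badge_logo.svg"


def binder_badge(user, repo, ref="HEAD"):
    """Build the markdown for a Binder launch badge for a GitHub repo.

    `ref` is a branch, tag, or commit; "HEAD" points at the default branch.
    """
    launch_url = f"https://mybinder.org/v2/gh/{user}/{repo}/{ref}"
    # A markdown image wrapped in a link: clicking the badge opens the launch URL.
    return f"[![Binder]({BADGE_IMAGE})]({launch_url})"


# Made-up names for illustration only.
badge = binder_badge("example-user", "example-repo")
print(badge)
```

Pasting the printed line into a README gives readers a one-click way to open the repo in a browser-based environment.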

The data

The data is the Super Heroes Dataset from Kaggle and is used under the CC0: Public Domain license.