alan-turing-institute/repro-catalogue

A tool to catalogue versions of data, code and results to check the reproducibility of your research project

Python · MIT

repro-catalogue

A command line tool to catalogue versions of data, code and results to support reproducibility of research projects.

Contents

Introduction
Installation
Contributing
Contributors

Introduction

Research projects are frequently updated: new data are added, and the code undergoes regular changes. Under these circumstances, it is easy to store results yet lose track of the context in which they were produced.

The catalogue tool aids reproducibility by saving hash values of the input data and the results, along with the git commit hash of the code used to generate those results. The catalogue command line interface then lets the user compare the hash values recorded on different runs of the analysis, so that changes to the input data, code and results can be identified and their impact on reproducibility understood.
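The idea described above can be illustrated with a minimal Python sketch. This is not the tool's actual implementation: the function names (`hash_directory`, `current_commit`, `make_record`, `compare_records`), the choice of SHA-256 and the record layout are all assumptions made for illustration only.

```python
import hashlib
import subprocess
from pathlib import Path


def hash_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a single file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_directory(directory):
    """Map each file's path (relative to the directory) to its digest."""
    root = Path(directory)
    return {
        path.relative_to(root).as_posix(): hash_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def current_commit(repo_dir="."):
    """Return the current git commit hash, or None if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or repo_dir is not inside a repository
        return None
    return result.stdout.strip()


def make_record(input_dir, output_dir, repo_dir="."):
    """Collect the hashes describing one run of the analysis (hypothetical layout)."""
    return {
        "input_data": hash_directory(input_dir),
        "output_data": hash_directory(output_dir),
        "code": current_commit(repo_dir),
    }


def compare_records(old, new):
    """Return the sorted names of the record entries that differ between two runs."""
    return sorted(key for key in old if old[key] != new.get(key))
```

Saving the result of `make_record` after each run, and passing two saved records to `compare_records`, shows at a glance whether the input data, the code or the results changed between those runs.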

Installation

The package is available on PyPI and requires Python 3:

pip install repro-catalogue

See https://repro-catalogue.readthedocs.io for full documentation on how to install and use the tool.

Contributing

🚧 This repository is always a work in progress and everyone is encouraged to help us build something that is useful to the many. 🚧

Everyone is asked to follow our code of conduct and to check out our contributing guidelines for more information on how to get started.

Contributors ✨

Thanks go to these wonderful people (emoji key):

Louise Bowler
📖 🤔 👀 📆 🎨 🚧 🚇 💻

Isla
🎨 🤔 💻 📖

Kirstie Whitaker
🎨 🤔 🚇

Sarah Gibson
💻 👀

kevinxufs
👀 📓 📖 💻 🎨 🤔

Eric Daub
🎨 🤔 💻 📖 👀 🚧 📆

Radka Jersakova
🎨 🤔 💻 👀 📖 🚧 📆 🚇

This project follows the all-contributors specification. Contributions of any kind are welcome!