repro-catalogue
A command line tool to catalogue versions of data, code and results to support reproducibility of research projects.
Introduction
Research projects are frequently updated: new data are added, and the code undergoes regular changes. Under these circumstances, it is easy to store results yet lose track of the context in which they were produced.
The catalogue tool aids reproducibility by saving hash values of the input data and the results, along with the git commit hash of the code used to generate those results. The catalogue command line interface then lets the user compare the hash values from different runs of the analysis, so that changes to the input data, code and results can be identified and their impact on reproducibility understood.
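The idea described above can be illustrated with a minimal Python sketch. This is not the tool's actual implementation: the function names (hash_file, hash_directory, current_commit, changed_files) and the choice of SHA-256 are assumptions made for illustration only. It hashes every file in a directory, reads the current git commit hash, and reports which files differ between two recorded runs.

```python
import hashlib
import subprocess
from pathlib import Path


def hash_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_directory(directory):
    """Map each file's path (relative to directory) to its digest."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def current_commit(repo_path="."):
    """Return the commit hash of HEAD for the git repository at repo_path."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def changed_files(before, after):
    """List paths whose digest differs or that appear in only one record."""
    return sorted(
        key for key in before.keys() | after.keys()
        if before.get(key) != after.get(key)
    )
```

Storing the output of hash_directory for the data and results, together with current_commit, gives one record per run. Passing two such records to changed_files shows which inputs or outputs changed between runs.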
Installation
The package is available on PyPI and requires Python 3:
pip install repro-catalogue
See https://repro-catalogue.readthedocs.io for full documentation on how to install and use the tool.
Contributing
🚧 This repository is always a work in progress and everyone is encouraged to help us build something that is useful to the many. 🚧
Everyone is asked to follow our code of conduct and to check out our contributing guidelines for more information on how to get started.
Contributors ✨
Thanks go to these wonderful people (emoji key):
This project follows the all-contributors specification. Contributions of any kind welcome!