/datalad-dataverse

A DataLad (www.datalad.org) extension to work with Dataverse

DataLad extension for working with Dataverse

Dataverse is open source research data repository software that is deployed all over the world as data or metadata repositories. It supports sharing, preserving, citing, exploring, and analyzing research data with descriptive metadata, and thus contributes greatly to open, reproducible, and FAIR science. DataLad, on the other hand, is a data management and data publication tool built on Git and git-annex. Its core data structure, the DataLad dataset, can version control files of any size and streamlines data sharing, updating, and collaboration. This DataLad extension package provides interoperability with Dataverse to support dataset transport to and from Dataverse instances.

Installation

# create and enter a new virtual environment (optional)
$ virtualenv --python=python3 ~/env/dl-dataverse
$ . ~/env/dl-dataverse/bin/activate
# install from PyPI
$ python -m pip install datalad-dataverse

How to use

Additional commands provided by this extension are immediately available after installation. However, in order to fully benefit from all improvements, the extension has to be enabled for auto-loading by executing:

git config --global --add datalad.extensions.load dataverse

Doing so also allows the extension to alter the behavior of the core DataLad package and its commands, for example to enable cloning directly from a Dataverse dataset landing page.
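
Because `git config --add` appends a new value on every call, re-running the command above leaves duplicate entries in the configuration. A minimal sketch of an idempotent variant, assuming a POSIX shell with `git` and `grep` on the PATH, registers the extension only when it is missing and then lists what is configured:

```shell
# Register the dataverse extension for auto-loading only if it is not
# already present in the global Git configuration.
if ! git config --global --get-all datalad.extensions.load | grep -qx dataverse; then
    git config --global --add datalad.extensions.load dataverse
fi

# List every extension currently configured for auto-loading.
git config --global --get-all datalad.extensions.load
```

The final command prints one configured extension per line, so `dataverse` should appear exactly once even after repeated runs.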

Summary of functionality provided by this extension

  • Interoperability between DataLad and Dataverse version 5 (or later).
  • A create-sibling-dataverse command to initialize matching Dataverse datasets for individual DataLad datasets.
  • A git-annex-remote-dataverse special remote implementation for storage and retrieval of data in Dataverse datasets via git-annex.
  • These two features combined enable the deposition and retrieval of complete DataLad datasets on Dataverse, including version history and metadata. A direct datalad clone from a Dataverse dataset landing page is supported and yields a fully functional DataLad dataset clone (Git repository).
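
git-annex finds external special remotes by searching the PATH for an executable named `git-annex-remote-<name>`, so a quick check after installation is to look for the tools involved. The sketch below is only an illustration: the helper name `check_tools` is hypothetical, and it assumes that installing this extension places `git-annex-remote-dataverse` on the PATH of the active environment.

```shell
# Report whether each tool needed for Dataverse transport is on the PATH.
# check_tools is a hypothetical helper, not part of DataLad or this extension.
check_tools() {
    for tool in datalad git-annex git-annex-remote-dataverse; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
        fi
    done
}

check_tools
```

If all three tools are reported as found, the arguments of the sibling command should be listed by `datalad create-sibling-dataverse --help`.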

Contributors ✨

Thanks go to these wonderful people (emoji key):


  • Johanna Bayer: 📖
  • Nadine Spychala: 🚇 📖
  • Benjamin Poldrack: 🚇 💻 📖 🚧 👀 🤔 🔧
  • Adina Wagner: 💻 🤔 🚇 📖 🚧 👀
  • Michael Hanke: 💻 🤔 🚧 🚇 👀 🔧
  • enicolaisen: 📖
  • Roza: 📖
  • Kelvin Sarink: 💻
  • Jan Ernsting: 💻
  • Chris Markiewicz: 💻
  • Alex Waite: 🚇 💻 🚧 🔧
  • Shammi270787: 💻
  • Wu Jianxiao: 💻 👀 📓
  • Laura Waite: 📖
  • Michał Szczepanik: 🚇

This project follows the all-contributors specification. Contributions of any kind welcome!

Acknowledgements

This DataLad extension was developed with support from the German Federal Ministry of Education and Research (BMBF 01GQ1905), the US National Science Foundation (NSF 1912266), and the Helmholtz research center Jülich (RDM challenge 2022).