NeuroDataPub is an open-source neuroimaging dataset publication tool built on top of Datalad. It helps NCCR-SYNAPSY members manage their datasets with Datalad and reference them in the NCCR-SYNAPSY GitHub organization.

NeuroDataPub: NCCR-SYNAPSY Neuroimaging Dataset Publishing Tool

This tool is developed by the Connectomics Lab at the University Hospital of Lausanne (CHUV) for use within the lab and within the National Centre of Competence in Research (NCCR) "SYNAPSY – Synaptic Bases of Mental Diseases" (NCCR-SYNAPSY), as well as for open-source software distribution.

Overview

NeuroDataPub is an open-source neuroimaging dataset publishing tool written in Python and built on top of Datalad and git-annex. It aims to lower the barrier for NCCR-SYNAPSY members to manage and publish, privately or publicly, their dataset repositories on GitHub and the annexed files on their SSH data server, so that they comply with the implemented Neuroimaging Data Management Plan.

Since v0.3, you can host your annexed files on either (1) a server accessible via SSH or (2) the Open Science Framework (OSF) platform, used as a git-annex special remote.

Since v0.4, NeuroDataPub can handle datasets whether or not they follow the Brain Imaging Data Structure (BIDS) standard.

NeuroDataPub comes with a graphical user interface, the NeuroDataPub Assistant, created to facilitate:

  • the configuration of the siblings,

  • the creation of the JSON configuration files,

  • the execution of NeuroDataPub in three different modes:

    1. creation and publication of a Datalad dataset,
    2. creation of a Datalad dataset only,
    3. publication of an existing Datalad dataset only,

  • the creation of a Linux shell script that records all commands for later execution.

NeuroDataPub is a Python 3.8 package that can be easily installed with pip as follows:

pip install neurodatapub
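
After installation, one way to confirm that the package is visible to the current interpreter is to query its installed version. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical and not part of NeuroDataPub.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("neurodatapub")
    print(version if version else "neurodatapub is not installed")
```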

Documentation

Usage

NeuroDataPub has the following command-line arguments:

usage: neurodatapub [-h] --mode {all,create-only,publish-only}
                    --dataset_dir DATASET_DIR [--is_not_bids]
                    --datalad_dir DATALAD_DIR
                    --github_sibling_config GITHUB_SIBLING_CONFIG
                    (--git_annex_ssh_special_sibling_config GIT_ANNEX_SSH_SPECIAL_SIBLING_CONFIG | --osf_sibling_config OSF_SIBLING_CONFIG)
                    [--gui] [--generate_script] [-v]

Command-line argument parser of `NeuroDataPub` (v0.4)

optional arguments:
  -h, --help            show this help message and exit.
  --mode {all,create-only,publish-only}
                        Mode in which ``neurodatapub`` is run: ``"create-only"`` create
                        the datalad dataset only, ``"publish-only"`` publish the datalad
                        dataset only, ``"all"`` create and publish the datalad dataset.
  --dataset_dir DATASET_DIR
                        The directory with the input dataset formatted according
                        to the BIDS standard.
  --is_not_bids         Specify if the directory with the input dataset is not formatted
                        according to the BIDS standard.
  --datalad_dir DATALAD_DIR
                        The local directory where the Datalad dataset should be.
  --github_sibling_config GITHUB_SIBLING_CONFIG
                        Path to a JSON file containing configuration parameters for
                        the GitHub dataset repository sibling.
  --git_annex_ssh_special_sibling_config GIT_ANNEX_SSH_SPECIAL_SIBLING_CONFIG
                        Path to a JSON file containing configuration parameters for
                        the git-annex SSH special remote dataset sibling.
  --osf_sibling_config OSF_SIBLING_CONFIG
                        Path to a JSON file containing configuration parameters for
                        the git-annex OSF special remote dataset sibling.
  --gui                 Run NeuroDataPub in GUI mode.
  --generate_script     Dry run that generates a bash script called
                        `neurodatapub_DD-MM-YYYY_hh:mm:ss.sh` in the `code/` folder
                        of the input dataset that records all commands for later execution.
  -v, --version         show program's version number and exit.
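
As an illustration of how the arguments above fit together, the following Python sketch assembles a `neurodatapub` command line and prints it without running it. It relies only on the flags listed in the help text. The helper `build_command` and every path are hypothetical placeholders, and the contents of the JSON configuration files are not shown because their schema is not described here.

```python
import shlex
from typing import List, Optional

VALID_MODES = ("all", "create-only", "publish-only")


def build_command(
    mode: str,
    dataset_dir: str,
    datalad_dir: str,
    github_sibling_config: str,
    ssh_sibling_config: Optional[str] = None,
    osf_sibling_config: Optional[str] = None,
    is_not_bids: bool = False,
    generate_script: bool = False,
) -> List[str]:
    """Assemble the argument list for a `neurodatapub` call (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # The usage line shows the two special-remote options as mutually
    # exclusive and requires one of them.
    if (ssh_sibling_config is None) == (osf_sibling_config is None):
        raise ValueError(
            "give exactly one of ssh_sibling_config or osf_sibling_config"
        )

    cmd = [
        "neurodatapub",
        "--mode", mode,
        "--dataset_dir", dataset_dir,
        "--datalad_dir", datalad_dir,
        "--github_sibling_config", github_sibling_config,
    ]
    if ssh_sibling_config is not None:
        cmd += ["--git_annex_ssh_special_sibling_config", ssh_sibling_config]
    else:
        cmd += ["--osf_sibling_config", osf_sibling_config]
    if is_not_bids:
        cmd.append("--is_not_bids")
    if generate_script:
        cmd.append("--generate_script")
    return cmd


# Placeholder paths, chosen for illustration only.
example = build_command(
    mode="all",
    dataset_dir="/path/to/input/dataset",
    datalad_dir="/path/to/datalad/dataset",
    github_sibling_config="/path/to/github_config.json",
    osf_sibling_config="/path/to/osf_config.json",
    generate_script=True,
)
print(shlex.join(example))
```

Once the paths point to real files, the resulting list could be passed to `subprocess.run(example, check=True)`. With `--generate_script`, the help text describes a dry run that records the commands in a script instead of executing them.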

Acknowledgment

If you are using NeuroDataPub in your work, please acknowledge this software and its dependencies:

  • Tourbier S., Hagmann P. (2021). NCCR-SYNAPSY/neurodatapub: NCCR-SYNAPSY Neuroimaging Dataset Publishing Tool (Version 0.1). Zenodo.

  • Halchenko et al. (2021). DataLad: distributed system for joint management of code, data, and their relationship. Journal of Open Source Software, 6(63), 3262, https://doi.org/10.21105/joss.03262.

License information

This software is distributed under the open-source Apache 2.0 license. See license for more details.

All trademarks referenced herein are property of their respective holders.

Help/Questions

If you run into any problems, find a bug, or have questions, please create a new GitHub issue.

Funding

Supported by the National Centre of Competence in Research (NCCR) "SYNAPSY – Synaptic Bases of Mental Diseases" (NCCR-SYNAPSY), under Swiss National Science Foundation grant SNF-185897.

Contributors ✨

Thanks go to these wonderful people (emoji key):

  • Sébastien Tourbier: 💻 📖 🎨 🤔 🚇 🚧 🧑‍🏫 📆 💬 👀

  • Patric Hagmann: 🔍

This project follows the all-contributors specification. Contributions of any kind welcome!