CIDC Metadata Schemas

Primary language: Python. License: MIT.

cidc-schemas


This repository contains formal definitions of the CIDC metadata model, written using JSON Schema syntax and vocabulary.

Installation

To install the latest released version, run:

pip install cidc-schemas

Development

Project Structure

  • cidc_schemas/ - a Python module for generating, validating, and reading manifest and assay templates.
    • schemas/ - JSON specifications defining the CIDC metadata model.
      • templates/ - schemas for generating and validating manifest and assay templates.
      • assays/ - schemas for defining assay data models.
      • artifacts/ - schemas for defining artifacts.
  • docs/ - the most recent build of the data model documentation, along with templates and scripts for re-generating the documentation.
  • template_examples/ - example populated Excel files for the template specifications in schemas/templates, along with .csv files auto-generated from those .xlsx files so that changes to them can be tracked transparently.
  • tests/ - tests for the cidc_schemas module.
  • .githooks/ - git hooks, e.g. for auto-generating the .csv files in template_examples/ and the .html documentation files.

Developer Setup

Install necessary dependencies.

pip install -r requirements.dev.txt

Install and configure pre-commit hooks.

pre-commit install

JIRA Integration

To set up the git hook for JIRA integration, run:

ln -s ../../.githooks/commit-msg .git/hooks/commit-msg
chmod +x .git/hooks/commit-msg
rm .git/hooks/commit-msg.sample

This symbolic link is necessary to correctly link files in .githooks to .git/hooks. Note that setting the core.hooksPath configuration variable instead would cause pre-commit to fail. The commit-msg hook runs after the pre-commit hook, so the two are decoupled in this workflow.
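
For setup scripts, the same three steps can be sketched in Python. This is only an illustration, not part of the repository: the function name install_commit_msg_hook and its repo_root parameter are hypothetical. It shows why the link target starts with ../../: a relative symlink target is resolved from the directory that holds the link (.git/hooks), not from the current working directory.

```python
import os
import stat
from pathlib import Path


def install_commit_msg_hook(repo_root: Path) -> Path:
    """Link .git/hooks/commit-msg to the tracked .githooks/commit-msg script.

    Hypothetical helper mirroring the ln / chmod / rm commands above.
    """
    hook_source = repo_root / ".githooks" / "commit-msg"
    hooks_dir = repo_root / ".git" / "hooks"
    hook_link = hooks_dir / "commit-msg"

    if not hook_source.is_file():
        raise FileNotFoundError(f"missing hook script: {hook_source}")

    # Replace an existing hook or stale link so the function can be re-run.
    if hook_link.is_symlink() or hook_link.exists():
        hook_link.unlink()

    # The target is relative to .git/hooks, so ../../ climbs to the repo root.
    os.symlink(os.path.join("..", "..", ".githooks", "commit-msg"), hook_link)

    # Equivalent of `chmod +x`: add the execute bits to the script itself.
    mode = hook_source.stat().st_mode
    hook_source.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

    # Equivalent of `rm .git/hooks/commit-msg.sample`, tolerating its absence.
    sample = hooks_dir / "commit-msg.sample"
    if sample.exists():
        sample.unlink()

    return hook_link
```

Calling install_commit_msg_hook(Path(".")) from the repository root would reproduce the shell commands above.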

To associate a commit with an issue, reference the JIRA issue key (e.g. 'CIDC-1111') in the commit message.
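
A quick way to check a message before committing is to search it for an issue key. The sketch below is an assumption-laden illustration, not the logic of the repository's commit-msg hook: it assumes keys always use the 'CIDC' prefix followed by a hyphen and digits, as in the example above, and the names ISSUE_KEY_PATTERN and find_issue_keys are hypothetical.

```python
import re
from typing import List

# Assumed key format: the prefix "CIDC", a hyphen, then one or more digits.
ISSUE_KEY_PATTERN = re.compile(r"\bCIDC-\d+\b")


def find_issue_keys(commit_message: str) -> List[str]:
    """Return every JIRA issue key found in the commit message, in order."""
    return ISSUE_KEY_PATTERN.findall(commit_message)


def references_issue(commit_message: str) -> bool:
    """True if the message mentions at least one issue key."""
    return bool(find_issue_keys(commit_message))
```

For instance, references_issue("CIDC-1111 fix template validation") is true, while a message with no key, such as "fix typo", is not.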

Running tests

This repository has unit tests in the tests folder. After installing dependencies, run them with:

pytest tests

Building documentation

Pre-commit hooks keep the documentation up to date automatically. To build the documentation manually, run the following commands:

python setup.py install # install helpers from the cidc_schemas library
python docs/generate_docs.py

This writes the generated HTML documents to docs/docs. Once the updated docs are pushed and merged into master, they will be viewable at https://cimac-cidc.github.io/cidc-schemas/.

Using the Command-Line Interface

This project comes with a command-line interface for validating schemas and generating/validating assay and manifest templates.

Install the CLI

Clone the repository and cd into it:

git clone git@github.com:CIMAC-CIDC/cidc-schemas.git
cd cidc-schemas

Install the cidc_schemas package (this adds the cidc_schemas CLI to your console):

python setup.py install

Run cidc_schemas --help to see available options.

If you're making changes to the module and want those changes reflected in the CLI without reinstalling the cidc_schemas module every time, run:

python3 -m cidc_schemas.cli [args]
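
When scripting against a working copy, the same module invocation can be wrapped in Python. This is a minimal sketch: the helper names dev_cli_command and run_dev_cli are hypothetical, and it assumes the script runs from the repository root so that the cidc_schemas package is importable.

```python
import subprocess
import sys
from typing import List


def dev_cli_command(*cli_args: str) -> List[str]:
    """Build the argv for running the CLI module with the current interpreter."""
    return [sys.executable, "-m", "cidc_schemas.cli", *cli_args]


def run_dev_cli(*cli_args: str) -> int:
    """Run the CLI from source and return its exit code."""
    return subprocess.run(dev_cli_command(*cli_args)).returncode
```

For example, run_dev_cli("--help") would invoke the uninstalled CLI's help. Using sys.executable rather than a hard-coded python3 keeps the call inside whichever virtual environment is active.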

Generate templates

Create a template for a given template configuration.

cidc_schemas generate_template -m templates/manifests/pbmc_template.json -o pbmc.xlsx

Validate filled-out templates

Check that a populated template file is valid with respect to a template specification.

cidc_schemas validate_template -m templates/manifests/pbmc_template.json -x template_examples/pbmc_template.xlsx

Validate JSON schemas

Check that a JSON schema conforms to the JSON Schema specification.

cidc_schemas validate_schema -f shipping_core.json