data_pipeline_api

⚠️ The code and data in this repository are currently for testing and development purposes only.

Summary

Python API to access files from the SCRC data pipeline.

Features

  • Loads files into memory for use in models.
  • Ensures that data used in models can be traced to its source.
  • Records model outputs such that the versions of code and data are recorded.
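
The traceability idea in the list above can be illustrated with a minimal sketch that records a content hash for an input file. This is not this package's API: `describe_input` and the record fields are hypothetical names chosen for illustration, and the sketch uses only the Python standard library.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def describe_input(path):
    """Return a small provenance record for a file (hypothetical helper)."""
    data = Path(path).read_bytes()
    return {
        "filename": Path(path).name,
        # The hash changes whenever the file contents change, so it
        # identifies exactly which version of the data was read.
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
    }


# Usage: hash a throwaway file and print its record as JSON.
with tempfile.TemporaryDirectory() as tmp:
    example = Path(tmp) / "example.csv"
    example.write_bytes(b"a,b\n1,2\n")
    record = describe_input(example)
    print(json.dumps(record, indent=2))
```

Storing a record like this next to a model's outputs is one simple way to tie a result back to the exact bytes it was computed from.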

Contributing

See contributing.

Installing

From Repository

This package can be installed from the repository with pip, which uses the project's setup.py:

git clone https://github.com/ScottishCovidResponse/data_pipeline_api.git
cd data_pipeline_api
pip install -e .

or

pip install git+https://github.com/ScottishCovidResponse/data_pipeline_api.git

Pip

You can install this package via pip with

pip install data-pipeline-api

To list the tagged versions in the repository, sorted by version number:

git clone https://github.com/ScottishCovidResponse/data_pipeline_api.git
cd data_pipeline_api
git tag --sort=v:refname
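
The `--sort=v:refname` option in the `git tag` command above orders tags as version numbers rather than as plain strings. The sketch below shows the difference in Python. It assumes tags are dot-separated integers with an optional leading "v"; `version_key` is a hypothetical helper, and git's own version sort handles more cases than this.

```python
def version_key(tag):
    """Split a tag such as 'v1.10.0' into integers for numeric comparison."""
    # Assumption: tags are dot-separated integers with an optional leading "v".
    return tuple(int(part) for part in tag.lstrip("v").split("."))


tags = ["0.10.0", "0.2.0", "1.0.0", "0.9.1"]

string_order = sorted(tags)                    # compares character by character
version_order = sorted(tags, key=version_key)  # compares numeric components

print(string_order)
print(version_order)
```

In string order "0.10.0" sorts before "0.2.0" because "1" precedes "2"; comparing the components as integers puts 0.10.0 after 0.9.1, as expected for releases.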

Data Registry Interactions

See registry README.

Releasing a new version

New versions can be released from the master branch at any time. Whenever a git tag that points to a commit in master is pushed, that version will be automatically released.

This is an example of how to release the tip of master:

git clone git@github.com:ScottishCovidResponse/data_pipeline_api.git
cd data_pipeline_api
# we are now seeing the HEAD of master
git tag -m"short description of this version" 1.0.0
git push --tags

This releases version 1.0.0 of the data pipeline API.

Reproducible Builds

ToDo

Tests

After activating your conda environment, execute the following command:

pytest --cov=data_pipeline_api tests

Usage

ToDo

Static analysis

Automated static analysis results are available, but they should be interpreted with caution: the importance of each issue must be assessed individually. The analysis uses pylint with a configuration file that keeps the defaults except that C0103 (variable names) and C0301 (line lengths) are ignored. We do not currently use the overall "quality standards" features of Codacy, as they are fairly arbitrary.
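
To reproduce a similar check locally, the sketch below builds a pylint command line with the same two messages disabled. It is an approximation under stated assumptions: it passes pylint's `--disable` option on the command line instead of using the project's configuration file, and the target name `data_pipeline_api` is assumed to be the package directory.

```python
import sys

# Message IDs the text above says are ignored.
DISABLED_MESSAGES = ["C0103", "C0301"]


def pylint_command(target="data_pipeline_api"):
    """Build an argv list that runs pylint with the two checks disabled."""
    return [
        sys.executable, "-m", "pylint",
        "--disable=" + ",".join(DISABLED_MESSAGES),
        target,  # assumed package directory
    ]


command = pylint_command()
print(" ".join(command))

# To actually run it (requires pylint to be installed):
#   import subprocess
#   subprocess.run(command, check=False)
```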

License

BSD 3-Clause License.