Evidence string validator.
This tool is intended to validate JSON files that have a single JSON object per line. This is the format that is required from the data sources that provide us with evidence for our target-disease associations.
The validator checks each line against the expected structure, defined in a JSON schema which must be provided via the --schema argument.
Be aware that this is not a general-purpose JSON validator, and use of "pretty-printed" JSON will cause errors.
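To illustrate why pretty-printed JSON causes errors, here is a minimal sketch of a line-by-line check, using only the Python standard library. It is not the validator's own code: the function name `check_json_lines` is hypothetical, and skipping blank lines is an assumption made for this example. It checks only that each line parses as one JSON object, not that it matches the schema.

```python
import json


def check_json_lines(lines):
    """Return (line_number, message) pairs for lines that are not one JSON object.

    Hypothetical helper for illustration; not part of opentargets-validator.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # assumption: blank lines are ignored in this sketch
        try:
            parsed = json.loads(text)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            problems.append((number, "not valid JSON: {}".format(exc)))
            continue
        if not isinstance(parsed, dict):
            problems.append((number, "not a JSON object"))
    return problems


# One object per line: every line parses on its own.
compact = ['{"id": 1, "type": "a"}', '{"id": 2, "type": "b"}']

# The same kind of object pretty-printed: no single line is a complete object.
pretty = ['{', '  "id": 1,', '  "type": "a"', '}']

print(check_json_lines(compact))
for number, message in check_json_lines(pretty):
    print(number, message)
```

Because each line is parsed independently, an object spread over several lines fails on every one of them, which is the behaviour to expect from any per-line reader.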
The Open Targets JSON schema is located at https://github.com/opentargets/json_schema. Note that you should not use master, as it may change at any time; instead, use the latest available tag, e.g. 1.6.3. If you are a data provider, you will always receive an email from Open Targets with information about what JSON schema version to use. Also, when specifying the schema to the validator you have to use the "raw" GitHub URL:
https://raw.githubusercontent.com/opentargets/json_schema/1.6.3/opentargets.json
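If you build the schema URL in a script, a small helper along these lines can keep the tag in one place. This is only a sketch: `schema_url` is a hypothetical name, and the URL pattern is simply the one shown above with the tag substituted. Rejecting `master` mirrors the advice above rather than any check the validator itself is known to perform.

```python
RAW_URL_TEMPLATE = (
    "https://raw.githubusercontent.com/opentargets/json_schema/"
    "{tag}/opentargets.json"
)


def schema_url(tag):
    """Build the raw GitHub URL for a tagged schema version (hypothetical helper)."""
    if tag == "master":
        # master may change at any time, so insist on a fixed tag
        raise ValueError("use a tagged schema version, not master")
    return RAW_URL_TEMPLATE.format(tag=tag)


print(schema_url("1.6.3"))
```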
The easiest way to install the validator is with pip:
pip install -U opentargets-validator
It supports both Python 2 and Python 3.
You have two options for supplying the data to validate:
- pass a filename or URL as a positional argument
- read from stdin (e.g. a shell pipe)
cat file.json | opentargets_validator --schema https://raw.githubusercontent.com/opentargets/json_schema/{tag_version}/opentargets.json
The validator can automatically decompress gzip'ed files. Compression is detected via the filename, e.g. one ending with .json.gz.
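Filename-based detection of this kind can be sketched as follows for local files. This is an illustration under stated assumptions, not the validator's implementation: `open_evidence` is a hypothetical name, the check is a plain `.gz` suffix test, and the text-mode `gzip.open` call assumes Python 3.

```python
import gzip
import io


def open_evidence(path):
    """Open a local evidence file as text (hypothetical helper).

    Files whose names end in .gz are decompressed on the fly;
    anything else is opened as plain UTF-8 text.
    """
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return io.open(path, "r", encoding="utf-8")
```

Since only the name is inspected, a gzip'ed file without the `.gz` suffix would be read as plain text in this sketch, so it is worth keeping the extension accurate.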
Examples of acceptable paths are:
- https://file/location/name.json
- https://file/location/name.json.gz
- file://relative/local/file.json
- file:///absolute/file.json
- location/file.json
opentargets_validator --schema https://raw.githubusercontent.com/opentargets/json_schema/{tag_version}/opentargets.json https://where/myfile/is/located.json
There used to be a --log-lines argument that could be used to exit early when a certain number of errors occurred. This is no longer supported, and with parallelization improvements it is rarely necessary in practice.
Evidence lines are checked for uniqueness by calculating the hash of the unique_association_fields field. This can be done in the validator using the --hash argument.
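The idea of hash-based uniqueness checking can be sketched as below. The hashing scheme the validator actually uses is not described here, so everything specific in this example is an assumption: the SHA-256 digest, the sorted-key JSON serialisation, and the hypothetical names `association_hash` and `find_duplicates`. The sample field names are invented for illustration.

```python
import hashlib
import json


def association_hash(evidence):
    """Hash the unique_association_fields of one evidence object.

    Assumption: fields are serialised with sorted keys so that key order
    does not change the digest. The real validator may differ.
    """
    fields = evidence["unique_association_fields"]
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicates(lines):
    """Return (first_line, duplicate_line) pairs for repeated hashes."""
    seen = {}
    duplicates = []
    for number, line in enumerate(lines, start=1):
        digest = association_hash(json.loads(line))
        if digest in seen:
            duplicates.append((seen[digest], number))
        else:
            seen[digest] = number
    return duplicates


sample = [
    '{"unique_association_fields": {"target": "T1", "disease": "D1"}}',
    '{"unique_association_fields": {"target": "T2", "disease": "D1"}}',
    '{"unique_association_fields": {"disease": "D1", "target": "T1"}}',
]
print(find_duplicates(sample))
```

Lines 1 and 3 of the sample carry the same fields in a different key order, so they hash to the same value and are reported as a duplicate pair.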
Within a virtualenv you can install with:
pip install -e .[dev]
and you can run the tests with:
pytest --cov=opentargets_validator --cov-report term tests/ --fulltrace
This repository has Travis integration and CodeCov integration.
Releases are put on PyPI automatically via Travis from GitHub tags.