Generate a summary of metadata annotations in the BioModels database

BioModels Metadata Analysis

BioModels is a CC0-licensed repository of mathematical models of biological and biomedical systems. It contains manually curated, non-curated, and autogenerated models of varying quality. Most models contain some model-level metadata.

This repo automatically downloads, parses, and summarizes the metadata across all applicable models in BioModels. Currently, it generates three files:

  1. tag_summary.tsv - summarizes which prefixes are used in model-level metadata
  2. tag_prefix_summary.tsv - summarizes which prefixes are used in model-level metadata, and in combination with which target prefixes
  3. triples.tsv - a dump of all metadata statements as triples, normalized with the Bioregistry
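
As a rough illustration of how a triples dump can be rolled up into a predicate/prefix summary, here is a minimal sketch. It assumes a headerless, three-column, tab-separated layout (subject, predicate, object) with the object written as a CURIE; the actual column layout of triples.tsv may differ. The function name and the sample rows are hypothetical and are not taken from the repository's data.

```python
import csv
import io
from collections import Counter


def count_predicate_prefix_pairs(tsv_text):
    """Count (predicate, object prefix) pairs in tab-separated triples.

    Assumption: each row has exactly three columns (subject, predicate,
    object) and the object is a CURIE such as "go:0006915".
    """
    counts = Counter()
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        _subject, predicate, obj = row
        prefix, sep, _identifier = obj.partition(":")
        if not sep:
            continue  # the object is not a CURIE, so it has no prefix
        counts[(predicate, prefix)] += 1
    return counts


# Made-up rows, for illustration only.
example = (
    "MODEL_A\tbqbiol:isVersionOf\tgo:0006915\n"
    "MODEL_A\tbqmodel:isDescribedBy\tpubmed:12345\n"
    "MODEL_B\tbqbiol:isVersionOf\tgo:0008150\n"
)

for (predicate, prefix), n in sorted(count_predicate_prefix_pairs(example).items()):
    print(predicate, prefix, n, sep="\t")
```

To try this on the real file, pass the file's text to the function, adjusting the column handling if the file has a header row or a different column order.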

Editorial: While there might be subtle differences between the predicates in the http://biomodels.net/biology-qualifiers/ and http://biomodels.net/model-qualifiers/ namespaces, a lot of information appears to be duplicated between them, and no standardized schema appears to be applied across BioModels.
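
One way to look for this kind of overlap is to split each predicate URI into its namespace and local name, then list the local names that occur under both namespaces. The sketch below does that on a hand-written list of URIs; the helper names are hypothetical, and a shared local name is only a hint of duplication, not proof that the two predicates mean the same thing.

```python
BQBIOL = "http://biomodels.net/biology-qualifiers/"
BQMODEL = "http://biomodels.net/model-qualifiers/"


def split_qualifier(uri):
    """Return (namespace, local name), or (None, uri) for other URIs."""
    for namespace in (BQBIOL, BQMODEL):
        if uri.startswith(namespace):
            return namespace, uri[len(namespace):]
    return None, uri


def shared_local_names(uris):
    """List local names that appear under both qualifier namespaces."""
    seen = {}
    for uri in uris:
        namespace, local = split_qualifier(uri)
        if namespace is not None:
            seen.setdefault(local, set()).add(namespace)
    return sorted(name for name, spaces in seen.items() if len(spaces) > 1)


# Hand-written input, for illustration only.
predicates = [
    BQBIOL + "is",
    BQMODEL + "is",
    BQBIOL + "hasPart",
    BQMODEL + "isDescribedBy",
    BQBIOL + "isDescribedBy",
]
print(shared_local_names(predicates))
```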

Rebuild

Dependency installation and the analysis itself are automated with tox. Run the following on the command line:

pip install tox
tox

License

Code in this repository is licensed under the MIT License. Data in this repository is licensed under the CC0 License.

Acknowledgements

The development of this repository is funded by the DARPA ASKEM program, grant number HR00112220036.