
A proposed specification for how to capture available provenance information for chemical assertions integrated by the BioThings API suite

Primary language: Jupyter Notebook. License: GNU Lesser General Public License v3.0 (LGPL-3.0).

BioThings Provenance Model

This repository contains the files used to develop a model for capturing provenance information for assertions made about chemical substances. The goal is to implement this model across the BioThings suite of APIs, providing a standardized way to represent provenance information in the results of user-submitted queries.

File listing and information

  1. mists\README.MD
    Information about our analysis on selected Minimal Information Standards for reporting biomedical studies

  2. nanopubs\README.MD
    Information about our analysis on the provenance usage in Nanopublications

  3. wikidata\code\WikiDataExtraction.ipynb
    Python notebook for extracting data for all chemical compounds from WikiData in JSON format

  4. wikidata\code\wikidatajsonfileparser.py
    Python script that extracts the provenance information from the JSON files produced in item 3 and stores it in wikidata\data\wikidata_provenance_results.csv

  5. wikidata\code\WikiDataProvenanceAnalysis.ipynb
    Python notebook that uses the file generated in item 4 to determine and plot the usage frequency of each provenance property

  6. wikidata\data\wikidata_chemicalcompounds.csv
    CSV file listing the identifier of each chemical compound in WikiData (used in item 3 to extract the JSON data)

  7. wikidata\provenancemodel\wikidata_example.json
    Example snippet of a JSON file about a chemical compound from WikiData, demonstrating the claims and references provenance structure

  8. wikidata\provenancemodel\wikidata_provenance.png
    Schematic showing, by example, the structure of WikiData's provenance model

  9. BioThingsProvenanceModel.xlsx
    Spreadsheet with our detailed analysis and interpretation of the results, as well as our provenance proposal for BioThings

  10. jsonldexample.json
    Example JSON-LD file implementing our provenance model for a single assertion

  11. rdfexample.json
    Example RDF file (using Turtle syntax) implementing our provenance model for a single assertion

  12. biothingsprovenancemodel.jsonld
    JSON-LD context file that maps shorthand property names to standard ontology terms, so that the JSON data can use the shorter, simpler names

  13. README.MD
    This file
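
The extraction and frequency analysis described in items 4 and 5 can be illustrated with a minimal sketch. This is not the code in wikidatajsonfileparser.py or the analysis notebook. It assumes the standard WikiData entity JSON layout, where each statement under "claims" may carry a "references" list whose entries hold their properties under "snaks". The sample entity, its identifier, and the function name `count_reference_properties` are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical, hand-made entity in the shape of WikiData's entity JSON.
# Real entities carry many more fields; only the parts needed here are kept.
entity = {
    "id": "Q0000",  # placeholder identifier, not a real compound
    "claims": {
        "P231": [  # CAS Registry Number
            {
                "mainsnak": {"property": "P231"},
                "references": [
                    # P248 = stated in, P813 = retrieved
                    {"snaks": {"P248": [{"property": "P248"}],
                               "P813": [{"property": "P813"}]}},
                    # P854 = reference URL
                    {"snaks": {"P854": [{"property": "P854"}],
                               "P813": [{"property": "P813"}]}},
                ],
            }
        ],
        "P274": [  # chemical formula, statement without references
            {"mainsnak": {"property": "P274"}}
        ],
    },
}


def count_reference_properties(entity):
    """Count how often each property appears in the references of an entity."""
    counts = Counter()
    for statements in entity.get("claims", {}).values():
        for statement in statements:
            # Statements without provenance simply lack the "references" key.
            for reference in statement.get("references", []):
                counts.update(reference.get("snaks", {}).keys())
    return counts


usage = count_reference_properties(entity)
for prop, n in usage.most_common():
    print(prop, n)
```

Summing such counters over all compounds would give the kind of per-property usage frequency that item 5 plots; how the repository's own scripts aggregate and store the results may differ.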