This repository contains files involved in the development of a model for capturing provenance information for assertions made about chemical substances. The goal is to implement this model on the BioThings suite of APIs to provide a standardized way to represent provenance information in the results of user-submitted queries.

- mists\README.MD: Notes on our analysis of selected Minimal Information Standards for reporting biomedical studies
- nanopubs\README.MD: Notes on our analysis of provenance usage in nanopublications
- wikidata\code\WikiDataExtraction.ipynb: Python notebook for extracting data for all chemical compounds from WikiData in JSON format
- wikidata\code\wikidatajsonfileparser.py: Python script that extracts the provenance information from the JSON files produced by WikiDataExtraction.ipynb and stores it in the file wikidata\data\wikidata_provenance_results.csv
- wikidata\code\WikiDataProvenanceAnalysis.ipynb: Python notebook that uses the CSV file generated by wikidatajsonfileparser.py to determine and plot the usage frequency of each provenance property
- wikidata\data\wikidata_chemicalcompounds.csv: CSV file listing the identifiers of every chemical compound in WikiData (used by WikiDataExtraction.ipynb to extract the JSON data)
- wikidata\provenancemodel\wikidata_example.json: Example snippet of a JSON file about a chemical compound from WikiData, demonstrating the claims and references provenance structure
- wikidata\provenancemodel\wikidata_provenance.png: Schematic showing, by example, the structure of WikiData's provenance model
- BioThingsProvenanceModel.xlsx: Spreadsheet with our detailed analysis and interpretation of the results, as well as our provenance proposal for BioThings
- jsonldexample.json: Example JSON-LD file implementing our provenance model for a single assertion
- rdfexample.json: Example RDF file (using Turtle syntax) implementing our provenance model for a single assertion
- biothingsprovenancemodel.jsonld: JSON-LD context file that maps shorthand property names to standard ontology terms, so that the JSON data can be written with much simpler and shorter property names
- README.MD: This file
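
To make the claims and references structure mentioned above more concrete, here is a minimal Python sketch of how the usage frequency of provenance (reference) properties could be counted for one WikiData entity. It is not the code in wikidatajsonfileparser.py or the analysis notebook. The entity below is a hypothetical, heavily trimmed example (placeholder ID, empty snak bodies), and the function name `reference_property_counts` is our own. It assumes only the standard WikiData JSON layout, in which each statement under `claims` may carry a `references` list whose entries group their snaks by property ID.

```python
from collections import Counter

# Hypothetical, heavily trimmed entity in the shape of WikiData's JSON output.
# The ID is a placeholder and the snak bodies are left empty for brevity.
entity = {
    "id": "Q0",
    "claims": {
        "P274": [  # chemical formula
            {
                "mainsnak": {"property": "P274"},
                "references": [
                    # P248 = stated in, P813 = retrieved
                    {"snaks": {"P248": [{}], "P813": [{}]}},
                    # P854 = reference URL
                    {"snaks": {"P854": [{}]}},
                ],
            }
        ],
        "P231": [  # CAS Registry Number
            {
                "mainsnak": {"property": "P231"},
                "references": [{"snaks": {"P248": [{}]}}],
            },
            # A statement without any references
            {"mainsnak": {"property": "P231"}},
        ],
    },
}


def reference_property_counts(entity):
    """Count how many references use each provenance property."""
    counts = Counter()
    for statements in entity.get("claims", {}).values():
        for statement in statements:
            # Statements without provenance simply have no "references" key
            for reference in statement.get("references", []):
                counts.update(reference.get("snaks", {}).keys())
    return counts


counts = reference_property_counts(entity)
for prop, n in counts.most_common():
    print(prop, n)
```

Summing such counters over all entities would give per-property totals of the kind that could be written to a CSV file and plotted.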