Testing an automatic workflow that fetches a Google Sheet and converts it to RDF

sheet2rdf

This repository hosts an automatic workflow, executed by means of GitHub Actions, together with the underlying shell and Python scripts. The workflow:

  • Fetches a Google Sheet from Google Drive and stores it as xlsx and csv files
  • Converts the fetched sheet to a machine-actionable and FAIR RDF vocabulary using xls2rdf
  • Tests the resulting RDF vocabulary using qSKOS
  • Commits the conversion results and test logs to this repository
  • Deploys the RDF vocabulary to OntoStack to be served to humans and machines

This workflow is an extension of excel2rdf.
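
As a rough illustration of the fetch step and of the file naming, the sketch below builds Google Sheets export URLs and the output file names for a given base name. It is not the repository's method: the workflow authenticates through gsheets, whereas this sketch assumes the sheet is readable by the requester (for example, shared by link) and relies on the public `export?format=...` URL pattern. The helper names `export_url` and `output_names` are hypothetical.

```python
from urllib.parse import urlencode

# Formats the workflow stores next to the converted RDF (.ttl) file.
EXPORT_FORMATS = ("xlsx", "csv")


def export_url(sheet_id: str, fmt: str) -> str:
    """Build a Google Sheets export URL (assumes the sheet is readable by the caller)."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt}")
    base = f"https://docs.google.com/spreadsheets/d/{sheet_id}/export"
    return base + "?" + urlencode({"format": fmt})


def output_names(file_name: str) -> dict:
    """Map each extension to the file name derived from the shared base name."""
    return {ext: f"{file_name}.{ext}" for ext in (*EXPORT_FORMATS, "ttl")}


if __name__ == "__main__":
    # "SHEET_ID_PLACEHOLDER" is a made-up value, not a real sheet ID.
    for fmt in EXPORT_FORMATS:
        print(export_url("SHEET_ID_PLACEHOLDER", fmt))
    print(output_names("vocabulary"))
```

The URLs can then be downloaded with any HTTP client; private sheets still need the OAuth credentials described under "Configuring sheet2rdf".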

OntoStack

OntoStack is a set of orchestrated micro-services, configured and interfaced so that they can take in vocabularies and resolve their terms and RDF properties on request by either humans or machines.

Some of the OntoStack micro-services are:

  • Jena Fuseki, a graph database
  • SKOSMOS, a web-based SKOS browser acting as a front-end for the vocabularies persisted by the graph database
  • Træfik, an edge router responsible for properly serving URL requests

Currently three instances of OntoStack are available:

Configuring sheet2rdf

To use sheet2rdf in your own work, you need to:

  1. Follow the gsheets Quickstart to generate client_secrets.json and storage.json

  2. Create the following GitHub secrets:

    • DB_USER: user name of the Jena Fuseki account that has privileges to PUT the RDF vocabulary to the database
    • DB_PASS: password for the above account
    • FILE_NAME: file name that will be used when converting the Google Sheet to .ttl (RDF), .xlsx, and .csv files. Currently set to vocabulary.
    • GRAPH: graph in the database under which the above RDF vocabulary should be stored.
    • SHEET_ID: unique ID of the sheet that will be fetched from Google Drive.
    • SPARQL_ENDPOINT: endpoint to which RDF vocabulary is PUT.
    • STORAGE: content of storage.json
    • CLIENT: content of client_secrets.json
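
To show how the database-related secrets fit together, here is a minimal Python sketch that reads them from environment variables and prepares the PUT request for the vocabulary. It is an illustration under stated assumptions, not the repository's deployment script: it assumes SPARQL_ENDPOINT accepts SPARQL 1.1 Graph Store HTTP Protocol requests, that the target graph is passed as a `graph` query parameter, and that the account uses HTTP Basic authentication. The helper names `load_settings` and `build_put_request` are hypothetical.

```python
import base64
import os
import urllib.request
from urllib.parse import urlencode

# Secrets needed for the deployment step, using the names listed above.
REQUIRED = ("DB_USER", "DB_PASS", "FILE_NAME", "GRAPH", "SPARQL_ENDPOINT")


def load_settings(env=os.environ) -> dict:
    """Collect the required settings and fail early if any is missing or empty."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED}


def build_put_request(settings: dict, turtle_bytes: bytes) -> urllib.request.Request:
    """Prepare (but do not send) a PUT of Turtle data into the named graph."""
    # Assumption: the graph is addressed with ?graph=<URI> (Graph Store Protocol).
    url = settings["SPARQL_ENDPOINT"] + "?" + urlencode({"graph": settings["GRAPH"]})
    # Assumption: the Fuseki account is checked with HTTP Basic authentication.
    credentials = f'{settings["DB_USER"]}:{settings["DB_PASS"]}'.encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    headers = {"Content-Type": "text/turtle", "Authorization": "Basic " + token}
    return urllib.request.Request(url, data=turtle_bytes, headers=headers, method="PUT")


if __name__ == "__main__":
    settings = load_settings()
    with open(settings["FILE_NAME"] + ".ttl", "rb") as handle:
        request = build_put_request(settings, handle.read())
    print(request.get_method(), request.full_url)
    # To actually upload, pass the request to urllib.request.urlopen(request).
```

Keeping the request construction separate from sending it makes it possible to check the URL and headers locally before any data reaches the database.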

License

This work is licensed under the Apache 2.0 License.

Citation

If you use this workflow, the author kindly asks you to cite this repository in your publications, for example:

Nikola Vasiljevic. (2021, January 11). sheet2rdf: First release (Version v0.1). Zenodo. http://doi.org/10.5281/zenodo.4432136

For any other citation format, visit http://doi.org/10.5281/zenodo.4432136