KG-tool: Python package for accessing the Chemical Engineering Knowledge Graph (ChemEngKG)

Quick Start

Here!

🚀 Features

Development features

Deployment features

Open source community features

Installation and Documentation

1. Install package

pip install git+https://github.com/process-intelligence-research/ChemEngKG_kgtool

Then you can run

kgtool --help

or with Poetry:

poetry run kgtool --help

2. Import package and set up interface

Import the package into your Python project with

from kgtool.interface import *

and set up a connection to the ChemEngKG with

chemkg = ChemKG(url="api_url", graph="graph_name")

where api_url is the url of the ChemEngKG GraphQL-API and graph_name is the name of the default graph you want to access. You can also use

chemkg_dev = ChemKG.dev()

for development purposes. This will connect to a local instance of the ChemEngKG, thus make sure to have the ChemEngKG running locally (see ChemEngKG_backend repository for further guidance).
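If you would rather not hard-code the connection details, one option is to read them from environment variables. The following is a minimal sketch; the variable names CHEMKG_API_URL and CHEMKG_GRAPH and the helper chemkg_settings are assumptions made for this example and are not defined by kgtool.

```python
import os


def chemkg_settings(env=os.environ):
    """Collect the ChemKG constructor arguments from a mapping of variables.

    CHEMKG_API_URL and CHEMKG_GRAPH are hypothetical names chosen for this
    sketch; kgtool itself does not read them.
    """
    url = env.get("CHEMKG_API_URL")
    graph = env.get("CHEMKG_GRAPH")
    if not url or not graph:
        raise ValueError("Set CHEMKG_API_URL and CHEMKG_GRAPH first")
    return {"url": url, "graph": graph}


# Usage (needs kgtool installed and a reachable ChemEngKG GraphQL API):
# from kgtool.interface import ChemKG
# chemkg = ChemKG(**chemkg_settings())
```

Passing a plain dictionary instead of os.environ makes the helper easy to try out without touching your shell configuration.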

3. Documentation

Return format

The interface provides a set of functionalities which can be accessed via the ChemKG class. Since the interface always interacts with the ChemEngKG via GraphQL, the response is always a dictionary of the following form:

{
  "data": {
    "function_name": {
      ...
    }
  }
}

or

{
  "errors": [
    ...
  ]
}

if an error occurred in the backend.
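Because every call returns one of these two shapes, it can be convenient to check for errors in a single place. Here is a minimal sketch of such a helper; the name unwrap and the sample dictionaries are made up for illustration and are not part of kgtool.

```python
def unwrap(response, function_name):
    """Return response["data"][function_name], or raise if errors were reported."""
    if response.get("errors"):
        raise RuntimeError(f"ChemEngKG request failed: {response['errors']}")
    return response["data"][function_name]


# Hand-written sample responses in the two documented shapes.
ok = {"data": {"getGraphs": {"graphs": ["http://example.org/graph-a"]}}}
failed = {"errors": [{"message": "example error"}]}

graphs = unwrap(ok, "getGraphs")["graphs"]
```

With a live connection the same helper could wrap any call, for example unwrap(chemkg.getGraphs(), "getGraphs").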

Functionalities

The following list gives an overview of the available functionalities:

getGraphs

chemkg.getGraphs()

Retrieve the URI of all available graphs in the ChemEngKG.

Returns:

{"data": {
  "getGraphs": {
    "graphs":[...]
    }
  }
}

The graphs field contains a list of strings which are the URIs of the available graphs.

Note: The URI (Uniform Resource Identifier) uniquely identifies a graph. It is not the same as the graph name, which is used to set the interface's graph (chemkg.graph).

getGraph

chemkg.getGraph()

Retrieve the contents of the graph defined in chemkg.graph as a Turtle string.

Returns:

{"data": {
  "getGraph": {
    "contents": ...
    }
  }
}

The contents field contains a string which is the turtle representation of the graph in chemkg.graph.
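Since contents is a plain string, it can be written straight to a .ttl file for later inspection. The sketch below assumes the documented response shape; save_turtle and the sample data are hypothetical and only serve the example.

```python
import tempfile
from pathlib import Path


def save_turtle(response, path):
    """Write the Turtle string from a getGraph response to a file."""
    contents = response["data"]["getGraph"]["contents"]
    target = Path(path)
    target.write_text(contents, encoding="utf-8")
    return target


# Hand-written sample; a real call would pass chemkg.getGraph() instead.
sample = {"data": {"getGraph": {"contents": "<http://example.org/a> <http://example.org/b> <http://example.org/c> .\n"}}}

with tempfile.TemporaryDirectory() as tmp:
    saved = save_turtle(sample, Path(tmp) / "graph.ttl")
    text = saved.read_text(encoding="utf-8")
```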

runSparql

chemkg.runSparql(query)

Runs a SPARQL query on the graph defined in chemkg.graph and returns the response.

Inputs:

  • query: SPARQL query string

Returns:

{"data": {
  "runSparql": {
    "response": ...
    }
  }
}

The response field contains the result of the SPARQL query. Because its shape depends on the query, here is a small example:

runSparql Example

For a query like

```sparql
SELECT ?s ?p ?o
WHERE {
  ?s ?p ?o
}
```

the response would look like this:

```python
{"data": {
  "runSparql": {
    "response": {
      "head": {
        "link": [],
        "vars": ["s", "p", "o"]
      },
      "results": {
        "distinct": False,
        "ordered": True,
        "bindings": [{
          "s": {
            "type": "uri",
            "value": "http://www.openlinksw.com/virtrdf-data-formats#default-iid"
          },
          "p": {
            "type": "uri", 
            "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
          },
          "o": {
            "type": "uri",
            "value": "http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat"
          }
        }]
      }
    }
  }
}}
```
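For SELECT queries the nested bindings are often easier to work with as flat rows. The following sketch assumes the response shape shown in the example above; bindings_to_rows is a hypothetical helper and not part of kgtool. In SPARQL JSON results a variable left unbound in a solution is simply absent from that binding, so the helper fills in None for it.

```python
def bindings_to_rows(response):
    """Flatten a runSparql SELECT response into a list of {variable: value} dicts."""
    result = response["data"]["runSparql"]["response"]
    variables = result["head"]["vars"]
    rows = []
    for binding in result["results"]["bindings"]:
        # Variables without a binding in this solution are mapped to None.
        rows.append({
            var: binding[var]["value"] if var in binding else None
            for var in variables
        })
    return rows


# Hand-written sample in the shape of the example response.
sample = {"data": {"runSparql": {"response": {
    "head": {"link": [], "vars": ["s", "p", "o"]},
    "results": {"distinct": False, "ordered": True, "bindings": [
        {
            "s": {"type": "uri", "value": "http://example.org/pump"},
            "p": {"type": "uri", "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"},
        },
    ]},
}}}}

rows = bindings_to_rows(sample)
```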

uploadFile

chemkg.uploadFile(file_path, URI)

Uploads a file to the ChemEngKG. The file is attached to the object defined by URI.

Inputs:

  • file_path: path to the file to be uploaded
  • URI: URI of the object in the graph to which the file should be attached.

Returns:

{"data": {
  "uploadFile": {
    "fileName": ...,
    "subjectURI": ...,
    "predicate": ...,
    "fileURI": ...,
    "hashURI": ...
  }
}}

  • fileName: name the uploaded file is stored under in the ChemEngKG filestorage. You can use this name to download the file by sending a request to <filestorageURL>/uploads/<fileName>.
  • subjectURI: URI of the object in the graph to which the file is attached.
  • predicate: predicate of the triple which connects the object to the file. This should either be frbr:exemplar or frbr:part.
  • fileURI: URI of the file node in the graph.
  • hashURI: URI of the hash node in the graph.
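The download address described for fileName can be assembled from the upload response. This is a sketch under assumptions: download_url is a hypothetical helper, and the filestorage URL used below is a placeholder, since the actual address depends on your ChemEngKG deployment.

```python
from urllib.parse import quote


def download_url(filestorage_url, upload_response):
    """Build <filestorageURL>/uploads/<fileName> from an uploadFile response."""
    file_name = upload_response["data"]["uploadFile"]["fileName"]
    # quote() percent-encodes characters such as spaces in the file name.
    return f"{filestorage_url.rstrip('/')}/uploads/{quote(file_name)}"


# Hand-written sample; only the field used by the helper is filled in.
sample = {"data": {"uploadFile": {"fileName": "report 1.pdf"}}}
url = download_url("http://localhost:8000/", sample)
```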

deleteFile

chemkg.deleteFile(fileURI)

Deletes a file from the ChemEngKG filestorage and removes all affiliated triples in the graph. The file is identified by its URI.

Inputs:

  • fileURI: URI of the file to be deleted

Returns:

{"data": {
  "deleteFile": {
    "response": ...
    }
  }
}

uploadTurtle

chemkg.uploadTurtle(turtle: str)

Uploads a turtle file to the ChemEngKG.

Inputs:

  • turtle: file path to the turtle file to be uploaded

Returns:

{"data": {
  "uploadTurtle": {
    "response": ...
    }
  }
}
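Because uploadTurtle expects a file path, triples that only exist in memory have to be written to disk first. The sketch below writes one full-URI triple per line, a form that is valid Turtle. write_turtle and the example.org URIs are assumptions for illustration, and literal objects are deliberately left out to keep the example short.

```python
import tempfile
from pathlib import Path


def write_turtle(triples, path):
    """Write (subject, predicate, object) URI triples to a Turtle file."""
    lines = [f"<{s}> <{p}> <{o}> ." for s, p, o in triples]
    target = Path(path)
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


triples = [
    ("http://example.org/reactor1",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://example.org/Reactor"),
]

with tempfile.TemporaryDirectory() as tmp:
    ttl_path = write_turtle(triples, Path(tmp) / "reactor.ttl")
    written = ttl_path.read_text(encoding="utf-8")
    # With a live connection: chemkg.uploadTurtle(str(ttl_path))
```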

Makefile usage (for maintainers)

The Makefile contains many targets for faster development.

1. Download and remove Poetry

To download and install Poetry run:

make poetry-download

To uninstall

make poetry-remove

2. Install all dependencies and pre-commit hooks

Install requirements:

make install

Pre-commit hooks can be installed after git init via

make pre-commit-install

3. Codestyle

Automatic formatting uses pyupgrade, isort and black.

make codestyle

# or use synonym
make formatting

Codestyle checks only, without rewriting files:

make check-codestyle

Note: check-codestyle uses the isort, black and darglint libraries.

Update all dev libraries to the latest version using one command:

make update-dev-deps

4. Code security

make check-safety

This command launches Poetry integrity checks as well as identifies security issues with Safety and Bandit.

5. Type checks

Run mypy static type checker

make mypy

6. Tests with coverage badges

Run pytest

make test

7. All linters

Of course there is a command to run all linters at once:

make lint

This is the same as:

make test && make check-codestyle && make mypy && make check-safety

8. Docker

make docker-build

which is equivalent to:

make docker-build VERSION=latest

Remove docker image with

make docker-remove

More information about docker.

9. Cleanup

Delete pycache files

make pycache-remove

Remove package build

make build-remove

Delete .DS_STORE files

make dsstore-remove

Remove .mypy_cache

make mypycache-remove

Or to remove all above run:

make cleanup

📈 Releases

You can see the list of available releases on the GitHub Releases page.

We follow Semantic Versions specification.

We use Release Drafter. As pull requests are merged, a draft release is kept up-to-date listing the changes, ready to publish when you're ready. With the categories option, you can categorize pull requests in release notes using labels.

List of labels and corresponding titles

| Label | Title in Releases |
| --- | --- |
| enhancement, feature | 🚀 Features |
| bug, refactoring, bugfix, fix | 🔧 Fixes & Refactoring |
| build, ci, testing | 📦 Build System & CI/CD |
| breaking | 💥 Breaking Changes |
| documentation | 📝 Documentation |
| dependencies | ⬆️ Dependencies updates |

You can update it in release-drafter.yml.

GitHub creates the bug, enhancement, and documentation labels for you. Dependabot creates the dependencies label. Create the remaining labels on the Issues tab of your GitHub repository, when you need them.

🛡 License

This project is licensed under the terms of the MIT license. See LICENSE for more details.

📃 Citation

@misc{kgtool,
  author = {Kleinpaß, Marvin and Kondakov, Valentin and Gao, Qinghe and Schulze Balhorn, Lukas and Schweidtmann, Artur M.},
  doi = {10.4121/83abb8a2-73f8-4f45-994c-a904268757ae.v1},
  title = {KG-tool: Python package for accessing the Chemical Engineering Knowledge Graph (ChemEngKG)},
  year = {2024},
  publisher = {4TU.ResearchData},
  keywords = {Artificial intelligence, chemical engineering, knowledge graph, flowsheet simulations, chemical process development},
  copyright = {MIT},
  url = {https://doi.org/10.4121/83abb8a2-73f8-4f45-994c-a904268757ae}
}