The purpose of the present plug-in is to synchronize the industrial knowledge graph between the Cognite Data Fusion (CDF) and Azure Digital Twin (ADT) platforms, using Azure Functions written in Python.
Development used Python 3.9.7, which was the latest version supported by Azure Functions at the time. For the various deployment options, check this link.
The user must respect a few ground rules (guidelines) when making changes to the asset hierarchy, because of issues and limitations that cannot be handled unambiguously by the current solution.
- Do not change the external ID of resources in CDF. Instead, delete the resource first and create it again with the new external ID.
- Do not use the same external ID for different types of resources in CDF.
- Do not edit the “externalId” and “id” properties of resources (assets and timeseries for now) in ADT. Even if they are blank (i.e., not set), leave them as is; the CDF→ADT sync will take care of them.
- In ADT, do not create different relationships with the same ID, because in CDF they are unique.
- For timeseries in ADT, update both the "latestValue" and the "timestamp" properties at the same time. Otherwise, the new datapoint will not be inserted in CDF. Also, do not insert string values into numeric timeseries, or vice versa.
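The last guideline can be followed by sending both properties in a single JSON Patch document. The snippet below is a minimal sketch, not the plug-in's own code: the helper name `build_latest_datapoint_patch` is hypothetical, and it assumes that "timestamp" is stored as an ISO 8601 string (check the Timeseries model for the actual schema).

```python
from datetime import datetime, timezone
from typing import Union


def build_latest_datapoint_patch(value: Union[float, str], timestamp: datetime) -> list:
    """Return a JSON Patch that sets "latestValue" and "timestamp" together.

    Hypothetical helper. Assumes the timestamp is stored as an ISO 8601 string.
    """
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    iso_timestamp = timestamp.astimezone(timezone.utc).isoformat()
    # "add" is used instead of "replace" because, under RFC 6902, "add" also
    # overwrites an existing member, so it works whether or not the property is set.
    return [
        {"op": "add", "path": "/latestValue", "value": value},
        {"op": "add", "path": "/timestamp", "value": iso_timestamp},
    ]
```

The resulting list could then be passed to `DigitalTwinsClient.update_digital_twin(twin_id, patch)` from azure-digitaltwins-core, so that both properties change in one update.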
The solution translates a CDF asset hierarchy together with contextualized operational and engineering data into Digital Twin Definition Language (DTDL) ontologies, pushes the results to Azure, and synchronizes changes in the graph in both directions.
Currently, the following CDF resource types are mapped:
- Assets
- Asset-to-asset relationships
- Timeseries with the value of the latest datapoint
The project contains two main features:
- a timer-triggered Azure function to create/update the knowledge graph in the CDF→ADT direction,
- an event-triggered Azure function to update changes in the ADT→CDF direction.
The DTDL models used to represent resources in ADT are stored in the Models folder in this repository.
For more details about the Azure functions check the CDF→ADT Readme and ADT→CDF Readme files.
CDF Assets are translated into the Asset DTDL model together with all properties:
- CDF external ID and internal ID (remember not to edit these)
- name (which is mandatory in CDF)
- description
- metadata, represented by the tags/values map property in ADT
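The mapping above can be pictured as a small conversion function. This is only an illustrative sketch under stated assumptions: the model ID `dtmi:example:Asset;1`, the map property name `tags`, and the function name `asset_to_twin` are hypothetical (the real names are defined in the Asset model), and the input is assumed to be a plain dict using the snake_case field names of the Cognite Python SDK.

```python
from typing import Any, Dict

# Hypothetical model ID; the real one is defined in the Asset DTDL model.
ASSET_MODEL_ID = "dtmi:example:Asset;1"


def asset_to_twin(asset: Dict[str, Any]) -> Dict[str, Any]:
    """Build an ADT twin payload from a CDF asset given as a plain dict."""
    twin: Dict[str, Any] = {
        "$metadata": {"$model": ASSET_MODEL_ID},
        "externalId": asset["external_id"],  # managed by the sync, do not edit in ADT
        "id": asset["id"],                   # managed by the sync, do not edit in ADT
        "name": asset["name"],               # mandatory in CDF
    }
    # The description is optional in CDF, so it is only set when present.
    if asset.get("description"):
        twin["description"] = asset["description"]
    # CDF metadata is a string-to-string mapping; "tags" is an assumed property name.
    twin["tags"] = dict(asset.get("metadata") or {})
    return twin
```

For example, `asset_to_twin({"external_id": "pump-1", "id": 123, "name": "Pump 1"})` yields a twin with an empty `tags` map and no `description` key.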
In the current solution, only asset-to-asset CDF relationships are modeled. On the ADT side, however, two types of relationships must be distinguished:
- Explicit relationships: in CDF they are the actual Relationship resources and are represented by the relatesTo ADT relationship. IMPORTANT NOTE: these relationships can have multiple labels in CDF; check the limitations on how this is handled.
- Implicit relationships: in CDF they are not separate resources but are stored as properties. In ADT they must still be represented as real relationships. These are the following (2 for now):
  - Parent-child relationship: stored in the “parent_external_id” field of a CDF asset, and represented in ADT by the parent relationship between Asset twins.
  - Timeseries – belongs to – Asset relationship: stored in the “asset_id” field of a CDF Timeseries, and represented in ADT by the contains relationship between Asset and Timeseries twins.
To summarize, there are currently 3 types of ADT relationships (relatesTo, parent, contains), all defined in the Asset model.
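As a rough illustration of how the two implicit relationships might be turned into ADT relationship payloads, consider the sketch below. Several details are assumptions rather than the plug-in's actual behavior: twin IDs are taken to be the CDF external IDs, the relationship ID scheme is invented, the parent relationship is assumed to point from child to parent, and the caller is assumed to have already resolved the timeseries' numeric “asset_id” to the asset's external ID. All function names are hypothetical.

```python
from typing import Dict, Optional


def make_relationship(source_id: str, target_id: str, name: str) -> Dict[str, str]:
    """Build an ADT relationship payload (the ID scheme is an assumption)."""
    return {
        "$relationshipId": f"{source_id}-{name}-{target_id}",
        "$sourceId": source_id,
        "$targetId": target_id,
        "$relationshipName": name,
    }


def parent_relationship(
    asset_external_id: str, parent_external_id: Optional[str]
) -> Optional[Dict[str, str]]:
    """Derive the parent relationship; root assets have no parent."""
    if parent_external_id is None:
        return None
    # Assumed direction: child twin -> parent twin.
    return make_relationship(asset_external_id, parent_external_id, "parent")


def contains_relationship(
    asset_external_id: str, timeseries_external_id: str
) -> Dict[str, str]:
    """Derive the contains relationship from an Asset twin to a Timeseries twin."""
    return make_relationship(asset_external_id, timeseries_external_id, "contains")
```

Deriving the relationship ID deterministically from its endpoints and name is one way to keep IDs unique and repeatable across sync runs; the actual solution may use a different scheme.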
CDF Timeseries are translated into the Timeseries DTDL model and are similar to assets, with the addition of 2 new properties holding the value and the timestamp, respectively, of the latest datapoint.
In order to deploy and run the plugin, the following resources are required:
- CDF tenant, which contains the initial industrial knowledge graph(s) to be mapped
- Microsoft Azure tenant, where the Azure functions will be deployed to replicate and synchronize the graph(s). The Azure resources below need to be created beforehand:
  - 2 function apps (one timer-triggered and one event-triggered),
  - 2 blob storage accounts (one for each function),
  - Key vault,
  - Azure Digital Twins,
  - Event Hub.
The Python libraries used during the development of the two Azure functions are listed in the table below (last update on May 20, 2022).
Python Library | Version (CDF→ADT) | Version (ADT→CDF) |
---|---|---|
azure-core | 1.24.0 | 1.24.0 |
azure-digitaltwins-core | 1.1.0 | 1.1.0 |
azure-eventhub | - | 5.9.0 |
azure-functions | 1.11.2 | 1.11.2 |
azure-identity | 1.10.0 | 1.10.0 |
azure-storage-blob | 12.12.0 | - |
cognite-sdk | 2.49.1 | 2.49.1 |
Besides the knowledge graph itself, all the inputs for the functions must be defined as environment variables in the Azure function configuration settings. The table below summarizes the list of keys and the requirement for each function.
Variable Key | Description | CDF→ADT | ADT→CDF |
---|---|---|---|
ADT_URL | URL of the ADT resource (with "https://") | YES | YES |
adtevents_RootManageSharedAccessKey_EVENTHUB | endpoint of the Event Hub | NO | YES |
AzureWebJobsStorage | connection string to the blob storage linked to this Azure function | YES | YES |
CDF_CLIENT_SECRET | client secret of the Cognite tenant | YES | YES |
CDF_CLIENTID | the client ID of the Cognite tenant | YES | YES |
CDF_CLUSTER | cluster of the Cognite tenant | YES | YES |
CDF_TENANTID | ID of the Cognite tenant | YES | YES |
CDF_PROJECT | Cognite project inside the Cognite tenant | YES | YES |
FUNCTIONS_EXTENSION_VERSION | version of the Azure Functions runtime | "~4" | "~4" |
FUNCTIONS_WORKER_RUNTIME | defaults to "python" in both cases | "python" | "python" |
ROOT_ASSET_EXTERNAL_ID | the external ID of the root asset node of the knowledge graph to be instantiated and synchronized | YES | YES |
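A quick way to catch a misconfigured function app is to check the keys from the table above at startup. The following is a hedged sketch and not part of the plug-in: the function name `missing_settings` and the direction labels `"cdf_to_adt"` and `"adt_to_cdf"` are hypothetical, and the two `FUNCTIONS_*` keys are left out on the assumption that they are managed by the platform.

```python
import os
from typing import List, Mapping, Optional

# Keys marked YES for both functions in the table above.
COMMON_KEYS = [
    "ADT_URL",
    "AzureWebJobsStorage",
    "CDF_CLIENT_SECRET",
    "CDF_CLIENTID",
    "CDF_CLUSTER",
    "CDF_TENANTID",
    "CDF_PROJECT",
    "ROOT_ASSET_EXTERNAL_ID",
]
# Only required by the event-triggered (ADT→CDF) function.
EVENT_HUB_KEY = "adtevents_RootManageSharedAccessKey_EVENTHUB"


def missing_settings(direction: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required keys that are absent or empty for the given function."""
    if direction not in ("cdf_to_adt", "adt_to_cdf"):
        raise ValueError(f"unknown direction: {direction!r}")
    env = os.environ if env is None else env
    required = list(COMMON_KEYS)
    if direction == "adt_to_cdf":
        required.append(EVENT_HUB_KEY)
    return [key for key in required if not env.get(key)]
```

Calling `missing_settings("adt_to_cdf")` with no second argument inspects the real process environment; passing a dict instead makes the check easy to exercise in tests.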
To run the Azure functions on your local computer, you may need to add further environment variables to your local.settings.json file. Check this documentation for more information.
Contributors' names and contact info:
- Murad Sæter: murad.sater@cognitedata.com
- Janos Puskas: janos.puskas@accenture.com
- Robert-Adrian Rill: robert-adrian.rill@accenture.com
- Zsolt Tofalvi: zsolt.tofalvi@accenture.com