Creative Commons Zero v1.0 Universal (CC0-1.0)

FDP Reference Implementation Configuration

NOTA BENE! THIS IS ONLY FOR FAIR-in-a-Box and/or the FAIR Data Point reference implementation

If you are running any other kind of FDP, you are in the wrong place ;-)

This is a record of the process for configuring the Reference Implementation of the FAIR Data Point (the one that is included in FAIR-in-a-box) for use on the EJP-RD Virtual Platform.

Tutorial for running FAIR in a Box (FiaB)

If you don't yet have a FAIR Data Point, you could consider installing one using the FAIR in a Box (FiaB) installer here: https://github.com/ejp-rd-vp/FiaB

Now continue with the configuration below...

Setting the default license

NOTE: The EJP-RD "non-license" can be set as the default license: https://w3id.org/ejp-rd/resources/licenses/v1.0 and various other defaults can be configured to more useful values.

To do this:

  • Go to your FiaB folder.
  • Go into the ./fdp subfolder.
  • Edit the "application-xxx.yml" file (xxx is the prefix name you used when you ran the FiaB installer).
  • Add the following lines to the bottom of that file:
metadataProperties:
    language: http://id.loc.gov/vocabulary/iso639-1/en
    license: https://w3id.org/ejp-rd/resources/licenses/v1.0.txt
    accessRightsDescription: Contact the owner/curator of this resource to determine your access rights

ping:
    interval: PT120H
  • Now shut down and restart your FiaB using docker-compose.
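
The edit above can also be scripted. The following is a minimal Python sketch, not part of FiaB: the helper names `top_level_keys` and `append_settings` are hypothetical, and the key detection is a deliberately naive line scan rather than a real YAML parser. It appends the block only if the file does not already define `metadataProperties` or `ping` at the top level, since duplicate top-level keys would make the file ambiguous.

```python
# The settings block from this guide, exactly as it should appear in the file.
SETTINGS_BLOCK = """\
metadataProperties:
    language: http://id.loc.gov/vocabulary/iso639-1/en
    license: https://w3id.org/ejp-rd/resources/licenses/v1.0.txt
    accessRightsDescription: Contact the owner/curator of this resource to determine your access rights

ping:
    interval: PT120H
"""


def top_level_keys(yaml_text):
    """Naively list keys that start in column 0 (assumption: simple block-style YAML)."""
    keys = []
    for line in yaml_text.splitlines():
        # Skip blank lines, indented lines, and comments.
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            keys.append(line.split(":", 1)[0])
    return keys


def append_settings(yaml_text):
    """Return yaml_text with SETTINGS_BLOCK appended, refusing to duplicate keys."""
    clash = set(top_level_keys(yaml_text)) & set(top_level_keys(SETTINGS_BLOCK))
    if clash:
        raise ValueError(f"already defined, edit by hand instead: {sorted(clash)}")
    if yaml_text and not yaml_text.endswith("\n"):
        yaml_text += "\n"
    return yaml_text + "\n" + SETTINGS_BLOCK


if __name__ == "__main__":
    # Hypothetical existing file content, used only for demonstration.
    print(append_settings("instance:\n    clientUrl: http://localhost\n"))
```

To use it, read your application-xxx.yml into a string, pass it to `append_settings`, and write the result back before restarting FiaB.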

Navigating the FDP configuration menu

  • First, you need to be logged in (default user: albert.einstein@example.com, password: password)
  • The drop-down menu (top right) looks like this:

The two menu entries we are interested in are:

  • Resource Definitions: What kinds of things can be in your FDP? The defaults are:

  • Metadata schemas: What is the "shape" of the things, and what web form elements should be used to capture those? The defaults are:

Editing the Schemas

We need to edit three schemas - Resource, Dataset and Data Service - to bring them into compliance with the Virtual Platform. To begin editing a schema, click on it.

Start with Resource

The part we need to change is at the bottom - the SHACL definitions of the shape and constraints.

The correct SHACL for a Resource is found here: resource.shacl

Copy/paste that into the Web form, and then Save and Release. You will be asked to assign a version number. This is arbitrary, but the version has to be higher than the previous version (1.0.0).
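
The "higher than the previous version" rule can be illustrated with a short Python sketch. It assumes versions are three dot-separated numbers, as in the 1.0.0 shown above; the function names are hypothetical and this is not how the FDP itself validates versions. The point is that the parts are compared as numbers, so 1.10.0 counts as higher than 1.9.0.

```python
def parse_version(text):
    """Split 'major.minor.patch' into a tuple of integers."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected three dot-separated numbers, got {text!r}")
    return tuple(int(p) for p in parts)


def is_higher(new, previous="1.0.0"):
    """True if 'new' is strictly higher than 'previous' (numeric, part by part)."""
    return parse_version(new) > parse_version(previous)


if __name__ == "__main__":
    for candidate in ("1.0.0", "1.0.1", "2.0.0"):
        print(candidate, is_higher(candidate))
```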

Now edit Data Service

Many portions of that need to be edited. Follow my suggestions in the image above.

The SHACL also needs to be edited. The correct SHACL for a Data Service is found here: data-service.shacl

Save and give it a version.

Now edit Dataset

You only need to edit the SHACL portion of Dataset. The corrected SHACL is found here: dataset.shacl

Save and give it a version.

Create a new schema for Patient registry

Back in the Metadata Schema page, select "create new" and follow the guidelines in the image below:

The SHACL file is here: patient-registry.shacl

Create a new schema for Guideline

Back in the Metadata Schema page, select "create new" and follow the guidelines in the image below:

The SHACL file is here: guideline.shacl

Create a new schema for Biobank

Back in the Metadata Schema page, select "create new" and follow the guidelines in the image below:

The SHACL file is here: biobank.shacl


Switch to Main Menu - Resource Definitions

Create a new Resource Definition for Patient Registry

Back in the main menu, select "Resource Definitions", then click the "Create Resource Definition" button:

Follow the guidelines in the image below to fill the fields, then save:

Create a new Resource Definition for Biobank

Back in the main menu, select "Resource Definitions", then click the "Create Resource Definition" button:

Follow the guidelines in the image below to fill the fields, then save:

Create a new Resource Definition for Guideline

Back in the main menu, select "Resource Definitions", then click the "Create Resource Definition" button:

Follow the guidelines in the image below to fill the fields, then save:

http://www.w3.org/ns/dcat#landingPage

Create TWO new Resource Definitions for Data Services

NOTE: In EJP there are two "kinds" of Data Service: services that serve a dataset, and services that do algorithmic operations or plotting but do not access a registry or biobank. We are going to call these "DataService" and "DataService2".

Click the "Create Resource Definition" button again.

Follow the guidelines in the image below to fill the fields, then save:

http://www.w3.org/ns/dcat#endpointURL
http://www.w3.org/ns/dcat#endpointDescription
http://www.w3.org/ns/dcat#landingPage

Do the same thing again, but this time, call it DataService2. (The two kinds of services differ only in their title and URL prefix.)

THE NORMS ARE: Services that serve a dataset (DataService):

  • MUST be a "child of" a Distribution of a Dataset
  • MUST have an endpointURL (the URL of the interface)
  • MUST have an endpointDescription (the URL leading to e.g. a Swagger/openAPI document)
  • MAY have a landingPage

Services that execute algorithmic operations (DataService2):

  • MUST be a "child of" "Catalog"
  • MUST have a landingPage (the URL of the homepage, where you can do the operation)
  • MAY have endpointURL and endpointDescription

THESE NORMS ARE NOT ENFORCED BY THE FDP!! So... just be good citizens!
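
Because the FDP does not enforce these norms, a small check run on your own records can catch mistakes before publishing. The sketch below is only an illustration: the function `check_data_service` and its plain-dictionary record format are hypothetical, not an FDP API. Only the three DCAT property URLs and the parent names come from this document.

```python
DCAT = "http://www.w3.org/ns/dcat#"


def check_data_service(kind, parent, properties):
    """Return a list of norm violations; an empty list means the record complies.

    kind:       "DataService" or "DataService2"
    parent:     name of the parent resource, e.g. "Distribution" or "Catalog"
    properties: dict mapping property URLs to values (hypothetical record format)
    """
    if kind == "DataService":
        required_parent = "Distribution"
        required_props = ("endpointURL", "endpointDescription")
    elif kind == "DataService2":
        required_parent = "Catalog"
        required_props = ("landingPage",)
    else:
        raise ValueError(f"unknown kind: {kind!r}")

    problems = []
    if parent != required_parent:
        problems.append(f"{kind} MUST be a child of {required_parent}, not {parent}")
    for name in required_props:
        # Missing keys and empty strings both count as "not provided".
        if not properties.get(DCAT + name):
            problems.append(f"{kind} MUST have {name}")
    return problems


if __name__ == "__main__":
    # Placeholder landing page URL, for illustration only.
    record = {DCAT + "landingPage": "https://example.org/boxplot"}
    print(check_data_service("DataService2", "Catalog", record))
    print(check_data_service("DataService", "Catalog", record))
```

The MAY properties are deliberately not checked, since their absence is not a violation.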

Connecting things together

Go back to the Resource Definitions (main menu)

We need to make Patient Registry a child of Catalog

This is what "Catalog" looks like at the beginning:

The red arrow is where we create a new child - in this case, we are going to make Patient Registry a child of Catalog. Note that the property that the Metadata team decided to use is dc:hasPart

We need to make Guideline a child of Catalog

In this case, we are going to make Guideline a child of Catalog. Note that the property that the Metadata team decided to use is dc:hasPart

http://purl.org/dc/terms/hasPart

We need to make Biobank a child of Catalog

In this case, we are going to make Biobank a child of Catalog. Note that the property that the Metadata team decided to use is dc:hasPart

http://purl.org/dc/terms/hasPart

We need to make DataService2 a child of Catalog

Same idea, but in this case DCAT has defined a different predicate: dcat:service

http://www.w3.org/ns/dcat#service

We need to make DataService a child of Distribution

Same process as above, but now open the definition for "Distribution". DCAT has defined the predicate: dcat:accessService to connect a Distribution with its Data Service.

http://www.w3.org/ns/dcat#accessService

Finally, we need to break the connection between FDP and DataService

Same process as above, but now open the definition for "FAIR Data Point".

Remove the inheritance from DataService.
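
As a recap, the connections made in this section can be written down as a small lookup table. This Python sketch is only a summary aid; the `LINKS` list and `children_of` helper are hypothetical names, and the table covers only the links added above, not the FDP's built-in defaults.

```python
DCTERMS_HAS_PART = "http://purl.org/dc/terms/hasPart"
DCAT_SERVICE = "http://www.w3.org/ns/dcat#service"
DCAT_ACCESS_SERVICE = "http://www.w3.org/ns/dcat#accessService"

# (parent, child, predicate) for every link added in this section.
LINKS = [
    ("Catalog", "Patient Registry", DCTERMS_HAS_PART),
    ("Catalog", "Guideline", DCTERMS_HAS_PART),
    ("Catalog", "Biobank", DCTERMS_HAS_PART),
    ("Catalog", "DataService2", DCAT_SERVICE),
    ("Distribution", "DataService", DCAT_ACCESS_SERVICE),
]


def children_of(parent):
    """List the child resource definitions attached to the given parent."""
    return [child for p, child, _ in LINKS if p == parent]


if __name__ == "__main__":
    for parent, child, predicate in LINKS:
        print(f"{parent} --{predicate}--> {child}")
```

Note that "FAIR Data Point" has no entry for DataService, matching the final step above.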

Additional Configuration

On a production server, you need to activate the "ping" to register yourself with the virtual platform.

This has two steps:

  • Go back to the "./fdp/application-xxx.yml" file and edit the clientUrl to be your permanent identifier.

e.g. clientUrl: https://w3id.org/duchenne-fdp

This is the URL that is sent during the "ping" to the VP Index. IT CANNOT BE LOCALHOST!

  • Add the VP Index as one of the "ping" locations (see diagram below)

Finally, that last field should now read "PT120H", because this is the ping interval that we set at the very beginning (in the ./fdp/application-xxx.yml file). This makes the FDP ping the VP Index every 5 days (120 hours). If your FDP does not ping at least once a week, the Index will consider it "inactive" and it will not appear in the main index, so 5 days should be good!
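
Both values from this section can be sanity-checked before going live. The following Python sketch is an assumption-laden illustration, not part of the FDP: the function names are hypothetical, the list of "local" host names is my own, and the duration parser handles only the day/hour/minute/second subset of ISO 8601 durations. It confirms that the clientUrl is not a localhost address and that the ping interval is shorter than the one-week limit mentioned above.

```python
import re
from datetime import timedelta
from urllib.parse import urlparse

# Host names treated as "local" (assumption: extend as needed for your setup).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1", "0.0.0.0"}

# Subset of ISO 8601 durations: PnDTnHnMnS, with every part optional.
_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def is_public_client_url(url):
    """True if url is http(s) and its host is not an obvious local address."""
    parsed = urlparse(url)
    host = parsed.hostname
    return parsed.scheme in ("http", "https") and host is not None and host not in LOCAL_HOSTS


def parse_duration(text):
    """Convert a duration such as 'PT120H' into a timedelta."""
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def pings_often_enough(interval, limit=timedelta(days=7)):
    """True if the ping interval is shorter than the inactivity limit."""
    return parse_duration(interval) < limit


if __name__ == "__main__":
    print(is_public_client_url("https://w3id.org/duchenne-fdp"))
    print(parse_duration("PT120H"))
    print(pings_often_enough("PT120H"))
```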

YOU ARE NOW DONE!

Here's an example of how to enter a Data Service

You do not have to do this! It is only for your information.

We will create a new record: a Data Service that does visualization (Box-whisker plot)

Now we will create a record for a Data Service to make sure everything is working as expected. Data Service is a child of Catalog, so we need to first create a Catalog. (Go ahead and do this now!) Once created, that catalog has three kinds of "children", indicated by the tabs: Datasets, Data Services, and Patient Registries.

Click on the "Data Services" tab, then "+Create"

I'm creating a hypothetical Box-Whisker plot tool. The theme is the concept of a box-whisker plot (from EDAM: edam:operation_2943). Note that, because this tool is NOT serving a dataset, I am only required to include a landing page. (endpointURL and endpointDescription are allowed to be empty.)

Further down the page there are additional fields that I would like to fill out. For example, I am going to declare that this service does not utilize personal information (in the GDPR sense). The allowed values are "true" and "false" (lower case!). I am not sure if this is enforced or not.

I also want to tell the VP that the service exists, so I need to make it https://w3id.org/ejp-rd/vocabulary#VPDiscoverable. Find the section called "Vp Connection" and click "add". The result is a dropdown menu, where you are allowed to select:

VPDiscoverable

VPDiscoverable is required to get a Resource (i.e. any child of dcat:Resource) into the FDP Index.

Save!

You will see that it is in draft form. The button on the right leads to the tool. It is advertising itself as being VP Discoverable. Now all you need to do is click "Publish" (also on the Catalog)...