Support CWL Prov with `cwltool` for OGC API - Processes IPT
Description
Given the rising need for IPT (Integrity, Provenance, Trust) across OGC APIs and their workflow processing, the provenance capabilities of CWL should be leveraged to address it. This would add metadata references within the CWL Application Packages themselves, allowing better open-science and IPT workflow tracking.
To Do
- `GET /jobs/{jobId}/run` to return the PROV-JSON produced by `cwltool --provenance`
  - Alternate PROV-XML/RDF/etc. if `Accept` requests it
  - When generating `cwltool --provenance` results, avoid duplicating results already found in WPS-outputs to save space (use their URI for cross-reference)
  - Any additional metadata/links pointing at the specific job and process executed that should be embedded in the PROV contents
- Cross-walk with #716 requirements
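
As a rough illustration of the `Accept` negotiation item above, selecting the PROV representation could look like the following sketch. The function name `select_prov_media_type`, the `PROV_FORMATS` table, and the media type chosen for each format are assumptions made for this example, not an existing API. Wildcard handling is deliberately simplified to `*/*` only.

```python
from typing import Optional

# Assumed mapping of media types to PROV representations (illustrative only).
PROV_FORMATS = {
    "application/json": "PROV-JSON",
    "application/xml": "PROV-XML",
    "text/turtle": "PROV-RDF",
}
DEFAULT_MEDIA_TYPE = "application/json"  # PROV-JSON as the default representation


def select_prov_media_type(accept: Optional[str]) -> Optional[str]:
    """Pick the supported media type preferred by an Accept header.

    Returns None when nothing acceptable is supported.
    """
    if not accept or not accept.strip():
        return DEFAULT_MEDIA_TYPE

    candidates = []
    for position, item in enumerate(accept.split(",")):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat malformed q-values as "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by order of appearance.
            candidates.append((-quality, position, media_type))

    for _, _, media_type in sorted(candidates):
        if media_type == "*/*":
            return DEFAULT_MEDIA_TYPE
        if media_type in PROV_FORMATS:
            return media_type
    return None


print(select_prov_media_type("text/turtle;q=0.9, application/xml"))
```

A `None` result would typically translate to an HTTP 406 response from the endpoint.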
References
- https://github.com/common-workflow-language/cwlprov (see `cwltool --provenance`)
- example run with OSPD algae workflow: https://gitlab.ogc.org/ogc/T20-GDC/-/wikis/GDC-Provenance-demonstration-GeoLabs
- consider new options to avoid data duplication of I/O in provenance folder: common-workflow-language/cwltool#1989
- PROV: https://www.w3.org/TR/prov-overview/
- PROV-LINKS: https://www.w3.org/TR/2013/NOTE-prov-links-20130430/
- PROV-JSON: https://www.w3.org/submissions/prov-json/
- PROV-O (RDF): https://www.w3.org/TR/2013/REC-prov-o-20130430/
- PROV-N: https://www.w3.org/TR/2013/REC-prov-n-20130430/
- PROV-XML: https://www.w3.org/TR/2013/NOTE-prov-xml-20130430/
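
To make the "use their URI for cross-reference" idea from the To Do list more concrete, here is a minimal sketch of a PROV-JSON fragment that points at an already-published job output instead of embedding a copy. The helper `output_reference_prov`, the `ex` prefix, the identifier scheme, and the `example.com` URLs are all hypothetical. This is not what `cwltool --provenance` emits today; it only shows the general PROV-JSON layout (`prefix`, `entity`, `activity`, `wasGeneratedBy`) with a `prov:location` attribute carrying the output URL.

```python
import json


def output_reference_prov(job_id: str, output_id: str, output_url: str) -> dict:
    """Describe a job output in PROV-JSON by URL reference rather than by data copy."""
    activity = f"ex:job-{job_id}"  # hypothetical identifier scheme
    entity = f"ex:job-{job_id}-output-{output_id}"
    return {
        "prefix": {
            "ex": "https://example.com/prov/",  # placeholder namespace
            "xsd": "http://www.w3.org/2001/XMLSchema#",
        },
        "activity": {activity: {}},
        "entity": {
            # The output is only referenced by its URL; no file content is stored.
            entity: {"prov:location": {"$": output_url, "type": "xsd:anyURI"}}
        },
        "wasGeneratedBy": {
            "_:gen1": {"prov:entity": entity, "prov:activity": activity}
        },
    }


doc = output_reference_prov(
    "1234", "result", "https://example.com/wpsoutputs/1234/result.tif"
)
print(json.dumps(doc, indent=2))
```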