nfdi4cat/voc4cat-tool

dcterms:provenance - Correctly used?

dalito opened this issue · 2 comments

This issue is inherited from vocexcel.

An example from VocExcel that I believe is not correct (from eg-valid.ttl).

cs: a skos:ConceptScheme ;
    dcterms:created "2021-06-07"^^xsd:date ;
    dcterms:creator <https://linked.data.gov.au/org/ga> ;
    dcterms:modified "2021-07-13"^^xsd:date ;
    dcterms:provenance "FGDM database"@en ;
    ...

Here a string literal is used, but I doubt that Dublin Core allows it (although the vocpub profile does).

Instead, the object of dct:provenance should be a dct:ProvenanceStatement (if I understand correctly), like so:

@prefix dct: <http://purl.org/dc/terms/> .
@prefix dcmitype: <http://purl.org/dc/dcmitype/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix nmah: <http://example.org/nmah/> .
@prefix ex: <http://example.org/> .

# Image and PhysicalObject are classes of the DCMI Type Vocabulary (dcmitype:),
# not of the dct: namespace.
ex:digital_image
  a dcmitype:Image ;
  dct:provenance [
    a dct:ProvenanceStatement ;
    dct:source nmah:original_photograph ;
    dct:created "2022-01-15T00:00:00Z"^^xsd:dateTime ;
    dct:modified "2022-02-01T00:00:00Z"^^xsd:dateTime ;
    dct:description "Image was processed and metadata was added by Jane Smith, Curator of Photography at the National Museum of American History."@en
  ] .

nmah:original_photograph
  a dcmitype:PhysicalObject ;
  dct:title "Original Photograph held in the collection of the National Museum of American History, Smithsonian Institution."@en .

What exactly does Dublin Core specify?

dct:provenance:

  • Type of term: Property
  • Definition: A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, and interpretation.
  • Comment: The statement may include a description of any changes successive custodians made to the resource.
  • Range Includes: http://purl.org/dc/terms/ProvenanceStatement

dct:ProvenanceStatement:

  • Type of term: Class
  • Definition: Any changes in ownership and custody of a resource since its creation that are significant for its authenticity, integrity, and interpretation.

dalito commented

After merging #131, provenance fields for concepts and collections need a little more structure to pass validation:

actor1 provenance-statement1, actor2 provenance-statement2, ...

actor can be a GitHub name, an ORCID number, or an ORCID URL. The first space separates the actor from the note (spaces are forbidden in GitHub names and in URLs). New provenance "records" should be appended at the end.
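
To make the format above concrete, here is a minimal Python sketch of how such a field could be split into records. It is not the validation code from voc4cat-tool: the function name `parse_provenance`, the regular expressions for GitHub names and ORCID identifiers, and the assumption that a note contains no comma are all illustrative guesses.

```python
import re

# Assumed patterns; the real validation in voc4cat-tool may differ.
ORCID_ID = r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]"
ACTOR_PATTERNS = [
    re.compile(ORCID_ID),                          # bare ORCID number
    re.compile(r"https://orcid\.org/" + ORCID_ID),  # ORCID URL
    re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,38}"),   # GitHub-style name
]


def parse_provenance(field: str) -> list[tuple[str, str]]:
    """Split a provenance field into (actor, note) pairs.

    Records are assumed to be separated by commas, so a note
    must not itself contain a comma in this sketch.
    """
    records = []
    for raw in field.split(","):
        record = raw.strip()
        if not record:
            continue
        # The first space separates the actor from the note.
        actor, _, note = record.partition(" ")
        if not any(p.fullmatch(actor) for p in ACTOR_PATTERNS):
            raise ValueError(f"Unrecognised actor: {actor!r}")
        note = note.strip()
        if not note:
            raise ValueError(f"Missing note for actor: {actor!r}")
        records.append((actor, note))
    return records
```

Calling `parse_provenance("dalito created concept, 0000-0002-1825-0097 revised definition")` yields one `(actor, note)` pair per record, and an actor containing characters outside the assumed patterns raises `ValueError`.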

@nmoust This is not a real fix for this issue, but it makes the provenance field more usable for now.

Newer versions of the vocpub profile (starting with 3.1) solve this issue by using skos:historyNote to store the provenance note as a literal. All voc4cat-tool releases <0.8.0 use a pre-3.0 version of the vocpub profile.