CSIRO-enviro-informatics/loci.cat

Embed the version information in the respective Loc-I datasets so users can easily find out which version it is and where to get more info

Opened this issue · 4 comments

How do we describe version information in each of the Loc-I datasets (e.g. ASGS, Geofabric, GNAF)?

  • each Loc-I dataset's metadata should describe its version information in a consistent way

Second part: implement this for each Loc-I-enabled dataset.

Add details to this issue ticket. We will need to document this somewhere so that it is communicated consistently to users.

I assume this will be an aspect of the dataset metadata. See CSIRO-enviro-informatics/asgs-dataset#8, CSIRO-enviro-informatics/geofabric-dataset#14 and CSIRO-enviro-informatics/gnaf-dataset#2.

In that context, there are a few ways to indicate version information:

  1. explicitly, through a comprehensive provenance statement that carries a date-time stamp: prov:wasGeneratedBy/prov:endedAtTime
  2. date-time stamp: dct:modified (a full time-stamp, not just a date, will be needed if there is more than one update per day)
  3. version number: pav:version

where pav: is http://purl.org/pav/, dct: is http://purl.org/dc/terms/ and prov: is http://www.w3.org/ns/prov#
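
As a rough illustration of how the three options could sit side by side, the sketch below builds a Turtle fragment using only the Python standard library. The function name `version_metadata` and the example.org IRIs in the usage note are hypothetical placeholders, not names taken from any Loc-I repository.

```python
from datetime import datetime, timezone

# Namespace declarations for the vocabularies discussed above.
PREFIXES = """\
@prefix dct: <http://purl.org/dc/terms/> .
@prefix pav: <http://purl.org/pav/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
"""


def version_metadata(dataset_iri, activity_iri, finished, version):
    """Return a Turtle fragment carrying all three version indicators.

    `finished` must be a timezone-aware datetime. The IRIs are whatever
    the caller supplies; nothing here is specific to a real dataset.
    """
    # Normalise to UTC so the stamp is unambiguous across runs.
    stamp = finished.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"{PREFIXES}\n"
        f"<{dataset_iri}>\n"
        f"    prov:wasGeneratedBy <{activity_iri}> ;\n"   # option 1
        f'    dct:modified "{stamp}"^^xsd:dateTime ;\n'   # option 2
        f'    pav:version "{version}" .\n'                # option 3
        f"\n"
        f"<{activity_iri}>\n"
        f"    a prov:Activity ;\n"
        f'    prov:endedAtTime "{stamp}"^^xsd:dateTime .\n'
    )
```

Calling `version_metadata("http://example.org/dataset/asgs", "http://example.org/activity/etl-run-1", datetime(2019, 3, 1, 2, 30, tzinfo=timezone.utc), "1.0")` yields a fragment in which `dct:modified` and `prov:endedAtTime` share the same stamp, which assumes the dataset counts as modified at the moment the ETL run ends.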

My general assumption would be that

  • a date-time stamp is very specific, is easy to generate, and can be captured in the dct:modified element. It should be updated automatically whenever the ETL process is run.
  • alongside this, the link to the source dataset, the details of the ETL process with any run-specific parameters, and the time the process completed must all be recorded in the provenance information (prov:wasGeneratedBy and prov:wasDerivedFrom).

The link to the source data should be to a specific version.
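
A minimal sketch of what such a provenance record might look like, again generated with plain Python. The actual run-time parameters are not yet known, so the parameter names here are invented, and they are attached through a placeholder `ex:` namespace because PROV-O itself defines no property for run parameters. The `EtlRun` class, `provenance_turtle` function and all IRIs are likewise hypothetical.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EtlRun:
    """One ETL run. All values supplied by callers are illustrative."""
    dataset_iri: str
    source_version_iri: str  # should identify a specific source version
    activity_iri: str
    ended: datetime          # timezone-aware completion time
    parameters: dict = field(default_factory=dict)


def provenance_turtle(run):
    """Serialise the run as a small Turtle provenance record."""
    stamp = run.ended.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    lines = [
        "@prefix prov: <http://www.w3.org/ns/prov#> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        "@prefix ex: <http://example.org/etl#> .  # placeholder namespace",
        "",
        f"<{run.dataset_iri}>",
        f"    prov:wasGeneratedBy <{run.activity_iri}> ;",
        f"    prov:wasDerivedFrom <{run.source_version_iri}> .",
        "",
        f"<{run.activity_iri}>",
        "    a prov:Activity ;",
        f"    prov:used <{run.source_version_iri}> ;",
    ]
    # Parameter names are assumed to be valid Turtle local names.
    # json.dumps is used as a simple way to quote and escape the value.
    for name, value in sorted(run.parameters.items()):
        lines.append(f"    ex:{name} {json.dumps(str(value))} ;")
    lines.append(f'    prov:endedAtTime "{stamp}"^^xsd:dateTime .')
    return "\n".join(lines) + "\n"
```

Keeping the parameters on the activity rather than on the dataset means a re-run with different settings produces a new activity record, while the dataset description only needs its `prov:wasGeneratedBy` link updated.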

@shaneseaton @ashleysommer @benjaminleighton Could I see an example of what the run-time parameters are, so that I can suggest how these could be recorded in a provenance record?

@benjaminleighton wrote on Slack:

On minimalist provenance for #16 I think getting this completely right first time is going to be tricky. Would sticking a pav:version in that we manually increment be sufficient for now?