magda-io/magda-csw-connector

Dataset should have publisher

Opened this issue · 2 comments

Describe the bug
Some datasets harvested by this connector have an empty publisher property. Affected sources include:

  • Australian Urban Research Infrastructure Network
  • Australian Oceans Data Network
  • Mineral Resources Tasmania

To Reproduce
For example, the response to the following request has publisher = "":
https://data.gov.au/api/v0/registry/records/ds-aurin-aurin:datasource-AU_Govt_ABS-UoM_AURIN_DB_3_abs_ihad_lga_2016?optionalAspect=source&optionalAspect=dcat-dataset-strings&optionalAspect=dcat-distribution-strings&dereference=true

{
  "aspects": {
    "dcat-dataset-strings": {
      "contactPoint": "GeoServer",
      "description": "...",
      "keywords": [
        "socio-economic"
      ],
      "languages": [],
      "publisher": "",
      "spatial": "POLYGON((96.81 -43.75000004080834, 159.11000000000004 -43.75000004080834, 159.11000000000004 -9.140000000990348, 96.81 -9.140000000990348, 96.81 -43.75000004080834))",
      "themes": [],
      "title": "ABS - Index of Household Advantage and Disadvantage (IHAD) (LGA) 2016"
    },
    "source": {
      "id": "aurin",
      "name": "Australian Urban Research Infrastructure Network",
      "type": "csw-dataset",
      "url": "https://openapi.aurin.org.au/public/csw?service=CSW&version=2.0.2&request=GetRecordById&elementsetname=full&outputschema=http%3A%2F%2Fwww.isotc211.org%2F2005%2Fgmd&typeNames=gmd%3AMD_Metadata&id=aurin%3Adatasource-AU_Govt_ABS-UoM_AURIN_DB_3_abs_ihad_lga_2016"
    }
  },
  "id": "ds-aurin-aurin:datasource-AU_Govt_ABS-UoM_AURIN_DB_3_abs_ihad_lga_2016",
  "name": "ABS - Index of Household Advantage and Disadvantage (IHAD) (LGA) 2016",
  "sourceTag": "60eda22a-11ff-4ae9-9def-0f12bef8f179",
  "tenantId": 0
}

In addition, if optionalAspect=dataset-distributions is added to the above query, every accessURL value has a double slash after the hostname:

accessURL: "https://openapi.aurin.org.au//public/wfs?request=getFeature&version=1.0.0...
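A minimal sketch of how the extra slash could be collapsed before an accessURL is stored, assuming the connector is free to normalise URLs it receives from the source. The helper name `collapsePathSlashes` is hypothetical and is not part of the connector's actual code; it only rewrites the path, because a query string may legitimately contain "//".

```typescript
// Hypothetical helper, for illustration only (not the connector's real API).
// Collapses runs of "/" in the path part of an absolute URL and leaves the
// scheme, host, query string and fragment untouched.
function collapsePathSlashes(rawUrl: string): string {
    // Group 1: scheme + "://" + authority, group 2: path, group 3: query/fragment.
    const match = /^([a-z][a-z0-9+.-]*:\/\/[^/?#]*)([^?#]*)(.*)$/i.exec(rawUrl);
    if (!match) {
        // Not an absolute URL we recognise: return it unchanged.
        return rawUrl;
    }
    const [, origin, path, rest] = match;
    return origin + path.replace(/\/{2,}/g, "/") + rest;
}

const fixedUrl = collapsePathSlashes(
    "https://openapi.aurin.org.au//public/wfs?request=getFeature&version=1.0.0"
);
console.log(fixedUrl);
```

Working on the path only keeps values such as embedded URLs in query parameters intact.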

Expected behavior

  • Every dataset should have a publisher, because the organisation search API relies on the publisher property being present.
  • Each accessURL should be correct (without the extra slash).

@mwu2018
1> The original response already has the double forward slashes, e.g.:
https://openapi.aurin.org.au//public/wfs?request=getFeature&version=1.0.0&outputFormat=shape-zip&typename=aurin:datasource-ACT_Govt_CMTEDD-UoM_AURIN_DB_seifi_irsd_10_groups_2006

2> The service endpoint doesn't seem to provide useful publisher info.
Considering we now allow extra aspect info to be attached to every aspect, we should probably set the publisher to aurin for all datasets from this connector via the connector Helm config:

https://github.com/magda-io/magda/blob/57d5c8a98086fd9eebe9cd508cbc99ee3e10fec9/magda-typescript-common/src/JsonConnector.ts#L729

Item 2 should be fixed in the next DGA deployment via a config update.
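
The effect proposed in item 2, filling in a default publisher only when the source provides none, could look roughly like the sketch below. This is not the JsonConnector implementation and it does not show the actual Helm config keys; the function name `withDefaultPublisher` and the default value passed in are assumptions made for illustration.

```typescript
// Hypothetical sketch: apply a connector-wide default publisher to a
// dcat-dataset-strings style aspect when its own publisher is missing or blank.
function withDefaultPublisher<T extends { publisher?: string }>(
    aspect: T,
    defaultPublisher: string
): T {
    const current = (aspect.publisher ?? "").trim();
    // Keep a publisher supplied by the source; only fill the gap.
    return current === "" ? { ...aspect, publisher: defaultPublisher } : aspect;
}

const harvested = {
    title: "ABS - Index of Household Advantage and Disadvantage (IHAD) (LGA) 2016",
    publisher: ""
};
const patched = withDefaultPublisher(harvested, "aurin");
console.log(patched.publisher);
```

Filling rather than overwriting means a dataset that does carry its own publisher keeps it.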