gbif/watchdog

Dataset from Nordgen (& metadata to be updated)


The dataset from NordGen (https://doi.org/10.15468/3nyx9k) is flagged as "orphaned" -- but it remains regularly updated through a BioCASE data provider.

Participant node

https://registry.gbif.org/node/fa6bdac4-51e2-4334-940c-ba6cdf6e1257
https://www.gbif.org/participant/303
Contact points are correct.

Organization/publisher

https://registry.gbif.org/organization/b9c5f740-34d9-11de-baf5-e00d96b185ef
https://www.gbif.org/publisher/b9c5f740-34d9-11de-baf5-e00d96b185ef
Update contact points to: Kjell-Åke Lundblad kjellake.lundblad@nordgen.org
Add generic contact point: info@nordgen.org
Remove contact points: Dag Endresen and Martin Forsen
(Add more metadata... homepage, logo, etc)

BioCASE installation (looks fine):
https://registry.gbif.org/installation/604f70d4-f762-11e1-a439-00145eb45e9a
https://www.gbif.org/installation/604f70d4-f762-11e1-a439-00145eb45e9a
https://www.nordgen.org/biocase/
(Add contact point? -- Kjell-Åke Lundblad)

Dataset

https://registry.gbif.org/dataset/85a347c0-f762-11e1-a439-00145eb45e9a
https://www.gbif.org/dataset/85a347c0-f762-11e1-a439-00145eb45e9a
Dataset DOI: https://doi.org/10.15468/3nyx9k

Update contact points to: Kjell-Åke Lundblad kjellake.lundblad@nordgen.org
Remove contact points: Dag Endresen and Lars Falk
(Add more metadata... homepage, logo, etc)

BioCASE dataset endpoint:
https://www.nordgen.org/biocase/dsa_info.cgi?dsa=NGB
https://www.nordgen.org/biocase/pywrapper.cgi?dsa=NGB

How to update metadata?

Is it possible to update registry metadata for participant node, publisher organisation, dataset, etc. directly from the BioCASE endpoint?

Alternatively, is it possible for NordGen (Kjell-Åke), GBIF Norway, or GBIF Sweden to get edit access to update the respective metadata for NordGen in the registry?

Homepage: https://www.nordgen.org/en/
Logo URL: https://www.nordgen.org/wp-content/uploads/2020/03/NordGen-Logotype-RGB.svg
See also: https://www.nordgen.org/en/about/press-and-media/logo-and-graphic-design/
Language: English
Address: P.O. Box 41
City: Alnarp
Province: Scania/Skåne
Country: Sweden
Postal code: SE-230 53
Email: info@nordgen.org
Phone: +46 40 536 640
Latitude: 55.65905367460462°N
Longitude: 13.084225828623053°E

Hi Dag,

Is it possible to update registry metadata for participant node, publisher organisation, dataset etc. directly from the BioCASE endpoint?

Yes, it is. This dataset fails, however, since the mapping schema offered by BioCASe is not supported by us. The schema http://digir.net/schema/conceptual/darwin/2003/1.0 is offered, but our crawler doesn't support retrieving DiGIR/DWC occurrences using the BioCASe protocol.

The two supported schemas are http://www.tdwg.org/schemas/abcd/1.2 and http://www.tdwg.org/schemas/abcd/2.06. Is it possible to map the dataset to either of these?
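The schema check described above can be illustrated with a minimal sketch. The function name `crawlable_schemas` is hypothetical and is not part of GBIF's crawler; only the three schema namespaces come from this thread.

```python
# Schema namespaces the GBIF crawler accepts over the BioCASe protocol,
# as listed in this thread.
SUPPORTED_SCHEMAS = {
    "http://www.tdwg.org/schemas/abcd/1.2",
    "http://www.tdwg.org/schemas/abcd/2.06",
}


def crawlable_schemas(offered):
    """Return the offered namespaces the crawler could use, in the given order.

    Hypothetical helper for illustration only.
    """
    return [schema for schema in offered if schema in SUPPORTED_SCHEMAS]


# The NordGen endpoint offered only the DiGIR-style Darwin Core schema,
# so nothing crawlable remains.
offered = ["http://digir.net/schema/conceptual/darwin/2003/1.0"]
print(crawlable_schemas(offered))
```

An empty result corresponds to the situation here: the endpoint responds, but no offered mapping can be harvested.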

Or, if BioCASe supports it, can a Darwin Core Archive be produced from the Darwin Core mapping? (I know it supports this with ABCD mappings.)

CC @ManonGros.

Hi Dag, Matt,

Creating a DwC-Archive is only possible from an ABCD archive, which requires an ABCD mapping. I would be able to assist in creating this from the old DwC mapping, if necessary.

I wonder how the dataset got harvested in the first place until three years ago. Apparently the occurrences were last synced 3 years ago, before the dataset was moved to an orphaned state...

Cheers,
Jörg

I think it must have been crawled with an earlier version of GBIF's systems, perhaps pre-2013, and was broken (not crawling) since around that time.

The crawling and ingestion history in the Registry seems to describe weekly (failed) attempts at indexing the data source, with the last successful indexing made on 2018-03-09?
https://registry.gbif.org/dataset/85a347c0-f762-11e1-a439-00145eb45e9a/ingestion-history
https://registry.gbif.org/dataset/85a347c0-f762-11e1-a439-00145eb45e9a/crawling-history?offset=125

https://api.gbif.org/v1/dataset/85a347c0-f762-11e1-a439-00145eb45e9a/process?limit=25&offset=125
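For anyone who wants to page through the same crawl history programmatically, here is a minimal sketch using only the Python standard library. The URL pattern is the one linked above. The helper names are hypothetical, and the response field names (`results`, `finishReason`) are assumptions about the JSON layout that should be checked against an actual response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://api.gbif.org/v1"
DATASET_KEY = "85a347c0-f762-11e1-a439-00145eb45e9a"


def process_url(dataset_key, limit=25, offset=0):
    """Build the crawl-history URL for one page of results."""
    query = urlencode({"limit": limit, "offset": offset})
    return f"{API_ROOT}/dataset/{dataset_key}/process?{query}"


def fetch_process_page(dataset_key, limit=25, offset=0):
    """Fetch one page of crawl history and return the parsed JSON."""
    with urlopen(process_url(dataset_key, limit, offset), timeout=30) as resp:
        return json.load(resp)


def finish_reasons(page):
    """List the finish reason of each crawl attempt on a page.

    Assumption: the page has a "results" list whose items may carry a
    "finishReason" field; missing values come back as None.
    """
    return [item.get("finishReason") for item in page.get("results", [])]
```

Calling `finish_reasons(fetch_process_page(DATASET_KEY, limit=25, offset=125))` would then summarise the page linked above. It needs network access.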

That crawl attempt 1 is from the orphan dataset server, so the conversion to an orphan dataset was the first time the 2013– system saw something valid for it to crawl. All subsequent crawls have been "Not modified", as the orphan dataset archive doesn't change.

Kjell-Åke has set up a new BioCASe installation and re-created the mappings and archives.
Details in his mail below:

From: Kjell-Åke Lundblad kjellake.lundblad@nordgen.org
Sent: Tuesday, 11 May 2021 11:10
To: Holetschek, Jörg J.Holetschek@bgbm.org
Cc: Anders Telenius anders.telenius@nrm.se; mblissett@gbif.org; Marie Grosjean mgrosjean@gbif.org; Dag Endresen dag.endresen@gmail.com
Subject: Re: [gbif/watchdog] Dataset from Nordgen (& metadata to be updated) (#33)

Thanks Jörg,

I had already tried clicking the link to cancel, but deleting the *.proc file is what solved my issue.
The archiving is working now.
We now have only one step left: moving everything to a new server. I will inform you when we have moved.

The current "GBIF-node-address" is https://www.nordgen.org/biocase/pywrapper.cgi?dsa=genbisNGB.
The new one will probably be https://biocase.nordgen.org/pywrapper.cgi?dsa=genbisSWE054.

This is because we decided that www.nordgen.org should only host the WordPress CMS,
and because we will also host GBIF client nodes for other gene banks, such as the Estonian Crop Research Institute (ECRI). Their address will probably be something like:
https://biocase.nordgen.org/pywrapper.cgi?dsa=genbisEST019

Since I have already worked out a couple of data views to collect the necessary data from our Grin-Global system, I can use those data views as templates to set up nodes for the other gene banks we host in Genbis (the Nordic and Baltic gene banks' instance of an adapted Grin-Global system).

It looks like the archiving process is finished.

For now, you should be able to reach us on the address:
https://www.nordgen.org/biocase/pywrapper.cgi?dsa=genbisNGB

Best regards

Kjell-Åke Lundblad

We updated the endpoint, the dataset is crawling right now.

Regarding Kjell-Åke's questions on the ABCD 2.06 mapping: before ABCD 2.06 was released, Walter and Helmut made the mapping between MCPD and ABCD in this document. It may be useful for understanding the intentions.