IQSS/dataverse

Unconsidered harvesting granularity


Dataverse does not take into account the harvesting granularity supported by the remote repository when harvesting over the OAI-PMH protocol.
It always uses the finest granularity, YYYY-MM-DDThh:mm:ssZ.

According to the specification (https://www.openarchives.org/OAI/openarchivesprotocol.html#Dates):

The legitimate formats are YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ. Both arguments must have the same granularity. All repositories must support YYYY-MM-DD. A repository that supports YYYY-MM-DDThh:mm:ssZ should indicate so in the Identify response. A request by a harvester with finer granularity than that supported by a repository must produce an error.

Examples:

https://dataverse.ird.fr/oai?verb=Identify <granularity>YYYY-MM-DDThh:mm:ssZ</granularity>
https://dataverse.ird.fr/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07T14%3A40%3A49Z OK
https://dataverse.ird.fr/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07 OK

https://api.nakala.fr/oai2?verb=Identify <granularity>YYYY-MM-DD</granularity>
https://api.nakala.fr/oai2?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07 OK
https://api.nakala.fr/oai2?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07T14%3A40%3A49Z Error code badArgument, which is the correct response per the specification. However, this is the request that Dataverse sends.

What should be done?

Send an Identify request to learn the granularity the repository supports before sending any request with from or until arguments.
Also, YYYY-MM-DD should be the default, since all repositories must support it.
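
The proposal could look roughly like the sketch below. It is written in Python for brevity and is not Dataverse code: the function names (parse_granularity, format_datestamp, list_records_url) are hypothetical, and fetching the Identify response over HTTP is left out. It only shows reading the granularity element from an Identify response, falling back to YYYY-MM-DD when it is missing or unrecognised, and formatting the from argument accordingly.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DAY = "YYYY-MM-DD"
SECONDS = "YYYY-MM-DDThh:mm:ssZ"


def parse_granularity(identify_xml: str) -> str:
    """Return the granularity declared in an Identify response.

    Falls back to DAY (which every repository must support) when the
    element is missing or holds an unexpected value.
    """
    root = ET.fromstring(identify_xml)
    element = root.find(f"{OAI_NS}Identify/{OAI_NS}granularity")
    value = (element.text or "").strip() if element is not None else ""
    return value if value in (DAY, SECONDS) else DAY


def format_datestamp(moment: datetime, granularity: str = DAY) -> str:
    """Format a timezone-aware datetime as a UTC datestamp of the given granularity."""
    utc = moment.astimezone(timezone.utc)
    if granularity == SECONDS:
        return utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    return utc.strftime("%Y-%m-%d")


def list_records_url(base_url: str, metadata_prefix: str,
                     since: datetime, granularity: str = DAY) -> str:
    """Build a ListRecords URL whose from argument matches the granularity."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": format_datestamp(since, granularity),
    })
    return f"{base_url}?{query}"
```

With a day-granularity repository such as https://api.nakala.fr/oai2, this yields from=2024-11-07 instead of from=2024-11-07T14%3A40%3A49Z. Truncating the timestamp to the day means some records from earlier that day may be returned again, so the harvester would need to tolerate duplicates.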