Unconsidered harvesting granularity
Dataverse does not respect the harvesting granularity supported by the remote repository when harvesting over the OAI-PMH protocol.
It always uses the finest harvesting granularity, YYYY-MM-DDThh:mm:ssZ.
According to the specification (https://www.openarchives.org/OAI/openarchivesprotocol.html#Dates):
The legitimate formats are YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ. Both arguments must have the same granularity. All repositories must support YYYY-MM-DD. A repository that supports YYYY-MM-DDThh:mm:ssZ should indicate so in the Identify response. A request by a harvester with finer granularity than that supported by a repository must produce an error.
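The rules quoted above can be sketched as a small check that a harvester might run before sending a request. This is only an illustration in Python, not Dataverse code; the names `granularity_of` and `check_selective_harvest_args` are hypothetical.

```python
import re

DAY = "YYYY-MM-DD"
SECONDS = "YYYY-MM-DDThh:mm:ssZ"

# The two date formats the OAI-PMH specification allows.
_PATTERNS = {
    DAY: re.compile(r"\d{4}-\d{2}-\d{2}"),
    SECONDS: re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z"),
}


def granularity_of(value):
    """Return the granularity name matching an OAI-PMH date string."""
    for name, pattern in _PATTERNS.items():
        if pattern.fullmatch(value):
            return name
    raise ValueError(f"not a legitimate OAI-PMH date: {value!r}")


def check_selective_harvest_args(repo_granularity, from_=None, until=None):
    """Raise ValueError if from/until break the rules quoted above."""
    used = {granularity_of(v) for v in (from_, until) if v is not None}
    if len(used) > 1:
        raise ValueError("from and until must have the same granularity")
    if SECONDS in used and repo_granularity == DAY:
        # This is the case a repository answers with badArgument.
        raise ValueError("request is finer than the repository supports")
```

With a repository that only advertises YYYY-MM-DD, `check_selective_harvest_args("YYYY-MM-DD", from_="2024-11-07T14:40:49Z")` raises, while `from_="2024-11-07"` passes.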
Examples:
https://dataverse.ird.fr/oai?verb=Identify <granularity>YYYY-MM-DDThh:mm:ssZ</granularity>
https://dataverse.ird.fr/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07T14%3A40%3A49Z OK
https://dataverse.ird.fr/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07 OK
https://api.nakala.fr/oai2?verb=Identify <granularity>YYYY-MM-DD</granularity>
https://api.nakala.fr/oai2?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07 OK
https://api.nakala.fr/oai2?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07T14%3A40%3A49Z Error Code badArgument
This error is legitimate, since that repository only advertises YYYY-MM-DD granularity. But this is the request sent by Dataverse.
What should be done?
Send an Identify request to get the granularity the repository supports before sending a request with from or until arguments.
Also, YYYY-MM-DD should be the default.
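A minimal sketch of that approach, in Python for illustration only (Dataverse's harvesting client is not written this way, and `granularity_from_identify` and `format_oai_date` are hypothetical names). It assumes the Identify response has already been fetched and that the timestamp is a timezone-aware `datetime`.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DAY = "YYYY-MM-DD"
SECONDS = "YYYY-MM-DDThh:mm:ssZ"


def granularity_from_identify(identify_xml):
    """Read <granularity> from an Identify response.

    Falls back to YYYY-MM-DD, which every repository must support.
    """
    root = ET.fromstring(identify_xml)
    node = root.find(f".//{OAI_NS}granularity")
    if node is None or not node.text:
        return DAY
    return node.text.strip()


def format_oai_date(moment, granularity=DAY):
    """Format a timezone-aware datetime for a from/until argument."""
    utc = moment.astimezone(timezone.utc)
    if granularity == SECONDS:
        return utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    # Any other value falls back to day granularity.
    return utc.strftime("%Y-%m-%d")
```

For a repository whose Identify response reports YYYY-MM-DD, `format_oai_date(last_harvest, granularity_from_identify(response))` yields a day-only value such as 2024-11-07, which the repository accepts.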