Bug: protocol spec violation with date ranges
Currently, `OAICompileRequest.validateDates()` checks if `from` is after `until`. This is a clear protocol violation!
2.7.1 Selective Harvesting and Datestamps
Harvesters may use datestamps to harvest only those records that were created, deleted, or modified within a specified date range. To specify datestamp-based selective harvesting, datestamps are included as values of the optional arguments, from and until, in the ListRecords and ListIdentifiers requests. Harvesting is restricted to the range specified by the from and until arguments, extending back to the earliest datestamp if from is omitted, and forward to the most recent datestamp if until is omitted. Range limits are inclusive: from specifies a bound that must be interpreted as "greater than or equal to", until specifies a bound that must be interpreted as "less than or equal to". Therefore, the from argument must be less than or equal to the until argument. Otherwise, a repository must issue a badArgument error.
Repositories must support selective harvesting with the from and until arguments expressed at day granularity. Optional support for seconds granularity is indicated in the response to the Identify request. The value of datestamps in both requests and responses must comply to the specifications for UTCdatetime in this document. A repository must update the datestamp of a record if a change occurs, the result of which would be a change to the metadata part of the XML-encoding of the record. Such changes include, but are not limited to, changes to the metadata of the record, changes to the metadata format of the record, introduction of a new metadata format, termination of support for a metadata format, etc.
This is especially a problem with day granularity.
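A minimal sketch of the inclusive comparison the quoted section asks for, assuming both bounds are already parsed to `java.time.LocalDate`. The class and method names are hypothetical and not part of the project. With day granularity, harvesting a single day means `from` equals `until`, so any check stricter than "less than or equal to" would reject a valid request.

```java
import java.time.LocalDate;

// Hypothetical helper, not part of the project: illustrates the inclusive range rule.
final class InclusiveRangeCheck {
    private InclusiveRangeCheck() {}

    /** Returns true when from <= until, as section 2.7.1 requires. */
    static boolean isValidRange(LocalDate from, LocalDate until) {
        // Equal bounds are legal: they select exactly one day.
        return !from.isAfter(until);
    }
}
```

A caller would answer with `badArgument` only when this method returns `false`.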
It also seemed not to check that both dates use the same granularity (it does, by comparing the string lengths):
3.3.1 UTCdatetime in Protocol Requests
Datestamps used as values of the optional arguments from and until in the ListIdentifiers and ListRecords requests are encoded using ISO8601 and are expressed in UTC. These arguments are used to specify datestamp-based selective harvesting. These arguments support the "Complete date" and the "Complete date plus hours, minutes and seconds" granularities defined in ISO8601. The legitimate formats are YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ. Both arguments must have the same granularity. All repositories must support YYYY-MM-DD. A repository that supports YYYY-MM-DDThh:mm:ssZ should indicate so in the Identify response. A request by a harvester with finer granularity than that supported by a repository must produce an error.
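The same-granularity rule quoted above could also be checked by actually parsing the two legitimate formats rather than relying on string length alone. The following is only a sketch: the `Granularity` enum and the `GranularityCheck` class are assumptions made for illustration, not existing project types.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical types for illustration only.
enum Granularity { DAY, SECOND }

final class GranularityCheck {
    private GranularityCheck() {}

    /** Detects which of the two legitimate formats a value uses, or throws. */
    static Granularity detect(String value) {
        try {
            if (value.length() == 10) { // YYYY-MM-DD
                LocalDate.parse(value);
                return Granularity.DAY;
            }
            if (value.length() == 20 && value.endsWith("Z")) { // YYYY-MM-DDThh:mm:ssZ
                Instant.parse(value);
                return Granularity.SECOND;
            }
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("badArgument: malformed datestamp " + value, e);
        }
        throw new IllegalArgumentException("badArgument: unsupported datestamp format " + value);
    }

    /** Both arguments must have the same granularity (section 3.3.1). */
    static boolean sameGranularity(String from, String until) {
        return detect(from) == detect(until);
    }
}
```

Parsing also catches values that have the right length but are not valid dates, such as a month of 13.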
The timestamps are also only validated if both `from` and `until` are present, which is incorrect (see first quote above). It's debatable whether `until` should default to `now()`, as no future dates are possible. At the very least, both `from` and `until` should be checked not to lie in the future.
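One way to validate even when only one bound is given is to resolve the missing bound first and then always run the checks. The sketch below assumes `until` defaults to the current time (the debatable choice mentioned above) and that `from` defaults to the repository's earliest datestamp, as the first quote describes. `BoundDefaults`, `Range`, and `resolve` are hypothetical names; a `Clock` is passed in only to keep the example testable.

```java
import java.time.Clock;
import java.time.Instant;
import java.util.Optional;

// Hypothetical sketch: names are illustrative, not the project's API.
final class BoundDefaults {
    private BoundDefaults() {}

    /** A resolved range in which both bounds are always present. */
    static final class Range {
        final Instant from;
        final Instant until;

        Range(Instant from, Instant until) {
            this.from = from;
            this.until = until;
        }
    }

    static Range resolve(Optional<Instant> from, Optional<Instant> until,
                         Instant earliest, Clock clock) {
        Instant now = clock.instant();
        Instant f = from.orElse(earliest); // spec: extend back to the earliest datestamp
        Instant u = until.orElse(now);     // assumption: default to "now"
        // The checks run regardless of which bounds the harvester supplied.
        if (f.isAfter(now) || u.isAfter(now)) {
            throw new IllegalArgumentException("badArgument: datestamp lies in the future");
        }
        if (f.isAfter(u)) {
            throw new IllegalArgumentException("badArgument: from is after until");
        }
        return new Range(f, u);
    }
}
```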
Also, the request MUST use the granularity given within the `Configuration` and complain about others. Plus, the configuration contains an "earliest date", which `from` may not surpass.
- Check `from` against earliest date from Config
- Check `from` and `until` are not in the future
- Check `from` is before or equal to `until`
- Check `from` and `until` use the granularity given in Configuration
- Optional: check `from` and `until` use the same granularity (already done)
- Make sure `until` is `atEndOfDay()` when using "day" granularity
- Default `until` to now/this day
- Ensure that `from` and `until` sourced from a resumption token do not circumvent this (they do now, because loading happens after validation)
- Investigate removing the `DateProvider.parse(String)` method to enforce using the configured granularity everywhere
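Three of the items above (granularity, end-of-day `until`, inclusive ordering) can be sketched together. This is an illustration under assumptions, not the project's implementation: `ConfiguredBounds` and its methods are hypothetical, and `secondsSupported` stands in for whatever the configuration exposes. The sketch rejects only datestamps finer than the configured granularity, following the 3.3.1 quote; a stricter reading that demands exactly the configured granularity would tighten the first branch.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeParseException;

// Hypothetical sketch: class, method, and parameter names are illustrative only.
final class ConfiguredBounds {
    private ConfiguredBounds() {}

    /**
     * Parses one bound. A day-granularity until is widened to the last
     * second of that day so the whole day falls inside the range.
     */
    static Instant parseBound(String value, boolean secondsSupported, boolean isUntil) {
        try {
            if (value.length() == 10) { // YYYY-MM-DD, always supported
                LocalDate day = LocalDate.parse(value);
                return isUntil
                        ? day.atTime(23, 59, 59).toInstant(ZoneOffset.UTC)
                        : day.atStartOfDay().toInstant(ZoneOffset.UTC);
            }
            if (value.length() == 20 && secondsSupported) { // YYYY-MM-DDThh:mm:ssZ
                return Instant.parse(value);
            }
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("badArgument: malformed datestamp " + value, e);
        }
        throw new IllegalArgumentException("badArgument: granularity not supported: " + value);
    }

    /** Checks same granularity first, then inclusive ordering; returns {from, until}. */
    static Instant[] validate(String from, String until, boolean secondsSupported) {
        if (from.length() != until.length()) {
            throw new IllegalArgumentException("badArgument: from and until differ in granularity");
        }
        Instant f = parseBound(from, secondsSupported, false);
        Instant u = parseBound(until, secondsSupported, true);
        if (f.isAfter(u)) {
            throw new IllegalArgumentException("badArgument: from is after until");
        }
        return new Instant[] {f, u};
    }
}
```

To cover the resumption token item, the same `validate` call would presumably have to run on the values restored from a token as well, not only on the raw request parameters.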