POST /pub/validate method crashes with 500 error if datastore-services is down
simon-20 opened this issue · 2 comments
Brief Description
If Solr goes down, POST requests to /pub/validate fail with a 500 error and return no useful information.
Severity
Medium-High
Issue Location
Steps to Reproduce
Easiest:
Run the Validator locally, with datastore-services also running locally, and point datastore-services at a non-existent Solr URL.
Expected Results/Behaviour
At a minimum it seems the validation call should succeed and produce a useful validation report.
There is a question of how to handle advisories if the Datastore is down and so cannot be queried.
We can't simply leave the advisory information out completely, because there are counts of advisories that are included, and tooling now expects them to be there.
If we report the count as 0, that may falsely suggest there are no advisories, but it may still be the best option. Introducing a flag to indicate whether advisories were fully tested for would add a lot of complexity for all the downstream tools.
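To make the trade-off concrete, here is a rough Python sketch of the two options. Everything in it is hypothetical and not taken from the Validator's code or report schema: the `build_report` function, the `DatastoreUnavailable` exception, the `summary` keys, the advisory message, and the `advisoriesChecked` flag are all made up for illustration.

```python
from typing import Callable, Iterable


class DatastoreUnavailable(Exception):
    """Hypothetical error: the Datastore (or Solr behind it) cannot be queried."""


def build_report(
    errors: list[str],
    identifiers: Iterable[str],
    identifier_exists: Callable[[str], bool],
    include_flag: bool = False,
) -> dict:
    """Build a validation report even when advisory checks cannot run.

    identifier_exists stands in for the Datastore existence check and is
    assumed to raise DatastoreUnavailable when the Datastore is down.
    """
    advisories: list[str] = []
    advisories_checked = True
    try:
        for identifier in identifiers:
            if not identifier_exists(identifier):
                advisories.append(f"identifier not found in Datastore: {identifier}")
    except DatastoreUnavailable:
        # Drop partial results so the count is never a misleading subset.
        advisories = []
        advisories_checked = False

    report = {
        # The advisory count is always present, because tooling expects it.
        "summary": {"error": len(errors), "advisory": len(advisories)},
        "errors": errors,
        "advisories": advisories,
    }
    if include_flag:
        # Option 2: tell consumers whether advisories were actually tested for.
        report["advisoriesChecked"] = advisories_checked
    return report
```

With `include_flag=False`, a Datastore outage is indistinguishable from "no advisories". With `include_flag=True`, every downstream consumer has to learn to read the extra field.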
@robredpath, what do you think?
Actual Results/Behaviour
If Solr is down, the Datastore API call that checks whether identifiers exist returns a 500, and this crashes the /pub/validate request.
My instinct is that 500 is probably the desired behaviour. If our systems hit some kind of internal error while trying to serve a response (even in a part of the request that the user might not care about), then I think it's best to let the requesting system know by means of an HTTP response code.
We should try to make the response body meaningful, though. And if this is a regular occurrence, we should look at the datastore's reliability and/or at some kind of caching or other way to mitigate datastore downtime internally.
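As a rough illustration of "500, but with a meaningful body", something along these lines might work. This is only a sketch: `validate_response`, `DatastoreUnavailable`, and the JSON field names are hypothetical and do not reflect the Validator's actual handler or response format.

```python
import json
from typing import Callable


class DatastoreUnavailable(Exception):
    """Hypothetical error: the Datastore (or Solr behind it) cannot be queried."""


def validate_response(run_validation: Callable[[], dict]) -> tuple[int, str]:
    """Return (status_code, json_body) for a validate request.

    run_validation stands in for the real validation routine and is assumed
    to raise DatastoreUnavailable when the identifier lookup fails.
    """
    try:
        report = run_validation()
    except DatastoreUnavailable as exc:
        # Still a 500, but the caller learns which dependency failed and why.
        body = {
            "error": "datastore_unavailable",
            "message": (
                "Advisory checks could not be completed because "
                "the Datastore could not be queried."
            ),
            "detail": str(exc),
        }
        return 500, json.dumps(body)
    return 200, json.dumps(report)
```

The requesting system still sees a failure status, but it can tell a dependency outage apart from a problem with the submitted file.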
Okay, let's think it through some more.
The issue with a 500 error is that, even if we provide a helpful message to go along with it, it will stop everything proceeding through the unified pipeline, because if Solr or the Datastore API is down, no validate API call will succeed.
It also means people can't use the manual upload function of the Validator to check arbitrary files.
But perhaps that is the correct behaviour: if advisories are first-class citizens among Validator messages, then when we can't generate advisories, we wouldn't produce a validation report.