QUTlib/citation-import

Improve response handling

Closed this issue · 2 comments

If a query is not well-formed, Scopus returns an HTTP status code 400 or 500. After 4 tries, the citation import is cancelled (the die statement in get_epdata) and the remaining eprints are skipped, so they may never get updated.

Error handling should be improved:
400 --> check reason, continue
500 --> check reason, continue
429 --> Scopus quota exceeded (20000/week for Scopus search API), Scopus sends a valid XML answer --> die
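
A rough sketch of the handling proposed above, for illustration only. It is written in Python rather than the plugin's own language (presumably Perl, given the die statement), and every name in it (`classify_status`, `run_import`, `QuotaExceeded`, the `fetch` callable) is hypothetical and not taken from the citation-import code. Treating any other non-200 status like 400/500 is also an assumption; the issue only covers 400, 500 and 429.

```python
from enum import Enum


class Action(Enum):
    PROCESS = "process"  # usable response, go on and parse it
    SKIP = "skip"        # log the reason, continue with the next eprint
    ABORT = "abort"      # stop the whole import run


class QuotaExceeded(Exception):
    """Raised when the API signals that the quota is used up (429)."""


def classify_status(status_code):
    """Map an HTTP status code to the action proposed in this issue."""
    if status_code == 200:
        return Action.PROCESS
    if status_code == 429:
        # Quota exceeded: retrying or continuing cannot succeed.
        return Action.ABORT
    # 400 and 500 usually point at one malformed query, so only that
    # eprint is skipped. Assumption: other codes are handled the same way.
    return Action.SKIP


def run_import(eprint_ids, fetch):
    """Run the import; fetch(eprint_id) must return (status_code, reason).

    `fetch` is a hypothetical stand-in for the real Scopus request.
    Returns two lists: ids that were processed and ids that were skipped.
    """
    updated, skipped = [], []
    for eprint_id in eprint_ids:
        status, reason = fetch(eprint_id)
        action = classify_status(status)
        if action is Action.ABORT:
            raise QuotaExceeded(
                f"quota exceeded at eprint {eprint_id}: {status} {reason}"
            )
        if action is Action.SKIP:
            print(f"Warning: eprint {eprint_id} skipped: {status} {reason}")
            skipped.append(eprint_id)
            continue
        updated.append(eprint_id)
    return updated, skipped
```

The point of the split is that one bad query no longer blocks the eprints after it, while a 429 still ends the run immediately instead of waiting 900 seconds four times over.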

Examples:
Warning: Unable to retrieve data from Scopus. The response was: 400 Bad RequestWaiting 900 seconds before trying again.
Warning: Unable to retrieve data from Scopus. The response was: 500 Internal Server ErrorWaiting 900 seconds before trying again (may be due to a malformed request)
Warning: Unable to retrieve data from Scopus. The response was: 429 UnknownWaiting 900 seconds before trying again.

That's a good call. I haven't investigated the error handling code in detail and I think it's a good candidate for improvement.

I think this is fixed now. There were definitely some issues in the code (I have vague recollections of them being broken, many years ago) which I've bashed out. Please re-open if it's not.