Integrate ExLibris Primo via REST API

Question

johan12345 opened this issue 6 years ago · 9 comments

Basic support for ExLibris Primo (#107) has already been implemented using screen scraping.

It seems that Primo also has a REST API that we can use. This becomes especially important as Primo now has a new user interface that our app is not yet compatible with, and some libraries (e.g. TU Berlin) are starting to switch off the old interface.

useful links:
https://developers.exlibrisgroup.com/primo/apis/webservices/rest/pnxs
https://tu-berlin.hosted.exlibrisgroup.com/primo_library/libweb/webservices/rest/v1/configuration/TUB
internal support ticket # 2340333

It seems that VideLibri has already implemented the REST API, so we can look there for usage examples.

johan12345 commented 4 years ago

Ah, good!

Answer 1 · 2020-11-03T15:22:27.000Z

I'm very interested in the new Primo interface, as it's also used by HTW Dresden. See also https://primo.bib.htw-dresden.de/primo_library/libweb/webservices/rest/v1/configuration/49HTW_VU1.

From ExLibris' developer website it seems, however, that you need an account with the Developer Network for each library in order to be able to create an API key:

If you are a developer at a Primo institution, an account has already been created for your institution. Contact your institution’s technical contact person in order to get an invitation to join the institution group. Once you have joined your institution’s group, you will be able to create and edit your institution’s applications.

It seems that VideLibri has already implemented the REST API

At least for HTW Dresden it doesn't work at all (version 2,185): I get an error when trying to log in via VideLibri ("Beim Zugriff auf das Konto ... ist leider ein Fehler aufgetreten: " [in English: "Unfortunately, an error occurred while accessing the account ...: "] and "Die Bibliothek zeigt diese Nachricht auf der Katalogwebseite an: " [in English: "The library displays this message on the catalogue website: "], but there's no information after the colons). Searching doesn't throw errors, but it returns completely unrelated data, apparently from some kind of demo database. I didn't look into it further, but the error message suggests that VideLibri doesn't use an API and instead tries to scrape the website.

Answer 2 · 2020-11-10T20:18:19.000Z

I looked into it, and it turns out that the new Primo interface uses a lot of JavaScript to generate its web pages dynamically by calling the webservices REST API.

Calling the login page with PDS authentication gives us a pds_handle. With this handle we call the logon page, which results in a double redirect. The second redirect gives us a JSESSIONID, and the response from the final URL gives us a loginID. This is then used to query the primo_library/libweb/webservices/rest/v1/loginJwtCache API, which returns the bearer token needed for subsequent API calls (thanks to the HTTP-Header-Live add-on for making this visible).
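The bookkeeping for this login flow can be sketched roughly as follows. This is only an illustration, not the prototype's code: the thread does not say where exactly pds_handle and loginID appear, so reading them from URL query parameters named `pds_handle` and `loginId` is an assumption, as are the example.org URLs and the helper names.

```python
from urllib.parse import urlparse, parse_qs


def first_query_value(url, name):
    """Return the first value of query parameter `name` in `url`, or None."""
    values = parse_qs(urlparse(url).query).get(name)
    return values[0] if values else None


def collect_login_state(handle_url, final_url, cookies):
    """Gather the three values needed before asking for the bearer token.

    Assumption: pds_handle and loginId are query parameters of the URLs
    seen during the flow; a real library may deliver them differently.
    """
    return {
        "pds_handle": first_query_value(handle_url, "pds_handle"),
        "login_id": first_query_value(final_url, "loginId"),
        "jsessionid": cookies.get("JSESSIONID"),
    }


# Hypothetical URLs and cookie value, for illustration only.
state = collect_login_state(
    "https://primo.example.org/pds?func=load-login&pds_handle=12345",
    "https://primo.example.org/primo-explore/search?vid=DEMO&loginId=abc",
    {"JSESSIONID": "0A1B2C"},
)
print(state)
```

Keeping the extraction separate from the HTTP calls makes it easier to swap in library-specific parameter names later.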

I couldn't find any documentation on this API on the internet, but a call to primo_library/libweb/webservices/rest/v1/myaccount returns a JSON object listing useful (all?) endpoints (example for HTW Dresden):

{'counters': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/counters',
  'method': 'GET'},
 'blocks_messages': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/blocks_messages',
  'method': 'GET'},
 'loans': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/loans',
  'method': 'GET'},
 'personal_settings': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/personal_settings',
  'method': 'GET'},
 'cancel_requests': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/cancel_requests',
  'method': 'POST'},
 'renew_all_loans': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/renew_all_loans',
  'method': 'POST'},
 'collection': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/collection',
  'method': 'GET'},
 'renew_loan': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/renew_loan',
  'method': 'POST'},
 'requests': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/requests',
  'method': 'GET'},
 'beaconO22': '2',
 'fines': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/fines',
  'method': 'GET'},
 'renew_selected_loans': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/renew_selected_loans',
  'method': 'POST'},
 'update_personal_settings': {'path': '/primo_library/libweb/webservices/rest/v1/myaccount/update_personal_settings',
  'method': 'POST'}}
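
A small sketch of how such an endpoint index could be consumed: it groups the endpoint names by HTTP method and skips entries that are not endpoint descriptions (such as the beaconO22 string). The function name is made up for illustration, and the sample is a shortened copy of the listing above.

```python
def endpoints_by_method(index):
    """Group endpoint names by HTTP method, skipping non-endpoint entries."""
    grouped = {}
    for name, entry in index.items():
        # Entries like 'beaconO22': '2' are plain strings, not endpoints.
        if not isinstance(entry, dict) or "path" not in entry:
            continue
        grouped.setdefault(entry.get("method", "GET"), []).append(name)
    return {method: sorted(names) for method, names in grouped.items()}


BASE = "/primo_library/libweb/webservices/rest/v1/myaccount"
sample = {
    "loans": {"path": f"{BASE}/loans", "method": "GET"},
    "renew_loan": {"path": f"{BASE}/renew_loan", "method": "POST"},
    "fines": {"path": f"{BASE}/fines", "method": "GET"},
    "beaconO22": "2",
}

print(endpoints_by_method(sample))
# {'GET': ['fines', 'loans'], 'POST': ['renew_loan']}
```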

I made a Python prototype that works just fine for querying account information, getting search fields with translations, and searching. I'd like to convert it into a new libopac API, say PrimoExplore or PrimoNew or something like that. Any ideas for a better name?

All jsons returned by the API contain a beacon field. I tried to read up on the Beacon API but I'm not sure if these beacon fields relate to this API. Is this something we should pay attention to in the libopac API?

Answer 3 · 2020-11-11T18:24:52.000Z

Wow, this sounds great!

Do the catalogue search APIs also work without previous authentication with the user credentials? I think the app has no infrastructure in place so that the API can tell it that login is required for search as well.

Any ideas for a better name?

I think PrimoExplore is fine.

All jsons returned by the API contain a beacon field. I tried to read up on the Beacon API but I'm not sure if these beacon fields relate to this API. Is this something we should pay attention to in the libopac API?

I am also not sure what this field means, I think it's not related to the Beacon API... I would say we can ignore it as long as it works without doing anything with it :)

Answer 4 · 2020-11-11T22:04:13.000Z

Do the catalogue search APIs also work without previous authentication with the user credentials?

Yes, there's a guestJWT you can obtain when not logged in (it's even faster to obtain).
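
Whichever token is used (guest or logged-in), later calls would carry it as a bearer token. A minimal sketch, assuming the token endpoint returns either a bare token or a JSON-quoted string; the thread does not show the response format, so that part is a guess, and the helper name is hypothetical.

```python
import json


def bearer_header(raw_body):
    """Build an Authorization header from a token response body.

    Assumption: the body is either the bare JWT or the JWT as a
    JSON-encoded string (wrapped in double quotes).
    """
    body = raw_body.strip()
    token = json.loads(body) if body.startswith('"') else body
    return {"Authorization": f"Bearer {token}"}


# Placeholder token, not a real JWT.
print(bearer_header('"abc.def.ghi"'))
```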

Answer 5 · 2020-11-13T15:18:45.000Z

A general solution working for all libraries using Primo seems to be rather hard to achieve.

Login works for HTW Dresden, which uses PDS, but other libraries may use different authentication systems, and even those that do use PDS may generate the login page with JavaScript in different ways (even the names of the fields in the POST request seem to vary across libraries). I checked the authentication systems of the following libraries in Germany, Austria and Switzerland (taken from their respective configuration JSONs):

Library	Authentication system
Campus-Bibliothek für Informatik und Mathematik	PDS
Diözese St. Pölten	PDS
FH Campus Wien	ALMA, SAML
FH Technikum Wien	PDS
FU Berlin	SAML
Hochschule Mittweida	ALMA, SAML
HTW Dresden	PDS
HU Berlin	ALMA, SAML
JKU Linz	SAML
KIT Karlsruhe	PDS
ÖNB Wien	LDAP
ORBIS plus	PDS
TU Berlin	ALMA, SAML
TU Wien	PDS
UB Duisburg-Essen	PDS
UB Mannheim	PDS
UB MU Wien	PDS
UB Trier	PDS, SAML
UdK Berlin	ALMA, SAML
ULB Innsbruck	PDS
Uni Wien	PDS
ZB Zürich	PDS

I'm afraid we might need not only library-specific parameters and URLs (which can easily be put in the JSON config file) but also some library-specific code. So I think for the time being HTW Dresden will be the only Primo library supporting account functions.

The same goes for availability of individual copies. From the API we get general availability information (available or not, and in which branch and collection, including call numbers), but we don't get information for individual copies (how many copies there are, their barcodes, and when they will be returned if currently on loan). Again, this information is retrieved by library-specific plugins/JavaScript code (at HTW, for instance, by calls to the ALMA system, at TU Wien by calls to a 'userservices' system, etc.). The PNX (Primo Normalized XML) records include RTA (real-time availability) links, but I can't get them to work. So again I guess I'll implement it with some additional parameters in the JSON config for HTW Dresden only, and maybe later for other libraries.

Answer 6 · 2020-11-15T20:59:20.000Z

Yeah, makes sense to just start with one library then (and clearly separate the library-specific parts in the code). Sad that it's so different, but not entirely unexpected if Primo is used as a discovery frontend for different complex backends :/

Answer 7 · 2020-11-15T21:02:05.000Z

Yep. Maybe even start with a base class that implements the basic API without account login and detailed availability data for all libraries, and then a subclass that implements special cases for account and availability for HTW Dresden.
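
Roughly, that split could look like the sketch below. It is written in Python only to match the prototype mentioned earlier; the class names, the `login_field_names` config key and the field names are all hypothetical, not taken from libopac or from any library's real login form.

```python
class PrimoExplore:
    """Base class: features expected to work for all Primo libraries."""

    supports_account = False

    def __init__(self, config):
        self.config = config

    def login_fields(self, username, password):
        # Login differs per library, so the base class refuses to guess.
        raise NotImplementedError("account login is library-specific")


class PrimoExploreHtwDresden(PrimoExplore):
    """Subclass holding the HTW Dresden special cases (PDS login)."""

    supports_account = True

    def login_fields(self, username, password):
        # POST field names vary across libraries, so read them from config.
        names = self.config["login_field_names"]
        return {names["username"]: username, names["password"]: password}


# Hypothetical config; real field names would come from the JSON config file.
htw = PrimoExploreHtwDresden(
    {"login_field_names": {"username": "user_field", "password": "pass_field"}}
)
print(htw.login_fields("alice", "secret"))
```

Keeping the field names in the config means another PDS library might only need a new config entry, while a SAML library would get its own subclass.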

Answer 8 · 2020-11-16T21:35:43.000Z

@johan12345 : yes, this sounds good