opacapp/opacclient

SLUB API - pending items

Opened this issue · 11 comments

The API works (catalog and account features) but there are still some TODOs to be completed from #553:

  • get status: low priority due to the limited usefulness of the status as in many many cases it just says "click for more information about availability" (in the online web catalogue), so you'll have to open the details page anyway.
    Implementation idea: use asyncGet and then put all the CompletableFutures constructed into a list before calling get() on them (thanks to @johan12345 here and here)
    --> to be put off until #607 is implemented

  • references: libero links must be resolved to rsn entries in current SLUB catalog and they then in turn to id entries (similar to next below)
    Resolved by #613.

  • id of reserved items bc entries to be resolved to id entries by intercepting redirection (modify getResultById to take a prefixed id and resolve this id to an id entry if the prefix is not "id". Adopt getShareUrl, parseSearchResults, parseResultById and parseAccountData accordingly.)
    Resolved by #598.

  • id of lent items bc entries to be resolved to id entries by intercepting redirection (same as above)
    Resolved by #598.

  • actual ready date of stack request: works but could be refined, need more examples to examine

  • inter-library loans renewability: need example (X_is_flrenewable field exists in json but not sure if it's actually used/properly set)
    Resolved by #611.

  • getSupportedLanguages and setLanguage: low priority as all information concerning inter-library loans, most other account information and most of the captions/titles for links remain in German even in the English version of the online catalog

Regarding item 1 (get status)

I implemented it with completeable futures and I guess it's done correctly (also verifed by this small python script), but the result is prohibitively slow.

    internal fun parseSearchResults(json: JSONObject): SearchRequestResult {
        val availFutures = arrayListOf<CompletableFuture<Response>>()
        val searchresults = json.getJSONArray("docs").map<JSONObject, SearchResult> {
            SearchResult().apply {
                innerhtml = "<b>${it.getString("title")}</b><br>${it.getJSONArray("author").optString(0)}"
                it.getString("creationDate").run {
                    if (this != "null") {
                        innerhtml += "<br>(${this})"
                    }
                }
                type = mediaTypes[it.getJSONArray("format").optString(0)]
                        ?: SearchResult.MediaType.NONE
                id = "id/${it.getString("id")}".also {
                    availFutures.add(
                            asyncGet("$baseurl/$it/?type=1369315139&tx_find_find[data-format]=json-holding-status&tx_find_find[format]=data",
                                    true).whenComplete { response: Response?, _: Throwable? ->
                                if (response != null) {
                                    status = when (try {
                                        JSONObject(response.body()?.string()).getInt("status")
                                    } catch (e: Exception) {
                                        0
                                    }) {
                                        1 -> SearchResult.Status.GREEN
                                        2 -> SearchResult.Status.YELLOW
                                        3 -> SearchResult.Status.RED
                                        else -> SearchResult.Status.UNKNOWN
                                    }
                                } else  {
                                    status = SearchResult.Status.UNKNOWN
                                }
                            })
                }
            }
        }
        CompletableFuture.allOf(*availFutures.toTypedArray()).join()
        return SearchRequestResult(searchresults, json.getInt("numFound"), 1)
    }

Even with 1000 Mbps link speed it takes almost 10 seconds for the search results to show up. It seems that there's nothing we can do to speed this up as the web catalog needs roughly the same time as the app:

Web catalog: (20 status queries)
get_status_web_timings

App: (the search query itself followed by the 20 status queries)
get_status_app_timings

The queries appear to be blocked on the server side. For the web catalog, this looks OK as you see the search results after 1-2 seconds and then the pending status icons get updated to , or within the next 8 seconds (see example). The app, however, waits for all the status queries be completed and only then shows the results including status markers.

As far as I understand it's not possible in the app to update the status markers after the results are shown. If so I don't want to include the availability status in the app as it simply takes way too long to get the search results.

Regarding item 2 (references)

References are mostly links to other SLUB catalog entries, typically references to differently named preceding or following editions of periodical items or references to online/print editions of print/online items. Right now we just show the text without links as links would open the web catalog in the browser:

json.getJSONArray("references").forEach<JSONObject> {
// TODO: usually links to old SLUB catalogue, does it make sense to add the link?
addDetail(Detail(it.optString("text"), "${it.optString("name")} (${it.optString("target")})"))
}

Example: https://katalog.slub-dresden.de/id/0-130446319/#detail:
Web catalog: grafik

JSON from API:

"references": [
{
"text": "Vorgänger",
"link": "http://slubdd.de/katalog?libero_mab21364124",
"name": "August-Thyssen-Hütte: Thyssen-Forschung",
"target": "SLUB"
}

Display in app: grafik

These libero links can be resolved to rsn links and these in turn to id links, i.e. we could get a DetailedItem from them. Is there any means in the app to show this DetailedItem in a new detailed item view in the app? If not I'd remove this TODO.

Your implementation for the status looks correct to me.

As far as I understand it's not possible in the app to update the status markers after the results are shown. If so I don't want to include the availability status in the app as it simply takes way too long to get the search results.

Yes, unfortunately this is not possible yet. We have also thought about that, as it would also be helpful to speed up some of the other APIs as well. I have put some ideas regarding this into issue #607.

Is there any means in the app to show this DetailedItem in a new detailed item view in the app?

Well, it might be possible to put these references into the DetailedItem as volumes, so that they can link to other SLUB catalog pages. Do you think that would be helpful? In that case, the links of course need to be resolved into rsn links first, or getResultById might be adapted to also support slubdd.de links.

Well, it might be possible to put these references into the DetailedItem as volumes, so that they can link to other SLUB catalog pages.

Yes, this could be a workaround. The only drawback is that the references will appear under the subheading "Volumes" or "Enthaltene Werke" (from SearchResultDetailFragment.java I see that we can have both volumes and copies at the same time, which sounds a bit illogical in general but comes in handy in this case).
On the other hand it seems relatively easy to add another subheading "References" or "Verweise" to SearchResultDetailFragment, set up a ReferencesAdapter and add a referencedItems list to DetailedItem.
Maybe SLUB is (currently) the only API to use referenced items but if you find this a good idea I could make a PR for it. If not I'll use the volumes workaround (maybe we could rename the volumes subheading to "Volumes / References" or "Enthaltene Werke / Verweise" in this case). So it's up to you, I'm happy with both solutions.

Hm, both would solutions be fine for me. @raphaelm, any preference?

I'm not a huge fan on adding another list to DetailedItem, since it complicates both the API surface as well as the user interface implementation significantly.

Given the examples @StefRe showed, I think using Detail instances for this is not that wrong, Vorgänger: August-Thyssen-Hütte: Thyssen-Forschung as a detail doesn't look like a mis-use of the details for me. The only problem is that the link opens in the browser, not in the app, so maybe we should approach the problem there?

One possible solution would be to extend the Detail class with a new property, either a boolean specifying that the links in the HTML content are "internal", or new property for the link target stored outside of HTML. In either case, getDetailById would need to be able to take an URL as a parameter instead of an ID, or we add a getDetailByUrl.

Thinking this further, we could even add a boolean OpacApi.isDetailUrl(url) method that we run every time a user clicks any link, and if it returns true, we open the link as a result detail in-app. Extending that, an boolean OpacApi.isSearchUrl(url) could then be implemented to make e.g. linked author names or category names work, as can be seen in many OPACs.

Instead of adding interface complexity for something that (so far) only SLUB requires, we'd add a similar amount of complexity but get a positive effect almost everywhere.

@StefRe @johan12345 What are your thoughts on this approach?

Thinking this further, we could even add a boolean OpacApi.isDetailUrl(url) method that we run every time a user clicks any link, and if it returns true, we open the link as a result detail in-app. Extending that, an boolean OpacApi.isSearchUrl(url) could then be implemented to make e.g. linked author names or category names work, as can be seen in many OPACs.

Hm, indeed this sounds like a good and more general solution. It would also be possible to simply add an item ID as an additional property of the Detail class, but the abovementhoed approach has the additional advantage that it would work even if there are multiple links within one detail.

I think using Detail instances for this is not that wrong

yes, all (SLUB target) references are references to details, see 1st paragraph of this comment. In fact, SLUB terminology differentiates between links which refer to external resources and references which refer to SLUB catalog items.

One possible solution would be to extend the Detail class with a new property, either a boolean specifying that the links in the HTML content are "internal", or new property for the link target stored outside of HTML....
Thinking this further, we could even add a boolean OpacApi.isDetailUrl(url) method that we run every time a user clicks any link...

This sounds great but I guess it'll be a rather big change in the frontend code. So for the time being I'm going to use the volumes workaround. Maybe it makes sense to file a separate issue for this proposed new link handling to keep things clear.

Btw, I noticed that there's a boolean html property in Details that seems to be used nowhere.

I implemented getResultById for libero links (see above) by getting the final url from a redirecting HEAD request like httpHead("$id",false).request().url().toString(). This throws the following error:

java.net.ProtocolException: Too many follow-up requests: 21
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:176)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
    at okhttp3.RealCall.execute(RealCall.java:93)
    at de.geeksfactory.opacclient.apis.OkHttpBaseApi.httpHead(OkHttpBaseApi.java:294)
    at de.geeksfactory.opacclient.apis.SLUB.getResultById(SLUB.kt:180)
 ... etc.

The reason for this is that cleanUrl assumes the query component to be exclusively in the form of key=value pairs and adds an extra = after the key if no value is present:

} else {
params.add(new BasicNameValuePair(URLDecoder.decode(
kv[0], "UTF-8"), ""));
}

According to RFC3986 this is by no means obligatory. An example of a SLUB reference is http://slubdd.de/katalog?libero_mab21364775. It gets redirected as follows:

http://slubdd.de/katalog/?libero_mab21364775
http://primoproxybeta.slub-dresden.de/cgi-bin/permalink.pl?libero_mab21364775
http://katalogbeta.slub-dresden.de/rsn/1364775
https://katalogbeta.slub-dresden.de/rsn/1364775
https://katalog.slub-dresden.de/rsn/1364775
https://katalog.slub-dresden.de/rsn/1364775/
https://katalog.slub-dresden.de/id/0-130446319/

The additional = at the end leads to endless redirects at the second url.

@raphaelm

(EDIT: the following is no longer valid, see next comment below...
I think the best solution is to check between the following two lines whether the string pair contains any = and do the splitting into kv parts only if this is the case:
Do you see any principal issues with this approach (I'm just asking because test coverage is rather sparse so maybe you can identify potential issues by your experience)?)

Further I noticed that the test for cleanUrl

public void cleanUrlShouldHandleMultipleEqualsSigns() throws Exception {

is in ApacheBaseApiTest.java rather than in BaseApiTest.java (cleanUrl is a method of BaseApi and not ApacheBaseApi). I guess this is for historical reasons. Is it OK if I move this test to its proper class?

@raphaelm
Sorry to have bothered you with this - I found the bug: the second parameter must be null instead of "" here:

params.add(new BasicNameValuePair(URLDecoder.decode(
kv[0], "UTF-8"), ""));

I wrote a test for it and fixed it. So the only remaining question is if the tests for cleanUrl should be moved to a BaseApiTest class?

Ah, nice! Probably something we've just never encountered before.

I wrote a test for it and fixed it. So the only remaining question is if the tests for cleanUrl should be moved to a BaseApiTest class?

Sure, go ahead!