pelagios/recogito2

Map view failing to display loaded annotations

Closed this issue · 11 comments

We have a document for which the map view is temporarily public at https://recogito.viaeregiae.org:9000/document/sui5bsdm8zb8he/map

The annotations, tags, and places JSON files apparently all load OK, but the markers are not displayed.

There are some unusual characters in the annotations, so I tested them in other documents without any map problems.

Is there anything obviously wrong with what we're doing, and easily fixed?

Hi,

the document shows as containing no geo-located places (i.e. no places with coordinates). If you download the CSV, you'll also see that there are indeed no lat/lons for the place annotations. I can't see any errors on the frontend side. So either there are really no coordinates attached to those places, or something is going wrong in the backend. Do you see any error log output on the server?
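
For anyone wanting to run the same CSV check on a large document without scrolling through the file, here is a minimal sketch. It counts how many rows of an export lack coordinates. The column names `LAT` and `LNG` are assumptions for illustration, so adjust them to whatever headers your download actually contains.

```python
import csv
import io


def count_missing_coordinates(csv_text, lat_col="LAT", lon_col="LNG"):
    """Return (total_rows, rows_without_coordinates) for a CSV export.

    lat_col and lon_col are assumed header names, not confirmed ones.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    total = 0
    missing = 0
    for row in reader:
        total += 1
        lat = (row.get(lat_col) or "").strip()
        lon = (row.get(lon_col) or "").strip()
        # A row counts as missing if either coordinate is absent or blank.
        if not lat or not lon:
            missing += 1
    return total, missing
```

If every row comes back as missing, the export agrees with the empty map, which points at the backend rather than the map frontend.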

Lots! Many of them seem to be Internal Server Errors caused by ElasticSearch queries that include characters like [ and : (we use them a lot in transcriptions).

A java.lang.NumberFormatException error is caused by this match query: "query" : "Afon ~Adda".
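
As an illustration of why characters like `[`, `:` and `~` can trip a search, here is a minimal sketch of escaping them before the text reaches a Lucene-style query-string parser. It assumes the failing searches go through such a parser, which this thread does not confirm, and `escape_query_string` is a hypothetical helper, not part of Recogito.

```python
import re

# Characters that act as operators in Lucene-style query-string syntax.
_RESERVED = re.compile(r'([+\-=!(){}\[\]^"~*?:\\/&|])')


def escape_query_string(text):
    """Backslash-escape query-string operators in free text.

    '<' and '>' cannot be escaped in this syntax, so they are removed.
    """
    text = text.replace("<", "").replace(">", "")
    return _RESERVED.sub(r"\\\1", text)


print(escape_query_string("Afon ~Adda"))  # prints: Afon \~Adda
```

With the tilde escaped, the parser would treat it as a literal character instead of an operator that expects a number after it.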

The JSON-LD download request returns only an enigmatic (0,2) (of class scala.Tuple2$mcII$sp), and generates nothing new in the log.

The log file is 120MB, and I have no idea how far back I'd need to go to find the cause of the map problem. Refreshing the map view page does not generate any new errors. Reassuringly, though, the annotations are all linked to valid gazetteer uris from which the point data could be extracted. Is the system perhaps failing on querying the authority uri records from ElasticSearch?

Hm, yes the query errors from special characters are a known issue. (They break the individual search request, but don't affect the rest of the system.) Two things you could do are

  • check the CSV download for server errors
  • in the annotated document, open an annotation which you know is a place annotation, and see what the popup looks like. If the place is correctly in the gazetteer, the popup should show a map + marker. If anything is wrong with the place or the annotation, you should at least get some kind of error.

Also: can you point me to an annotated page in that document that contains a place annotation? I wasn't able to find one (looks like it's >160 images).

Thanks very much for giving this your attention, Rainer.

  • I'll check the CSV [EDIT: looks fine].
  • All is fine with the popups showing a mapped point.

P.28: Glamorgan contains some annotations (https://recogito.viaeregiae.org:9000/document/sui5bsdm8zb8he/part/44/edit)

Ok, I can't see anything immediately that's going wrong. What I definitely know: there is a query built into the service layer (translated into an ElasticSearch query) that's supposed to return all distinct places on a document. In your case, that query returns an empty list. (It doesn't seem to cause an internal server error. Looks like it really just completes successfully, but without returning any places from the gazetteer.)

The best place to observe this is the map view you linked to above. (In the browser console, you can see the empty JSON response.) The downloads show it too, since the same query is fired into the index before the CSV/JSON-LD etc. response is constructed.

Did you by any chance make any updates to the gazetteer index after that document was created? Add a new gazetteer? Update an existing one? Or remove a gazetteer? I'm asking specifically because this might be some kind of data integrity issue. At the moment that's really just a guess though...

Ah yes, all sorts of gazetteer changes, including ones that might have affected a tagged entry.

I've been writing an ES utility to fetch linked gazetteer points for annotations, so when I've made some progress with that it might produce a list of now-missing gazetteer entries...

My guess is that the entries are definitely there. The query that populates the annotation popup gets a correct response. (It queries by gazetteer URI directly, and gets coordinate + names in return.) There's just something broken in the query that fetches all places for a document. It's one of the darker corners of Recogito's networked gazetteer index, but I think there are some special materialized connections (using ElasticSearch-specific parent/child relationships) that might have become detached during a gazetteer update.

Thanks again, Rainer. I spent the afternoon learning a bit more about ElasticSearch, and wrote a utility to correct all the broken uri and union_id attributes. Looking good once more.

Great - good to hear it works. Did you figure out what the root cause was?

Yes, Rainer, as you suspected it was due to my having changed some authority uris when updating gazetteers. The union_ids are the obscure parent/child links that you mentioned.
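
For anyone hitting the same problem, the check behind the fix can be sketched roughly as below. This is not the actual utility, only an illustration under assumptions: each annotation is taken to be a dict with hypothetical keys `id`, `uri` and `union_id`, and `index_union_ids` is a plain mapping from gazetteer URI to the union_id currently held in the index, however you choose to fetch it.

```python
def find_stale_references(annotations, index_union_ids):
    """List annotations whose gazetteer link no longer matches the index.

    annotations: iterable of dicts with assumed keys 'id', 'uri', 'union_id'.
    index_union_ids: dict mapping gazetteer URI -> current union_id.
    Returns a list of (annotation_id, reason) tuples.
    """
    stale = []
    for ann in annotations:
        current = index_union_ids.get(ann.get("uri"))
        if current is None:
            # The URI was changed or removed during a gazetteer update.
            stale.append((ann["id"], "uri not in index"))
        elif current != ann.get("union_id"):
            # The URI still exists but is linked to a different union record.
            stale.append((ann["id"], "union_id mismatch"))
    return stale
```

An empty result would suggest the links are intact; anything listed is a candidate for having its uri or union_id corrected.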