pelagios/recogito2

manifest urls not always correctly identified

nicolasfranck opened this issue · 2 comments

When I tried to add a new document using an "IIIF manifest", I noticed that Recogito makes the following assumptions about the URL:

  • the URL ends with "/info.json": it is treated as an info.json from the Image API.
  • the URL ends with "/manifest" or "/manifest.json": it is treated as a manifest.
  • the URL ends with ".json": the URL is fetched and the content is inspected to determine its type.

After identification, the URL is fetched a second time and parsed by the appropriate parser,
as determined above.

Canvas labels also seem to be required (which I can understand).

So a URL like https://adore.ugent.be/IIIF/manifests/archive.ugent.be%3A13119EC2-0C04-11E6-8A87-DC63F264C48D will never work, because it matches none of these patterns.
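
The rules as described above can be paraphrased in a few lines of Python. This is only a sketch of the observed behaviour, not Recogito's actual Scala code; the function name `guess_type_from_url` and the returned labels are made up for illustration.

```python
from typing import Optional


def guess_type_from_url(url: str) -> Optional[str]:
    """Mimic the suffix rules listed above (hypothetical paraphrase)."""
    if url.endswith("/info.json"):
        return "image-info"
    if url.endswith("/manifest") or url.endswith("/manifest.json"):
        return "manifest"
    if url.endswith(".json"):
        # Only in this case is the response body actually inspected.
        return "inspect-content"
    # No rule matches, so the URL is never fetched at all.
    return None


url = ("https://adore.ugent.be/IIIF/manifests/"
       "archive.ugent.be%3A13119EC2-0C04-11E6-8A87-DC63F264C48D")
print(guess_type_from_url(url))  # None
```

The manifest behind that URL may be perfectly valid; it is rejected purely because of how the URL is spelled.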

Couldn't it be done like this?

  • fetch the url
  • identify the type by inspecting its content. Here you can check the @context and other properties, and restrict what is accepted. Recogito2 only supports Presentation API v2, right?
  • parse json into structure
  • let the appropriate class extract data from that structure (Image API or Presentation API). So there is no need to refetch the data.

So in other words: type identification should never be done by looking at the structure of the url itself.
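
As a rough sketch of this idea (in Python rather than Scala, and not based on Recogito's actual code): the response body is parsed once, its type is decided from `@context` and `@type`, and the parsed structure is handed on so that nothing is fetched twice. The function names and returned labels are hypothetical; the context URIs are the ones published by the IIIF specifications. Fetching is left out, so `body` stands for whatever the single HTTP request returned.

```python
import json
from typing import Any, Dict, List, Tuple

PRESENTATION_V2 = "http://iiif.io/api/presentation/2/context.json"
PRESENTATION_V3 = "http://iiif.io/api/presentation/3/context.json"
IMAGE_CONTEXT_PREFIX = "http://iiif.io/api/image/"


def contexts(doc: Dict[str, Any]) -> List[str]:
    """Return @context as a list of strings (it may be a string or an array)."""
    ctx = doc.get("@context")
    if isinstance(ctx, str):
        return [ctx]
    if isinstance(ctx, list):
        return [c for c in ctx if isinstance(c, str)]
    return []


def identify(doc: Any) -> str:
    """Decide the document type from its content, never from the URL."""
    if not isinstance(doc, dict):
        raise ValueError("not a JSON object")
    ctx = contexts(doc)
    if PRESENTATION_V2 in ctx and doc.get("@type") == "sc:Manifest":
        return "presentation-v2-manifest"
    if PRESENTATION_V3 in ctx:
        # Reject explicitly instead of failing later inside the v2 parser.
        raise ValueError("Presentation API v3 is not supported")
    if any(c.startswith(IMAGE_CONTEXT_PREFIX) for c in ctx):
        return "image-info"
    raise ValueError("not a recognised IIIF document")


def identify_body(body: str) -> Tuple[str, Dict[str, Any]]:
    """Parse once; return the type together with the parsed structure."""
    doc = json.loads(body)
    return identify(doc), doc


body = json.dumps({"@context": PRESENTATION_V2, "@type": "sc:Manifest"})
kind, doc = identify_body(body)
print(kind)  # presentation-v2-manifest
```

The caller can then pass `doc` straight to whichever extractor matches `kind`, and an unsupported version produces a clear error up front.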

I wish I could contribute to this, but my knowledge of Scala is next to zero ;-(

Hi @nicolasfranck,

I agree this should be the way. When I wrote this back in the day, v2 was pretty much the only thing that existed (in the wild). Updating this to allow v3 presentation manifests is on the wishlist, but for now remains a matter of finding the available time. (Or a project that provides it... there are currently several smaller projects that contribute to funding, but they all have different emphasis - primarily re-use of individual Recogito components in different host environments, outside of the core Recogito "product".)

Ah, I did not mean to ask for v3 manifest support. I only wanted to show that version 3 manifests slip through the current identification and then fail at parse time, because their structure differs from what is silently expected (i.e. version 2).