mbennett-uoe/whiiif

Make Collection endpoint more flexible


At the moment, the Collection endpoint makes a number of assumptions about the manifest:

  • Canvases are called either page_<x> or just <x> (where <x> is the index)
  • Pages in the ALTO file are called page_<x> (the Tesseract default, and maybe that of other software too)
  • Therefore we can link each ALTO page to its Canvas by reading the manifest, building a list of Canvases, and looking the page up by index
  • Canvas URIs have a static relationship to the Manifest URL - see #16

If any of these assumptions does not hold, Collection search won't work. Whatever solution we choose for #16 should make this issue relatively easy to solve. We probably just need to change the Solr response -> Canvas lookup to be dict-based instead of list-based.
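
A rough sketch of what a dict-based lookup might look like, to make the idea concrete. This is not the current whiiif code: the function names `build_canvas_lookup` and `canvas_for_page` are hypothetical, the manifest is assumed to follow the IIIF Presentation 2.x layout (`sequences[0].canvases[*]["@id"]`), and keying on the last path segment of the Canvas URI is only one possible rule.

```python
import re


def build_canvas_lookup(manifest):
    """Build a dict mapping page identifiers to Canvas URIs.

    Assumes a IIIF Presentation 2.x manifest that has already been
    parsed into a Python dict. The keying rule (last path segment of
    the Canvas URI) is an assumption made for illustration.
    """
    lookup = {}
    for canvas in manifest["sequences"][0]["canvases"]:
        uri = canvas["@id"]
        # Use the final path segment as the key, e.g. "page_3", "3" or "cover".
        key = uri.rstrip("/").rsplit("/", 1)[-1]
        lookup[key] = uri
        # Register both the bare and the "page_"-prefixed form, so that an
        # ALTO page ID "page_3" also matches a Canvas simply named "3".
        match = re.fullmatch(r"(?:page_)?(\d+)", key)
        if match:
            number = match.group(1)
            lookup[number] = uri
            lookup["page_" + number] = uri
    return lookup


def canvas_for_page(lookup, alto_page_id):
    """Return the Canvas URI for an ALTO page ID, or None if there is no match."""
    return lookup.get(alto_page_id)
```

Because the lookup is keyed by name rather than by position, it no longer depends on the order of Canvases in the manifest, and a page with no matching Canvas returns `None` instead of silently resolving to the wrong one.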