hnesk/browse-ocrd

Integrate page-xml-draw

Closed this issue · 4 comments

hnesk commented

There is a new library, page-xml-draw, that can generate an OpenCV image from PAGE-XML.
It would be great to use it for visualizing PAGE-XML in browse-ocrd.
Ideas (from simple to complex):

  1. Use the generated image directly in an image view (see the sketch after this list).
  2. Wrap the image in HTML with an image map of actionable areas and display it via a WebKit view.
  3. Write a special view for interacting with the library that renders Shapely polygons and tests for actionable areas.
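
For idea 1, a minimal sketch, assuming page-xml-draw hands back a BGR numpy array (as OpenCV code usually does) and that the result should end up in a GTK image widget; the helper name and wiring are made up, not actual browse-ocrd code:

```python
import cv2
import numpy as np
import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk, GdkPixbuf, GLib


def pixbuf_from_cv_image(img_bgr: np.ndarray) -> GdkPixbuf.Pixbuf:
    """Convert a BGR numpy array (OpenCV convention) to a GdkPixbuf."""
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
    height, width, channels = img_rgb.shape
    return GdkPixbuf.Pixbuf.new_from_bytes(
        GLib.Bytes.new(img_rgb.tobytes()),
        GdkPixbuf.Colorspace.RGB,
        False,             # no alpha channel
        8,                 # bits per sample
        width,
        height,
        width * channels,  # rowstride in bytes
    )


# e.g. somewhere in a view:
# image_widget = Gtk.Image.new_from_pixbuf(pixbuf_from_cv_image(rendered_page))
```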

I would love to see 3 (i.e. fixing #15), as it is the most powerful option going forward. It should still be based on a WebKit view IMO.

I don't think the polygon extraction and image drawing warrant adding an extra dependency (beyond ocrd itself and Pillow), though. It could be done even more lightly here.

Roughly (sketched below):

  1. PAGE Viewer's color scheme
  2. RGBA conversion of the raw image
  3. iterating through all (active) segments of the PcGts instance
  4. validating their coordinates
  5. drawing by alpha-compositing
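
A minimal sketch of these steps, assuming only ocrd_models/ocrd_utils and Pillow (TextRegion only, and the color table is just a stand-in for PAGE Viewer's actual scheme):

```python
from PIL import Image, ImageDraw
from ocrd_models.ocrd_page import parse
from ocrd_utils import polygon_from_points

# Stand-in colors (RGBA, semi-transparent), not PAGE Viewer's real palette
COLORS = {
    'TextRegion': (0, 100, 200, 96),
}


def render_overlay(page_xml_path, image_path):
    pcgts = parse(page_xml_path, silence=True)
    page = pcgts.get_Page()
    base = Image.open(image_path).convert('RGBA')         # 2. RGBA conversion of the raw image
    overlay = Image.new('RGBA', base.size, (0, 0, 0, 0))
    draw = ImageDraw.Draw(overlay)
    for region in page.get_TextRegion():                  # 3. iterate segments (only TextRegion here)
        points = polygon_from_points(region.get_Coords().points)
        if len(points) < 3:                               # 4. minimal coordinate validation
            continue
        draw.polygon([(x, y) for x, y in points], fill=COLORS['TextRegion'])
    return Image.alpha_composite(base, overlay)           # 5. alpha-compositing
```

Extending this to other segment levels (TextLine, Word, other region types) and to whichever annotations are currently active is mostly a question of which getters are iterated and which colors are looked up.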

I think 1 seems like a downgrade compared to PAGE Viewer: it would be either a static image view (no different from running a processor like ocrd-segment-extract-pages or ocrd-page-xml-draw, which I have plans to implement, and then opening the resulting fileGrp in browse-ocrd), or a "dynamic" view in which the whole image has to be redrawn whenever the user selects a different annotation set.

I also agree with @bertsky regarding the integration of page-xml-draw: browse-ocrd already imports ocrd, which contains a PAGE-XML parser/traverser very similar to the one I built page-xml-draw on top of (ocrd_models.ocrd_page_generateds; both ocrd and page-xml-draw use generateDS with user-defined methods). The drawing of the polygons itself is also really simple, and browse-ocrd already depends on OpenCV and Pillow, so the same goal could be achieved without extra dependencies. The only advantage of an integration would be the customizability provided by page-xml-draw, which lets the user describe how the PAGE-XML tree is traversed, which visited element types are drawn, and which fill color, stroke color, stroke thickness, opacity, etc. are used; but I fear this is not a good fit for a GUI.

I have been doing some experiments with HTML image maps to study how 3 could be implemented with the WebView. I came across this library and adapted the page-xml-draw code to export the polygons from PAGE-XML as HTML map areas, with a JS script using jQuery and the ImageMapster library to render areas of different annotation types in different colors. The results can be found here. It did not work with browse-ocrd out of the box, so I made some modifications to visualize the results. There is still a lot of room for improvement, both in generating these image maps and in displaying them with browse-ocrd, but it looks promising. Something like this could be implemented in browse-ocrd and then displayed in a special HTML view with buttons connected to JS routines, so the user could select which layout annotations to display, similarly to PAGE Viewer.

[Screenshot: 2021-03-15-153621_1600x876_scrot]
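
For reference, a rough sketch (hypothetical helper, TextRegion only) of exporting PAGE-XML polygons as HTML map areas; a jQuery/ImageMapster script could then color the areas per their data-type attribute, as described above:

```python
from ocrd_models.ocrd_page import parse


def page_to_image_map(page_xml_path, image_url, map_name='page'):
    """Export PAGE-XML region polygons as an HTML image map (sketch)."""
    pcgts = parse(page_xml_path, silence=True)
    page = pcgts.get_Page()
    areas = []
    for region in page.get_TextRegion():
        # "x1,y1 x2,y2 ..." -> "x1,y1,x2,y2,..." as expected by <area coords="...">
        coords = region.get_Coords().points.replace(' ', ',')
        areas.append(
            '<area shape="poly" coords="%s" data-type="TextRegion" data-id="%s" href="#">'
            % (coords, region.get_id())
        )
    return '<img src="%s" usemap="#%s">\n<map name="%s">\n%s\n</map>' % (
        image_url, map_name, map_name, '\n'.join(areas)
    )
```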

hnesk commented

> I don't think the polygon extraction and image drawing warrant adding an extra dependency (beyond ocrd itself and Pillow), though. It could be done even more lightly here.
>
> Roughly:
> ...

Thanks for the hints, that's what I've done now in #30.
Any testing/feedback is much appreciated.

hnesk commented

Fixed in #30