lgpma question
Hi!
First of all, thanks for this wonderful open-source project, and congratulations on all the achievements!
The released model only contains structure-level results; you may use the text recognition module for the complete result.
I can use the bboxes to OCR the cells into text, but how do I match them with the table-structure HTML?
Do you have any instructions on how to use some text recognition modules like RF-Learning
to extract the text and embed it into the HTML table structure?
The returned result of the current demo contains results['bboxes'] and results['html']. You may crop the small images from the original image according to the 'bboxes' results, then use the text recognition module by calling the 'inference_model' API to get the recognition results.
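The crop-then-recognize step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `recognize_text` is a hypothetical stand-in for calling the real recognition model (e.g. via the 'inference_model' API) on each crop.

```python
import numpy as np

def crop_cells(image, bboxes):
    """Crop one small image per cell bbox, given as [x1, y1, x2, y2]."""
    crops = []
    for x1, y1, x2, y2 in bboxes:
        crops.append(image[y1:y2, x1:x2])
    return crops

def recognize_text(crop):
    # Placeholder: in practice, run the text recognition module here
    # (e.g. call inference_model with the loaded recognition model).
    return "cell@{}x{}".format(crop.shape[1], crop.shape[0])

if __name__ == "__main__":
    # Dummy page image and two cell bboxes, for illustration only.
    img = np.zeros((100, 200, 3), dtype=np.uint8)
    bboxes = [[10, 10, 60, 30], [70, 10, 120, 30]]
    texts = [recognize_text(c) for c in crop_cells(img, bboxes)]
    print(texts)
```

The resulting list of texts has the same length and order as 'bboxes', which is exactly what the postprocessing step below expects.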
As for how to map the recognition results into the HTML, you may refer to the postprocessing module of LGPMA to see how bboxes are converted into HTML. At L139 of post_lgpma.py, you can replace 'text_tokens' (a list of the same length as bboxes) with the recognition results.
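The mapping step can be sketched like this. It is only an illustrative analogue of what post_lgpma.py does, under the assumption that the structure HTML contains one empty `<td></td>` slot per bbox, in the same order as the recognized texts:

```python
def insert_texts(structure_html, text_tokens):
    """Fill recognized cell texts into empty <td></td> slots, in order.

    Assumes one empty <td></td> per entry in text_tokens; this mirrors
    replacing 'text_tokens' in LGPMA's postprocessing, but is not the
    actual implementation.
    """
    parts = structure_html.split("<td></td>")
    assert len(parts) == len(text_tokens) + 1, "cell/text count mismatch"
    out = parts[0]
    for text, rest in zip(text_tokens, parts[1:]):
        out += "<td>{}</td>{}".format(text, rest)
    return out

if __name__ == "__main__":
    html = "<table><tr><td></td><td></td></tr></table>"
    print(insert_texts(html, ["foo", "bar"]))
    # → <table><tr><td>foo</td><td>bar</td></tr></table>
```

Real tables may contain spanning cells (`colspan`/`rowspan`) and escaped characters, so for production use you should follow the actual token-level logic in post_lgpma.py rather than naive string splitting.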
Thank you so much for your detailed response. I think that solves all my questions for now.
Thanks again for this wonderful work!