lgpma question
Hi!
First of all, thanks for this wonderful open-source project, and congratulations on all the achievements!
The released model only contains structure-level results; you may use the text recognition module for the complete result.
I can use the bboxes to OCR the cells into text, but how do I match them with the table-structure HTML?
Do you have any instructions on how to use some text recognition modules like RF-Learning
to extract the text and embed it into the HTML table structure?
The returned result of the current demo contains results['bboxes'] and results['html']. You may crop the small images from the original image according to the 'bboxes' results, then use the text recognition module by calling the 'inference_model' API to get the recognition results.
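The crop-then-recognize step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `recognize_text` is a hypothetical stand-in for calling the real recognition model (e.g. via the 'inference_model' API) on each crop.

```python
import numpy as np

def crop_cells(image, bboxes):
    """Crop one small image per cell bbox, given as [x1, y1, x2, y2]."""
    crops = []
    for x1, y1, x2, y2 in bboxes:
        crops.append(image[y1:y2, x1:x2])
    return crops

def recognize_text(crop):
    # Placeholder: in practice, run the text recognition module here
    # (e.g. call inference_model with the loaded recognition model).
    return "cell@{}x{}".format(crop.shape[1], crop.shape[0])

if __name__ == "__main__":
    # Dummy page image and two cell bboxes, for illustration only.
    img = np.zeros((100, 200, 3), dtype=np.uint8)
    bboxes = [[10, 10, 60, 30], [70, 10, 120, 30]]
    texts = [recognize_text(c) for c in crop_cells(img, bboxes)]
    print(texts)
```

The resulting list of texts has the same length and order as 'bboxes', which is exactly what the postprocessing step below expects.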
As for how to map the recognition results into the HTML, you may refer to the postprocessing module of LGPMA to see how bboxes are converted into HTML. At L139 of post_lgpma.py, you can replace 'text_tokens' (a list of the same length as bboxes) with the recognition results.
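The mapping step can be sketched like this. It is only an illustrative analogue of what post_lgpma.py does, under the assumption that the structure HTML contains one empty `<td></td>` slot per bbox, in the same order as the recognized texts:

```python
def insert_texts(structure_html, text_tokens):
    """Fill recognized cell texts into empty <td></td> slots, in order.

    Assumes one empty <td></td> per entry in text_tokens; this mirrors
    replacing 'text_tokens' in LGPMA's postprocessing, but is not the
    actual implementation.
    """
    parts = structure_html.split("<td></td>")
    assert len(parts) == len(text_tokens) + 1, "cell/text count mismatch"
    out = parts[0]
    for text, rest in zip(text_tokens, parts[1:]):
        out += "<td>{}</td>{}".format(text, rest)
    return out

if __name__ == "__main__":
    html = "<table><tr><td></td><td></td></tr></table>"
    print(insert_texts(html, ["foo", "bar"]))
    # → <table><tr><td>foo</td><td>bar</td></tr></table>
```

Real tables may contain spanning cells (`colspan`/`rowspan`) and escaped characters, so for production use you should follow the actual token-level logic in post_lgpma.py rather than naive string splitting.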
Thank you so much for your detailed response. I think that solves all my questions for now.
Thanks again for this wonderful work!