Finetuning for layout detection
Hi @VikParuchuri,
Great project! I have been using it and it works for almost all of my use cases.
However, I now have some very complex documents, and I want to finetune the layout detection models on my data.
It would be great if you could provide some direction on the following:
- Is there a data annotation tool that outputs in the format needed by surya-ocr?
- Are there any finetuning instructions available in the docs or a guide?
Again, a very nice open source project 🙌🏻
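On the annotation-format question: surya does not document an official finetuning format, but most annotation tools (e.g. Label Studio or CVAT) can export COCO JSON, which is easy to flatten into per-image `(label, bbox)` pairs. A minimal sketch, pure Python; the COCO field names are standard, but the target structure here is an assumption, not a surya format:

```python
import json

def coco_to_layout(coco_json):
    """Flatten COCO annotations into {image_file: [(label, [x1, y1, x2, y2]), ...]}."""
    coco = json.loads(coco_json)
    categories = {c["id"]: c["name"] for c in coco["categories"]}
    images = {im["id"]: im["file_name"] for im in coco["images"]}
    layout = {name: [] for name in images.values()}
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]  # COCO stores boxes as [x, y, width, height]
        layout[images[ann["image_id"]]].append(
            (categories[ann["category_id"]], [x, y, x + w, y + h])  # corner form
        )
    return layout

sample = json.dumps({
    "images": [{"id": 1, "file_name": "page1.png"}],
    "categories": [{"id": 1, "name": "Text"}],
    "annotations": [{"image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 50]}],
})
print(coco_to_layout(sample))
```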
@phamkhactu sure

```python
# Module paths below follow surya ~0.4; check the README of your installed version.
from surya.detection import batch_text_detection
from surya.layout import batch_layout_detection
from surya.model.detection.segformer import load_model, load_processor
from surya.settings import settings

det_model = load_model()
det_processor = load_processor()
layout_model = load_model(checkpoint=settings.LAYOUT_MODEL_CHECKPOINT)
layout_processor = load_processor(checkpoint=settings.LAYOUT_MODEL_CHECKPOINT)

def layout(image):
    line_predictions = batch_text_detection([image], det_model, det_processor)
    layout_predictions = batch_layout_detection([image], layout_model, layout_processor, line_predictions)
    return layout_predictions[0].bboxes
```
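Each entry returned by the helper carries a bounding box and a label (e.g. `Text`, `Table`, `Figure`). A minimal post-processing sketch, pure Python with no surya dependency; the `bbox`/`label` attribute names mirror surya's `LayoutBox`, but treat them as assumptions if your version differs:

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class LayoutBox:
    # Stand-in for surya's LayoutBox: [x1, y1, x2, y2] plus a label.
    bbox: list
    label: str

def group_by_label(boxes):
    """Bucket detected layout boxes by their label."""
    groups = defaultdict(list)
    for box in boxes:
        groups[box.label].append(box.bbox)
    return dict(groups)

boxes = [
    LayoutBox([10, 10, 300, 40], "Title"),
    LayoutBox([10, 60, 300, 200], "Text"),
    LayoutBox([10, 220, 300, 400], "Table"),
    LayoutBox([10, 420, 300, 500], "Text"),
]
print(group_by_label(boxes))
```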
If you are aware of resources for finetuning layout detection models, please point me to them (for both data annotation and model training).
Hi @sky-2002,
I used this flag:

```python
parser.add_argument("--images", action="store_true", help="Save images of detected layout bboxes.", default=True)
```

to save the layout detected by the model. I saw that it drew a bounding box for each row, not a block.
Could you share an image where layout detection worked for you? I want to check it; maybe I am running the code wrong.
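On the row-vs-block point: the text detector returns line-level boxes, while the layout model is meant to return block-level regions. If you only have line boxes, a rough block can be reconstructed by unioning vertically adjacent lines. A sketch (not surya's own logic; the gap threshold is an assumption you would tune per document):

```python
def merge_rows_into_blocks(rows, max_gap=10):
    """Union vertically adjacent line boxes ([x1, y1, x2, y2]) into blocks.

    Two rows join the same block when the vertical gap between them is
    at most max_gap pixels. Rows are processed top to bottom.
    """
    blocks = []
    for x1, y1, x2, y2 in sorted(rows, key=lambda b: b[1]):
        if blocks and y1 - blocks[-1][3] <= max_gap:
            # Extend the current block to cover this row.
            bx1, by1, bx2, by2 = blocks[-1]
            blocks[-1] = [min(bx1, x1), by1, max(bx2, x2), max(by2, y2)]
        else:
            blocks.append([x1, y1, x2, y2])
    return blocks

rows = [[10, 10, 300, 30], [10, 35, 300, 55], [10, 120, 300, 140]]
print(merge_rows_into_blocks(rows))
```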