OCR with colpali
lukiod opened this issue · 1 comment
lukiod commented
Is it possible to use ColPali to extract text from an image and return the extracted text in a structured format (JSON or plain text)?
ManuelFay commented
Hello! That's kind of the opposite of the point of ColPali... But most VLMs nowadays can definitely do that, so you can combine ColPali for retrieving the page you want with a VLM to do just that!
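A rough sketch of that two-step idea, in case it helps. Everything here is illustrative rather than the actual colpali API: `maxsim_score`, `retrieve_then_extract`, and the `extract_text` callback are hypothetical names, and the embeddings are toy lists of floats standing in for the multi-vector page and query embeddings a ColPali model would produce. The scoring assumes ColBERT-style late interaction (for each query vector, take the best-matching page vector, then sum), and the VLM is just a function you pass in.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: sum over query vectors of the best dot product
    against any page vector. Assumes both inputs are non-empty."""
    return sum(
        max(sum(q * p for q, p in zip(qv, pv)) for pv in page_vecs)
        for qv in query_vecs
    )


def retrieve_then_extract(
    query_vecs: Sequence[Vector],
    pages_vecs: Sequence[Sequence[Vector]],
    page_images: Sequence[object],
    extract_text: Callable[[object], str],
) -> Dict[str, object]:
    """Step 1: pick the best-scoring page. Step 2: hand that page to a VLM.

    `extract_text` is a hypothetical stand-in for whatever VLM call you use
    to transcribe a page image; it is not part of colpali.
    """
    scores: List[float] = [maxsim_score(query_vecs, pv) for pv in pages_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return {
        "page_index": best,
        "score": scores[best],
        "text": extract_text(page_images[best]),
    }


if __name__ == "__main__":
    # Toy data: two query vectors, two pages with two patch vectors each.
    query = [[1.0, 0.0], [0.0, 1.0]]
    pages = [
        [[0.5, 0.0], [0.0, 0.1]],
        [[0.9, 0.1], [0.2, 0.8]],
    ]
    images = ["page_0.png", "page_1.png"]  # placeholders for real images

    # Stub VLM so the sketch runs without any model installed.
    result = retrieve_then_extract(query, pages, images, lambda img: f"<text of {img}>")
    print(result)
```

In a real setup the toy vectors would be replaced by embeddings from a ColPali model and the lambda by a call to the VLM of your choice; the returned dict can be dumped to JSON if you want structured output.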