OCR has a tendency to misread the '<' between first and middle names.

Question

Nimbus76 opened this issue 4 years ago · 3 comments

read_mrz()/Tesseract tends to interpret the '<' between the first and middle names as a 'K'.

I have tried multiple scans of varying quality from several passports, and this anomaly occurs more often than not. Sometimes the '<' is interpreted as an 'X' instead.

Every other field has been reliable.
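
To illustrate the effect, here is a minimal sketch of how the name field of an MRZ line is split, assuming the usual layout where '<<' separates the surname from the given names and a single '<' separates the given names from each other. The helper `split_mrz_names` and the sample name are hypothetical and not part of PassportEye.

```python
from typing import Tuple


def _fillers_to_spaces(part: str) -> str:
    # A single '<' separates name components; trailing '<' characters are padding.
    return part.replace("<", " ").strip()


def split_mrz_names(name_field: str) -> Tuple[str, str]:
    """Split an MRZ name field into (surname, given names).

    Hypothetical helper for illustration only.
    """
    # The first '<<' marks the boundary between surname and given names.
    surname, _, given = name_field.partition("<<")
    return _fillers_to_spaces(surname), _fillers_to_spaces(given)


# Made-up sample data: the second string has the separator misread as 'K'.
correct = split_mrz_names("DOE<<JOHN<PAUL<<<<<<<<<<")
misread = split_mrz_names("DOE<<JOHNKPAUL<<<<<<<<<<")

print(correct)
print(misread)
```

Because 'K' and 'X' are also valid letters in a name, the misread separator cannot be reliably undone by a simple search and replace afterwards: the two given names simply merge into one.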

Answer 1 · 2022-04-19T15:57:51.000Z

Are you using the legacy mode with tesseract?

Answer 2 · 2022-10-29T10:17:00.000Z

I'm facing the same issue with names. Is there any way to fix or improve this behavior?

Answer 3 · 2022-10-29T13:57:36.000Z

@RanaOsamaAsif Try both the legacy and new Tesseract models. In my experience the legacy model was more robust with respect to this particular issue.
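
A rough sketch of how the two models could be compared on the same scan. It assumes that read_mrz() accepts an `extra_cmdline_params` string that is passed through to Tesseract, that the returned object exposes `surname` and `names` attributes, and that the installed traineddata includes the legacy model; please check these against your PassportEye and Tesseract versions. The helper names `tesseract_params` and `read_names_with_both_models` are made up for this example.

```python
def tesseract_params(legacy: bool) -> str:
    # Tesseract's --oem flag selects the engine: 0 = legacy, 1 = LSTM.
    return "--oem 0" if legacy else "--oem 1"


def read_names_with_both_models(image_path: str) -> dict:
    """Run read_mrz() once per engine and collect the parsed names.

    Assumption: read_mrz() forwards extra_cmdline_params to Tesseract.
    """
    # Imported lazily so the helper above can be used without PassportEye installed.
    from passporteye import read_mrz

    results = {}
    for label, legacy in (("legacy", True), ("lstm", False)):
        mrz = read_mrz(image_path, extra_cmdline_params=tesseract_params(legacy))
        # read_mrz() returns None when no MRZ is found in the image.
        results[label] = None if mrz is None else (mrz.surname, mrz.names)
    return results
```

Calling `read_names_with_both_models("passport.jpg")` (hypothetical file name) on a scan that shows the problem should make it easy to see whether the legacy model keeps the given names separated.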