This repository performs Optical Character Recognition (OCR) on Brazilian ID documents.
The input image is not assumed to show the document from an ideal perspective, so it is preprocessed before OCR is applied:
- The image is read in grayscale;
- A Gaussian blur is applied to reduce noise;
- An adaptive threshold is applied to the blurred image;
- We find the contour with the largest area, since it represents the document frame;
- From that contour, we create a mask covering the area enclosed by the frame;
- Using this mask, we find the four corners of the ID document in the original image;
- Finally, we dewarp the image with a perspective transform, so that the four corners of the document coincide with the four corners of the image.
Since the dewarped image is easier to read and we know the document's template, we apply OCR to it using the pytesseract API.
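
One way to use a known template is to crop each field from the dewarped image and run OCR on it separately. The sketch below is only an illustration: the field names and box coordinates in `TEMPLATE` are made up, and the `lang="por"` and `--psm 7` (single text line) settings are assumptions that require the Portuguese Tesseract language data to be installed.

```python
import numpy as np

# Hypothetical template: field name -> (left, top, right, bottom),
# expressed as fractions of the dewarped image size.
TEMPLATE = {
    "id_number": (0.05, 0.05, 0.50, 0.15),
    "name": (0.05, 0.20, 0.95, 0.30),
}


def crop_field(image: np.ndarray, box) -> np.ndarray:
    """Cut the region described by a fractional box out of the image."""
    h, w = image.shape[:2]
    left, top, right, bottom = box
    return image[round(top * h):round(bottom * h),
                 round(left * w):round(right * w)]


def tesseract_ocr(region: np.ndarray) -> str:
    """Run Tesseract on one cropped field."""
    # Imported here so the cropping helpers can be used without Tesseract.
    import pytesseract
    return pytesseract.image_to_string(region, lang="por", config="--psm 7")


def read_fields(image: np.ndarray, template=TEMPLATE, ocr=tesseract_ocr):
    """Return a dict mapping each template field to its recognized text."""
    return {name: ocr(crop_field(image, box)).strip()
            for name, box in template.items()}
```

Passing the OCR function as a parameter keeps the cropping logic easy to check on its own, for example with a stub in place of Tesseract.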
We also provide a web application built with Flask: the user uploads an image of the document, and the text extracted from it is displayed on the screen.
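
A minimal upload page of this kind could look like the sketch below. It is not the repository's application: the route, the form field name `document`, the inline template and the `extract_text` placeholder (which stands in for the preprocessing and OCR steps) are all assumptions.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline template: an upload form plus the extracted text or an error.
PAGE = """
<form method="post" enctype="multipart/form-data">
  <input type="file" name="document" accept="image/*">
  <button type="submit">Read document</button>
</form>
{% if text %}<pre>{{ text }}</pre>{% endif %}
{% if error %}<p>{{ error }}</p>{% endif %}
"""


def extract_text(image_bytes):
    """Placeholder for the real pipeline (dewarping followed by OCR)."""
    return f"received {len(image_bytes)} bytes"


@app.route("/", methods=["GET", "POST"])
def index():
    text = error = None
    if request.method == "POST":
        upload = request.files.get("document")
        if upload is None or upload.filename == "":
            error = "Please choose an image file."
        else:
            text = extract_text(upload.read())
    return render_template_string(PAGE, text=text, error=error)
```

Saved as `app.py` (a hypothetical file name), this can be started with `flask --app app run` and opened in a browser at the address Flask prints.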