OCR: Remove noise from OCR text
Closed this issue · 0 comments
harish-ganesh commented
Extraction of data from slides using Tesseract.
To do:
- Remove noise from image for better OCR extraction.
- Preprocess the text extracted by OCR (cleaning, spell checking, etc.).
- Form proper sentences that are suitable for the summarization phase.
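
A minimal sketch of the two text-side items above (cleaning with spell checking, then sentence formation), using only the Python standard library. It is not the project's actual implementation: the function names, the tiny `VOCAB` list, and the `0.8` similarity cutoff are all assumptions made for illustration, and the `difflib` matching is a stand-in for a real spell checker. Image denoising before Tesseract is not covered here.

```python
import difflib
import re

# Hypothetical vocabulary for illustration only; a real pipeline would load
# a full dictionary or use a dedicated spell-checking library.
VOCAB = ["machine", "learning", "models", "require",
         "training", "data", "and", "evaluation"]


def clean_ocr_text(raw):
    """Normalize raw OCR output into one whitespace-clean string."""
    # Rejoin words hyphenated across line breaks: "train-\ning" -> "training".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Replace characters outside a small allow-list (assumed OCR noise).
    text = re.sub(r"[^A-Za-z0-9\s.,;:!?'()%-]", " ", text)
    # Collapse runs of whitespace, including newlines, into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def correct_word(word, vocab=VOCAB, cutoff=0.8):
    """Return the closest vocabulary entry if one is similar enough."""
    if word.lower() in vocab:
        return word
    matches = difflib.get_close_matches(word.lower(), vocab, n=1, cutoff=cutoff)
    # Corrected words come back lowercase; unknown words are left unchanged.
    return matches[0] if matches else word


def to_sentences(text):
    """Split on end punctuation, then capitalize and terminate each piece."""
    sentences = []
    for part in re.split(r"(?<=[.!?])\s+", text):
        part = part.strip()
        if not part:
            continue
        part = part[0].upper() + part[1:]
        if part[-1] not in ".!?":
            part += "."
        sentences.append(part)
    return sentences


def preprocess(raw):
    """Run cleaning, word-level correction, and sentence formation in order."""
    text = clean_ocr_text(raw)
    text = re.sub(r"[A-Za-z]+", lambda m: correct_word(m.group()), text)
    return to_sentences(text)


sample = "machne learning ~~ models\nrequire train-\ning data. and evaluaton"
print(preprocess(sample))
# ['Machine learning models require training data.', 'And evaluation.']
```

Splitting only on `.`, `!` and `?` is a deliberate simplification: slide text often has bullet fragments with no end punctuation, so each fragment is given a terminating period rather than merged with its neighbours.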