This is a Smart India Hackathon (SIH) project on AI and OCR. It searches for Telugu and Urdu words in PDFs, whether the text is stored as Unicode or embedded as images. We were selected for the SIH finals with this project.
techievivek/ai-and-ocr-data-extractor
Tesseract.js-based text extractor for PDFs and images.
JavaScript
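
The word search described above could look roughly like the sketch below. It is illustrative only and not taken from the repository: `findWord` and `searchImage` are hypothetical names, the OCR step assumes the Tesseract.js v5+ `createWorker(langs)` API with the `tel` (Telugu) and `urd` (Urdu) language models, and PDF pages are assumed to have been rendered to images beforehand.

```javascript
// Hypothetical helper: return the start offsets (in UTF-16 code units) of every
// occurrence of `word` in `text`. Both strings are normalized to NFC so that
// visually identical Telugu or Urdu sequences compare equal.
function findWord(text, word) {
  const haystack = text.normalize('NFC');
  const needle = word.normalize('NFC');
  const hits = [];
  if (needle.length === 0) return hits; // nothing to search for

  let idx = haystack.indexOf(needle);
  while (idx !== -1) {
    hits.push(idx);
    idx = haystack.indexOf(needle, idx + needle.length);
  }
  return hits;
}

// Hypothetical wrapper: OCR one image with Tesseract.js, then search the result.
// Assumes Tesseract.js v5 or later is installed; the module is required lazily
// so findWord can be used on already-extracted Unicode text without it.
async function searchImage(imagePath, word) {
  const { createWorker } = require('tesseract.js');
  const worker = await createWorker('tel+urd'); // Telugu + Urdu models
  try {
    const { data: { text } } = await worker.recognize(imagePath);
    return findWord(text, word);
  } finally {
    await worker.terminate(); // always release the worker
  }
}
```

For PDFs that already contain Unicode text, `findWord` can be applied directly to the extracted text; only image-based pages need to go through `searchImage`. OCR output for these scripts can be noisy, so an exact substring match like this one may miss words that a fuzzy match would catch.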