MILE-IISc/Kannada-OCR-test-images-with-ground-truth
This Kannada OCR benchmarking dataset contains 250 images, carefully chosen to have various kinds of recognition challenges. Some of the pages have italics and bold characters. Some of them have Halegannada poems and text; others are letterpress-printed pages, where the vowel modifiers appear as separate symbols and do not touch the consonants they go with. Some pages have interspersed English words; still others have tables with a lot of numeric data. In addition, there are old pages containing either a lot of broken characters or many words with two or more characters merged into a single connected component.
Shell