MILE lab, IISc
Medical Intelligence and Language Engineering (MILE) Lab, Department of Electrical Engineering (EE), Indian Institute of Science (IISc)
Bangalore, India
Pinned Repositories
DegradedWordsKannada
Benchmarking dataset of degraded Kannada word images (with character splits), along with their associated ground-truth Unicode text
Kannada-OCR-test-images-with-ground-truth
This Kannada OCR benchmarking dataset contains 250 images, carefully chosen to cover various kinds of recognition challenges. Some of the pages have italic and bold characters. Some have Halegannada poems and text; others are letterpress-printed pages, where the vowel modifiers appear as separate symbols and do not touch the consonants they go with. Some pages have interspersed English words; still others have tables with a lot of numeric data. In addition, there are old pages containing either many broken characters or many words in which two or more characters are merged into a single connected component.
KonkaniDocumentsInKannadaScript
OCR dataset of Konkani documents printed in Kannada script, along with ground-truth text
m2repo
MergedSymbolsKannada
Benchmarking dataset of merged symbols in Kannada, along with their associated ground-truth Unicode text
MILE-OCR-Engine
MILE-Transliterator
A Google Chrome browser plugin that instantly transliterates web pages written in other Indic scripts into Kannada. The plugin exploits the parallel layout of the Unicode blocks for Indic scripts and also uses a rule-based approach. This enables a polyglot user to read online documents in other Indic scripts through Kannada script. Currently, it supports transliteration of Tamil, Telugu, Malayalam, Bangla, Gujarati, Odiya, Punjabi, Sanskrit and Hindi pages. The quality of transliteration was rated by 45 users on a scale of 1 to 5, with a mean opinion score of 4.6.
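The Unicode block parallelism mentioned in this description can be illustrated with a minimal Python sketch. It is not the plugin's code: the function name `to_kannada`, the table `SOURCE_BLOCK_BASES` and the fallback behaviour are assumptions made for illustration. The sketch relies only on the fact that the major Indic scripts occupy consecutive 128-code-point Unicode blocks in which corresponding letters mostly sit at the same offset, so a character can often be moved into the Kannada block by keeping its offset. Characters without a Kannada counterpart at that offset are left unchanged here; a real transliterator would need script-specific rules for those cases and for the positions where the blocks do not line up.

```python
import unicodedata

# Start of the Kannada block (U+0C80..U+0CFF) and the common block size.
KANNADA_BASE = 0x0C80
BLOCK_SIZE = 0x80

# Start code points of some other Indic Unicode blocks.
# The dictionary itself is a hypothetical helper for this sketch.
SOURCE_BLOCK_BASES = {
    "Devanagari": 0x0900,
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Oriya": 0x0B00,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Malayalam": 0x0D00,
}


def to_kannada(text: str) -> str:
    """Shift characters from other Indic blocks into the Kannada block.

    Only the offset-preserving part is shown. A character is shifted
    only if the target code point is assigned in the Unicode database;
    otherwise it is passed through unchanged (a simplifying assumption).
    """
    out = []
    for ch in text:
        cp = ord(ch)
        for base in SOURCE_BLOCK_BASES.values():
            offset = cp - base
            if 0 <= offset < BLOCK_SIZE:
                target = chr(KANNADA_BASE + offset)
                # "Cn" marks an unassigned code point.
                if unicodedata.category(target) != "Cn":
                    ch = target
                break
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # Devanagari and Telugu samples, written as escapes for portability.
    print(to_kannada("\u0915\u092e\u0932"))
    print(to_kannada("\u0c24\u0c46\u0c32\u0c41\u0c17\u0c41"))
```

Offset arithmetic alone covers the common letters, vowel signs and digits. Characters that exist in only one script, or that are encoded at different positions, are where a rule-based layer such as the one the description mentions would be required.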
ocr-web-app
OCR web application
SanskritPagesUsingKannadaScript
OCR dataset of scanned images of Sanskrit text printed in Kannada script, along with ground-truth text
TuluDocuments
OCR dataset of scanned pages of Tulu books, along with ground-truth text
MILE lab, IISc's Repositories
MILE-IISc/ocr-web-app
OCR web application
MILE-IISc/Kannada-OCR-test-images-with-ground-truth
This Kannada OCR benchmarking dataset contains 250 images, carefully chosen to cover various kinds of recognition challenges. Some of the pages have italic and bold characters. Some have Halegannada poems and text; others are letterpress-printed pages, where the vowel modifiers appear as separate symbols and do not touch the consonants they go with. Some pages have interspersed English words; still others have tables with a lot of numeric data. In addition, there are old pages containing either many broken characters or many words in which two or more characters are merged into a single connected component.
MILE-IISc/DegradedWordsKannada
Benchmarking dataset of degraded Kannada word images (with character splits), along with their associated ground-truth Unicode text
MILE-IISc/MergedSymbolsKannada
Benchmarking dataset of merged symbols in Kannada, along with their associated ground-truth Unicode text
MILE-IISc/MILE-OCR-Engine
MILE-IISc/SanskritPagesUsingKannadaScript
OCR dataset of scanned images of Sanskrit text printed in Kannada script, along with ground-truth text
MILE-IISc/TuluDocuments
OCR dataset of scanned pages of Tulu books, along with ground-truth text
MILE-IISc/KonkaniDocumentsInKannadaScript
OCR dataset of Konkani documents printed in Kannada script, along with ground-truth text
MILE-IISc/m2repo
MILE-IISc/MILE-OCR-Model
MILE-IISc/MILE-Transliterator
A Google Chrome browser plugin that instantly transliterates web pages written in other Indic scripts into Kannada. The plugin exploits the parallel layout of the Unicode blocks for Indic scripts and also uses a rule-based approach. This enables a polyglot user to read online documents in other Indic scripts through Kannada script. Currently, it supports transliteration of Tamil, Telugu, Malayalam, Bangla, Gujarati, Odiya, Punjabi, Sanskrit and Hindi pages. The quality of transliteration was rated by 45 users on a scale of 1 to 5, with a mean opinion score of 4.6.
MILE-IISc/AndroidScannerDemo
ScanLibrary is an Android document-scanning library built on top of OpenCV. Using the app, you can select the exact edges of a document, crop it along the four selected edges, and apply a perspective transformation to the cropped image.
MILE-IISc/angular-sample
MILE-IISc/CRNN
Convolutional recurrent neural network for scene text recognition or OCR in Keras
MILE-IISc/MILE-OCR-API
MILE-IISc/wikiclean
A Java converter from Wikipedia markup to plain text