awesome-bangla
A collection of tools, datasets and resources for Bangla computing. This list was compiled to help researchers and hobbyists interested in natural language processing for the Bangla (Bengali) language. Contributions are welcome.
Typing Tools and Keyboards
End-User Products
- Avro Keyboard (Windows, Mac, Linux)
- Ridmik Keyboard (Android)
- OpenBangla Keyboard
- Rokeya Keyboard Layout
Libraries
- Avro Phonetic Library (JavaScript, Go)
- jQuery.IME - Supports Avro, Probhat, Inscript, National (BD)
- BengaliPhoneticParser.swift (OpenBangla)
Fixed and Phonetic Input Specifications
Corpora (Corpus) and Datasets
- Corpus Builder (Aniruddha Adhikary et al., BanglaKit)
- Indian Language Part-of-Speech Tagset: Bengali (LDC2010T16)
- IARPA Babel Bengali Language Pack IARPA-babel103b-v0.4b (LDC2016S08)
- BanglaLekha Corpus (Handwriting) (ULAB, Dhaka)
- Bangla word-list (Bangla Academy Banan Abhidhan) (SNLTR)
- Bangla Speech Corpus (IIT, Kharagpur)
- Bengali Stopwords List (stopwords-iso)
NLP Tools, Scripts and Utilities (also Projects)
NLP Tools
- Bangla POS Tagger (HMM/CRF/ME Based) (IIT, Kharagpur)
- Bangla POS Tagger (shm0007)
- Bangla POS Tagger (uzl)
- Bangla POS Tagger (XML Based) (sunkuet02)
- Morphological Analyzer (IIT, Kharagpur)
- Chunker (Rule Based) (IIT, Kharagpur)
- Chunker (Statistical) (IIT, Kharagpur)
- Bengali Dependency Parser (Rajarshi Das et al.)
- Bengali Stemmer (Rule Based) (Debasis Ganguly)
- Bengali Stemmer (Rule Based) (.NET) (Tapas Nayak)
- Bengali Stemmer (Rule Based) (Java) (Tapas Nayak)
- Bengali Stemmer (PHP?) (Md. Tanveer Islam, Tanveer Ahmed Nayeem)
- Bengali Stemmer (JavaScript) (Rifat Nabi)
- Bengali Stemmer (Java) (2015) (Tazim Hoque)
- Bengali Stemmer (Java) (2017) (Sudipto Roy)
- Bengali Word Embedding (Md. Afjal Hossain)
- Bengali Wordnet (Soumen Ganguly)
- Bengali Sentiment Analysis (IPython Notebook) (Abhishek Singh)
- Keyword Extraction (Mahir)
Dictionary
- Bengali Lexical Dictionary (2012) (Abhishek Gupta)
- Bengali Dictionary (Minhas Kamal)
OCR
- Bangla OCR (kmhasan)
- Bangla OCR (CRBLP, BRACU)
- Bangla OCR (Fariha Nazmul)
- Bengali Handwritten OCR with Convolutional NN (Dibyatanoy Bhattacharjee)
- Bengali Digit Recognition (Abhinav Agarwalla)
- Bengali Digit Classification (Md. Afjal Hossain)
- BOCRA (R Package for Bengali OCR)
- Bengali OCR with CNN (Sanjiv)
TTS
- Katha - Bangla TTS (CRBLP, BRACU)
- Bengali-HTS (HMM-based Bangla TTS) (IIT, Kharagpur)
- Apona Pathok - Bangla TTS (Lost)
Others
- Bengali Spell Checking (Ankur)
- Bagha - Personal Assistant (Reyad Rahman)
- Avro Online
- Probhat Online
Programming Languages (?)
- Koro (Go in Bangla)
- Potaka
- ChaScript (Syed Tanveer Jishan)