
Dataset for Bangla named entity recognition

MIT License

NER-Bangla-Dataset

Dataset Details

| Statistic                 | Value      |
|---------------------------|------------|
| Sentences                 | 71,284     |
| Tokens                    | 983,663    |
| Unique Tokens             | 96,154     |
| Tokenized Sentence Length | 5-30       |
| Tagging Scheme            | IOB, BIOES |
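
The two tagging schemes listed above encode the same entity spans in different ways. The following is a minimal sketch of how IOB tags map to BIOES tags. It assumes the IOB2 variant (every entity starts with a B- tag), and the function name and the PER and LOC labels are hypothetical illustrations, not taken from the dataset files.

```python
from typing import List


def iob2_to_bioes(tags: List[str]) -> List[str]:
    """Convert one sentence of IOB2 tags (B-X, I-X, O) to BIOES tags."""
    bioes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bioes.append(tag)
            continue
        prefix, _, label = tag.partition("-")
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            # A B- tag with no continuation is a single-token entity (S-).
            bioes.append(f"B-{label}" if continues else f"S-{label}")
        elif prefix == "I":
            # An I- tag with no continuation closes the entity (E-).
            bioes.append(f"I-{label}" if continues else f"E-{label}")
        else:
            raise ValueError(f"unexpected tag: {tag!r}")
    return bioes


# Hypothetical tag sequence; the labels are assumptions for illustration.
example = ["B-PER", "I-PER", "O", "B-LOC", "O"]
print(iob2_to_bioes(example))  # ['B-PER', 'E-PER', 'O', 'S-LOC', 'O']
```

BIOES marks entity ends (E-) and single-token entities (S-) explicitly, which some sequence labeling models use as extra boundary information.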

Citation

Please cite the following paper if you use this dataset in your research:

  • Karim, Redwanul; Islam, M. A.; Simanto, Sazid; Chowdhury, Saif; Roy, Kalyan; Neon, Adnan; Hasan, Md; Firoze, Adnan; Rahman, Mohammad (2019). A step towards information extraction: Named entity recognition in Bangla using deep learning. Journal of Intelligent & Fuzzy Systems, 37, 1-13. doi:10.3233/JIFS-179349.