WBSUBNdb_text is a Bangla text dataset containing 1383 offline handwritten text documents contributed by 190 writers. The documents contain both simple and compound characters.
Extract all the .rar files to obtain the complete database. For more details about the dataset and the benchmark results, refer to the original articles at https://doi.org/10.1142/S0218001418560116 and https://doi.org/10.1016/j.eswa.2022.118498. Anyone using this dataset for academic or research purposes (strictly non-profit and non-commercial) should cite the following papers when reporting results on it.
- Chayan Halder, Sk Md Obaidullah, K. C. Santosh and Kaushik Roy, "Content independent writer identification on Bangla script: A document level approach", International Journal of Pattern Recognition and Artificial Intelligence, Vol. 32, No. 09, pp. 1856011-1 to 1856012-10, DOI: https://doi.org/10.1142/S0218001418560116, 2018.
- Payel Rakshit, Chayan Halder, Sk Md Obaidullah and Kaushik Roy, "A Generalised Line Segmentation Method for Multi-Script Handwritten Text Documents", Expert Systems With Applications, Volume 212, 118498, DOI: https://doi.org/10.1016/j.eswa.2022.118498, February 2023.
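
The extraction step mentioned above can be scripted when there are many archives. The following is a minimal sketch, not part of the dataset distribution. It assumes that an external tool (7z or unrar) is installed and on PATH, since Python's standard library cannot read .rar files. The function names and directory layout are hypothetical, and the archive names in the dataset are not assumed.

```python
import shutil
import subprocess
from pathlib import Path


def find_archives(root):
    """Return all .rar files under root (searched recursively), sorted by path."""
    return sorted(p for p in Path(root).rglob("*.rar") if p.is_file())


def build_extract_command(archive, out_dir, tool="7z"):
    """Build the command line for one archive.

    Only the '7z' and 'unrar' command-line tools are handled here.
    """
    if tool == "7z":
        # x = extract with full paths, -y = answer yes to prompts, -o<dir> = output dir
        return ["7z", "x", "-y", f"-o{out_dir}", str(archive)]
    if tool == "unrar":
        # x = extract with full paths, -o+ = overwrite existing files;
        # the destination must end with a path separator
        return ["unrar", "x", "-o+", str(archive), f"{out_dir}/"]
    raise ValueError(f"unsupported tool: {tool}")


def extract_all(root, out_dir, tool="7z"):
    """Extract every .rar archive found under root into out_dir."""
    if shutil.which(tool) is None:
        raise RuntimeError(f"{tool} was not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for archive in find_archives(root):
        # check=True raises CalledProcessError if the tool reports a failure
        subprocess.run(build_extract_command(archive, out_dir, tool), check=True)
```

For example, calling extract_all("downloads", "WBSUBNdb_text") would extract every archive under a hypothetical "downloads" folder into a "WBSUBNdb_text" folder. If the archives turn out to be parts of one multi-volume set, extracting only the first part is normally sufficient.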